KR20060049980A

KR20060049980A - Apparatus for encoding and decoding multichannel audio signal and method thereof

Info

Publication number: KR20060049980A
Application number: KR1020050061655A
Authority: KR
Inventors: 백승권; 서정일; 박기윤; 이병화; 강경옥; 홍진우; 한민수
Original assignee: 한국전자통신연구원
Priority date: 2004-07-09
Filing date: 2005-07-08
Publication date: 2006-05-19
Also published as: KR100745688B1

Abstract

The present invention relates to a method and apparatus for encoding and decoding a multichannel audio signal, and more particularly to a multichannel parametric encoding/decoding method and apparatus that encode and decode a multichannel audio signal based on channel correlation.

An apparatus for encoding a multichannel audio signal according to the present invention includes: signal dividing means for dividing a multichannel audio signal in the frequency domain; downmixing means for downmixing the divided signal into a single channel and encoding it; and parameter analyzing means for measuring a frame-level channel correlation of the divided signal and re-dividing the divided signal according to that frame-level channel correlation. When the frame-level channel correlation is less than or equal to a first predetermined value, the parameter analyzing means extracts and encodes, as cue parameters for decoding, a channel ID and a subband-level channel correlation for each subband of the re-divided signal.

By optimizing the cue parameters required to decode a multichannel audio signal, the present invention makes it possible to reconstruct and reproduce a multichannel audio signal from a single-channel audio signal even in an environment where little bandwidth is allocated.

Audio coding, cue parameters, multichannel, channel correlation

Description

Apparatus for encoding and decoding multichannel audio signal and method

FIG. 1 is a block diagram of a multichannel audio signal encoding/decoding apparatus according to the present invention;

FIG. 2 is a detailed block diagram of the multichannel parametric encoder of FIG. 1;

FIG. 3 is a detailed block diagram of the parameter analyzer of FIG. 2;

FIG. 4 is a detailed block diagram of the multichannel parametric decoder of FIG. 1;

FIG. 5 is a detailed block diagram of the parameter-based multichannel generator of FIG. 4;

FIG. 6 is a flowchart illustrating a process of encoding a multichannel audio signal according to the present invention; and

FIG. 7 is a flowchart illustrating a process of decoding a multichannel audio signal according to the present invention.

Coding technology for wideband audio signals has advanced remarkably in recent years. As a result, multichannel audio systems that can reproduce multichannel audio at home or in a moving vehicle have been commercialized. However, the bandwidth required to transmit multichannel audio signals has so far prevented wider commercial adoption.

The recent introduction of spatial audio coding (SAC) makes it possible to transmit and represent multichannel audio signals effectively within a limited bandwidth. In particular, SAC based on binaural cue coding (BCC) has recently attracted considerable attention as an algorithm that can reproduce high-quality multichannel audio signals even at low bandwidth.

The basic concept of BCC rests on the fact that the human sense of spatial sound arises from hearing with two ears: an audio signal is generated using cue parameters that can express this spatial impression. As the cue parameters that humans mainly perceive, BCC uses differences in sound level and in sound delay to reconstruct a multichannel audio signal from a downmixed single-channel audio signal. BCC is described in more detail in the article "Binaural Cue Coding-Part II: Schemes and Applications" (IEEE Trans. on Speech and Audio Proc., Vol. 11, No. 6, Nov. 2003).

However, the BCC scheme has the problem that the bandwidth needed to transmit the extracted cue parameters also increases linearly with the number of channels. For example, if the cue parameters needed to represent a stereo signal take 4 kbps, five channels take 20 kbps, which occupies a considerable amount of bandwidth.

The present invention is intended to solve the above problem. Its object is to provide the same audio reproduction quality as BCC while using a smaller amount of cue parameters, by extracting optimal cue parameters according to channel correlation.

Those of ordinary skill in the art to which the present invention pertains will readily recognize other objects and advantages of the present invention from the drawings, the detailed description, and the claims of this specification.

The parameter analyzing means re-divides the divided signal into a number of subbands that increases as the frame-level channel correlation decreases. The frame-level channel correlation is obtained from the subband-level channel correlations of the divided signal, and is determined more by the subband-level channel correlations of the low-frequency band than by those of the high-frequency band. The channel ID for a re-divided subband b is the index of the channel whose power in subband b is the largest.

When the frame-level channel correlation is greater than or equal to the first predetermined value, the parameter analyzing means extracts and encodes, as the cue parameters, the inter-channel level difference (ICLD) and the subband-level channel correlation for each subband of the re-divided signal. The parameter analyzing means additionally extracts, as a cue parameter, the inter-channel time difference (ICTD) for each subband of the re-divided signal.

When the frame-level channel correlation is less than or equal to the first predetermined value, the parameter analyzing means re-divides the divided signal into more subbands than the divided signal has. When the frame-level channel correlation is greater than or equal to a second predetermined value that is larger than the first predetermined value, it re-divides the divided signal into fewer subbands than the divided signal has. When the frame-level channel correlation lies between the first and second predetermined values, no re-division is performed.

When the frame-level channel correlation lies between the first and second predetermined values, the parameter analyzing means extracts and encodes, as the cue parameters, the inter-channel level difference and the subband-level channel correlation for each subband of the divided signal.

An apparatus for decoding a multichannel audio signal according to the present invention decodes, into a multichannel audio signal, an audio signal that was encoded by downmixing a multichannel audio signal divided in the frequency domain into a single channel and by re-dividing the divided signal according to its frame-level channel correlation. The apparatus includes: single-channel decoding means for decoding the encoded single-channel audio signal; and parameter-based multichannel generating means for dividing the decoded single-channel audio signal in the frequency domain, according to the frame-level channel correlation, into the same subband intervals that were used in the re-division. When the frame-level channel correlation is less than or equal to a first predetermined value, the parameter-based multichannel generating means decodes the single-channel audio signal using the channel ID and the subband-level channel correlation for each subband of the re-divided signal.

When the frame-level channel correlation is greater than or equal to the first predetermined value, the parameter-based multichannel generating means decodes using the per-subband inter-channel level difference (ICLD) and the subband-level channel correlation of the re-divided signal. The parameter-based multichannel generating means additionally uses the per-subband inter-channel time difference (ICTD) of the re-divided signal for decoding.

When the channel ID for subband b is c' (1 ≤ c' ≤ C, where C is the number of channels), the parameter-based multichannel generating means assigns the subband b signal of the single-channel audio signal as the subband b signal of channel c'. It assigns the remaining subband signals by applying to the single-channel audio signal a component related to the spatial impression of the sound source that arises between channels.

A method of encoding a multichannel audio signal according to the present invention includes: dividing a multichannel audio signal in the frequency domain; downmixing the divided signal into a single channel and encoding it; measuring a frame-level channel correlation of the divided signal; re-dividing the divided signal according to the frame-level channel correlation; and, when the frame-level channel correlation is less than or equal to a first predetermined value, extracting and encoding a channel ID and a subband-level channel correlation for each subband of the re-divided signal.

A method of decoding a multichannel audio signal according to the present invention decodes, into a multichannel audio signal, an audio signal that was encoded by downmixing a multichannel audio signal divided in the frequency domain into a single channel and by re-dividing the divided signal according to its frame-level channel correlation. The method includes: decoding the encoded single-channel audio signal; dividing the decoded single-channel audio signal in the frequency domain, according to the frame-level channel correlation, into the same subband intervals that were used in the re-division; and, when the frame-level channel correlation is less than or equal to a first predetermined value, decoding the single-channel audio signal using the channel ID and the subband-level channel correlation for each subband of the re-divided signal.

The above objects and features will become clearer from the following detailed description taken in conjunction with the accompanying drawings, so that those of ordinary skill in the art to which the present invention pertains can readily practice its technical idea. In describing the present invention, detailed descriptions of related known technology are omitted where they would unnecessarily obscure the subject matter of the invention. A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram of a multichannel audio signal encoding/decoding apparatus according to the present invention. A multichannel audio signal having C channels is input to the multichannel parametric encoder 101. The multichannel parametric encoder 101 downmixes the input multichannel audio signal and outputs a single-channel audio signal in raw data form. It also extracts and outputs, based on channel correlation, cue parameter information for decoding back into a multichannel signal.

The cue parameter information includes either the inter-channel level difference (ICLD) or a channel ID (identification), depending on channel correlation. More specifically, ICLD is used when the inter-channel correlation (ICC) is greater than or equal to a predetermined value, and the channel ID is used when the channel correlation is less than or equal to the predetermined value. In either case, the inter-channel time difference (ICTD) may additionally be used.

The single-channel audio signal output from the multichannel parametric encoder 101 is encoded into a specific file format by the general audio encoder 103. The general audio encoder 103 is a conventional, commonly known audio encoder such as an MP3 encoder.

The general audio decoder 105 corresponds to the general audio encoder 103; it decodes an audio file in the specific file format and outputs a single-channel audio signal in raw data form. The multichannel parametric decoder 107 uses the cue parameter information generated by the multichannel parametric encoder 101 to reconstruct a multichannel audio signal from the input single-channel audio signal and outputs it. More specifically, when the channel correlation is greater than or equal to the predetermined value, it reconstructs the multichannel audio signal from the single-channel audio signal using ICLD. When the channel correlation is less than or equal to the predetermined value, it reconstructs the multichannel audio signal using the channel ID.

FIG. 2 is a detailed block diagram of the multichannel parametric encoder 101 of FIG. 1. Each channel signal is transformed to the frequency domain frame by frame in the time-frequency (TF) transform unit 201, for example by a fast Fourier transform (FFT). Each FFT-transformed channel signal is divided into B subbands by the signal dividing unit 203. The b-th subband (1 ≤ b ≤ B) consists of the FFT spectral coefficients between the partition boundaries A_{b-1} and A_b - 1.

For example, each channel signal transformed by a 1024-point FFT may be divided by the signal dividing unit 203 into 19 subbands according to Table 1. In this case, the first subband consists of frequency coefficients 0, 1, and 2, and the second subband consists of frequency coefficients 3, 4, and 5. Because of the symmetry of the frequency coefficients, only coefficients up to the 513th are considered.
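The partitioning described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_into_subbands` and the boundary list are assumptions, and because Table 1 is not reproduced in this text, only the first two subbands (coefficients 0 to 2 and 3 to 5) follow the description; the remaining edges are placeholders.

```python
def split_into_subbands(coeffs, boundaries):
    """Split one frame's FFT coefficients into subbands.

    boundaries[b-1] and boundaries[b] play the role of A_{b-1} and A_b,
    so subband b holds coefficients A_{b-1} .. A_b - 1.
    """
    if boundaries[0] != 0 or boundaries[-1] > len(coeffs):
        raise ValueError("boundaries must start at 0 and fit inside the frame")
    return [coeffs[lo:hi] for lo, hi in zip(boundaries, boundaries[1:])]


# Hypothetical boundaries: NOT the values of Table 1, apart from the
# first two subbands, which match the example in the text.
EXAMPLE_BOUNDARIES = [0, 3, 6, 10, 16, 513]

# Stand-in for the 513 usable bins of a 1024-point FFT frame.
frame = [complex(n, 0) for n in range(513)]
subbands = split_into_subbands(frame, EXAMPLE_BOUNDARIES)
```

With the real Table 1 boundaries the same function would return 19 lists, one per subband.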

To reflect auditory characteristics, the subbands are divided using the equivalent rectangular bandwidth (ERB), similarly to critical-band division. The divided frequency-domain subband signals are mixed subband by subband in the downmix unit 205, producing a single-channel frame signal in the frequency domain.

The downmixed single-channel frame signal undergoes an inverse fast Fourier transform (inverse FFT) in the inverse TF transform unit 207, producing a single-channel audio signal that can be used as the input waveform of a general audio encoder.

Meanwhile, the frequency-domain subband signals divided into 19 bands are also input to the parameter analyzer 209. The parameter analyzer 209 measures the channel correlation and, based on the measured correlation, extracts the cue parameter information needed to reconstruct the multichannel audio signal from the single-channel audio signal. The parameter analyzer 209 is described in more detail with reference to FIG. 3.

FIG. 3 is a detailed block diagram of the parameter analyzer 209 of FIG. 2. The channel correlation measurer 301 measures the subband-level channel correlation of the divided frequency-domain subband signals. The subband-level channel correlation is obtained with a normalized cross-correlation (coherence) function, as in Equation 1.

The quantity given by Equation 1 is the correlation between the b-th subband signal of the i-th channel signal and the b-th subband signal of the j-th channel signal. x^f_{i,b} denotes the frequency coefficients of the b-th subband of the i-th channel signal; the superscript f indicates the frequency domain, and * denotes complex conjugation. Channel i is regarded as fixed, serving as the reference channel. Therefore, when there are C channels, C-1 correlations are measured for each subband. When the first channel signal is used as the reference channel, the channel correlation coefficient Γ_b for subband b is expressed by Equation 2. The center channel signal, which provides the main power of the audio signal, is used as the reference channel.
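As a rough illustration of the measurement described above, the following Python sketch computes a normalized coherence between two subband spectra and then combines the C-1 values for one subband. Equations 1 and 2 are not reproduced in this text, so the exact normalization and the plain averaging used in `subband_correlation` are assumptions, as are both function names.

```python
import math


def subband_coherence(x_i, x_j):
    """Normalized cross-correlation magnitude of two subband spectra.

    x_i, x_j: equal-length sequences of complex FFT coefficients for the
    same subband b in the reference channel i and another channel j.
    Returns a value in [0, 1]; 0.0 if either subband carries no energy.
    """
    if len(x_i) != len(x_j):
        raise ValueError("subbands must have the same number of coefficients")
    cross = sum(a * b.conjugate() for a, b in zip(x_i, x_j))
    energy_i = sum(abs(a) ** 2 for a in x_i)
    energy_j = sum(abs(b) ** 2 for b in x_j)
    if energy_i == 0 or energy_j == 0:
        return 0.0
    return abs(cross) / math.sqrt(energy_i * energy_j)


def subband_correlation(channels_b, ref=0):
    """Combine the C-1 coherences against the reference channel.

    Assumption: a plain average; the patent's Equation 2 may differ.
    """
    others = [x for k, x in enumerate(channels_b) if k != ref]
    if not others:
        raise ValueError("at least two channels are required")
    return sum(subband_coherence(channels_b[ref], x) for x in others) / len(others)
```

A scaled copy of the reference subband yields a coherence of 1, while spectra with no overlapping energy yield 0.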

The channel correlation coefficient Γ for one frame can be obtained from Equation 3.

Here N ≈ M/2; when M is 19, N is set to 10. α is a smoothing factor and preferably has a small value of 0.5 or less, so that the first N low-band signals mainly determine the frame-level channel correlation coefficient.
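One way to picture the low-band weighting is the sketch below. It is not Equation 3, which is not reproduced in this text: the weighted-sum form, the function name `frame_correlation`, the default α of 0.3, and the reading of M as the number of subbands are all assumptions chosen only to match the stated behaviour (N ≈ M/2, α at most 0.5, low bands dominate).

```python
def frame_correlation(gamma_b, alpha=0.3):
    """Combine per-subband correlations into one frame-level value.

    Assumed form (not the patent's Equation 3): the mean over the N lowest
    subbands weighted by (1 - alpha), plus the mean over the remaining
    subbands weighted by alpha, with alpha kept at or below 0.5.
    """
    if not gamma_b:
        raise ValueError("at least one subband correlation is required")
    if not 0.0 <= alpha <= 0.5:
        raise ValueError("alpha is expected to be between 0 and 0.5")
    m = len(gamma_b)
    n = (m + 1) // 2  # N is roughly M/2; this gives N = 10 for M = 19
    low, high = gamma_b[:n], gamma_b[n:]
    low_mean = sum(low) / len(low)
    high_mean = sum(high) / len(high) if high else low_mean
    return (1.0 - alpha) * low_mean + alpha * high_mean
```

Under this assumed form, a frame whose ten lowest subbands are fully correlated and whose upper nine are uncorrelated still scores 1 - α, which reflects the emphasis on the low band.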

The channel correlation measurer 301 transmits the frame-level channel correlation coefficient Γ to the multichannel parametric decoder 107 as a cue parameter.

The type of cue parameter needed to reconstruct the multichannel audio signal depends on the channel correlation coefficient Γ, as does the number of subbands from which cue parameters are extracted.

The relationship between the channel correlation coefficient Γ and the number of subbands from which cue parameters are extracted may be determined as shown in Table 2.

That is, when Γ is large, cue parameters are extracted from each subband of a signal divided into a small number of subbands; conversely, when Γ is small, cue parameters are extracted from each subband of a signal divided into a large number of subbands.
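The selection rule can be sketched as a small lookup. Table 2 is not reproduced in this text, so the thresholds 0.5 and 0.8 and the subband counts 10, 19, and 32 are taken from the worked examples in the description; the function names and the handling of values exactly on a threshold are assumptions.

```python
def select_subband_count(gamma, low=0.5, high=0.8):
    """Pick the number of subbands from the frame-level correlation.

    High correlation -> fewer subbands (10); low correlation -> more (32);
    in between, the original 19-subband division is kept.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if gamma >= high:
        return 10
    if gamma > low:
        return 19
    return 32


def select_cue_type(gamma, low=0.5):
    """Channel ID is used at or below the first threshold, ICLD above it."""
    return "channel_id" if gamma <= low else "icld"
```

For example, `select_subband_count(0.9)` returns 10 with ICLD as the cue, matching the Γ = 0.9 example in the description.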

For example, when the channel correlation coefficient Γ is 0.9, the 10-subband parameter analyzer 303 re-divides the frequency coefficients of the input 19-subband signals according to Table 3. That is, the signal divided into 19 subbands is converted by the 10-subband parameter analyzer 303 into a signal divided into 10 subbands.

The 10-subband parameter analyzer 303 then extracts, for each of the 10 subbands, the cue parameters to be transmitted to the multichannel parametric decoder 107. The inter-channel level difference (ICLD) is used as the cue parameter. ICLD is a parameter expressing the power ratio of a subband relative to the reference channel, which is generally the center channel. The inter-channel time difference (ICTD) may additionally be used as a cue parameter. ICTD is a parameter expressing, for each subband, the time delay relative to the reference channel; it is obtained from the correlation function between the reference channel and another channel so that the correlation between the signals is maximized. Because ICTD does not greatly affect audio quality, it may be used optionally as a cue parameter.

ICLD is obtained from Equations 4 and 5.

Here, ΔL_{c-1,b} is the ICLD value for the b-th subband of the c-th channel, with the first channel used as the reference channel. S_{c,n} is the frequency coefficient at point n of the c-th channel, and P_{c,b} is the power of the b-th subband of the c-th channel. With C channels, C-1 ICLD values are extracted for each subband.
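A minimal Python sketch of this computation follows. Equations 4 and 5 are not reproduced in this text, so the sum-of-squared-magnitudes power and the decibel ratio used here follow the usual BCC convention and are assumptions, as are the function names and the small `floor` that guards against division by zero.

```python
import math


def subband_power(coeffs):
    """Power of one subband: sum of squared coefficient magnitudes (P_{c,b})."""
    return sum(abs(s) ** 2 for s in coeffs)


def icld_db(ref_subband, other_subband, floor=1e-12):
    """Level difference of a channel relative to the reference, in dB.

    Assumption: 10*log10 of the power ratio; floor avoids log of zero.
    """
    p_ref = max(subband_power(ref_subband), floor)
    p_other = max(subband_power(other_subband), floor)
    return 10.0 * math.log10(p_other / p_ref)


def icld_for_subband(channels_b, ref=0):
    """Return the C-1 ICLD values of one subband, skipping the reference."""
    return [icld_db(channels_b[ref], x)
            for k, x in enumerate(channels_b) if k != ref]
```

For a five-channel signal, `icld_for_subband` returns four values per subband, which is the count used later when comparing ICLD with channel-ID signalling.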

The extraction of ICLD and ICTD is well known, as published in the article "Binaural Cue Coding-Part II: Schemes and Applications" (IEEE Trans. on Speech and Audio Proc., Vol. 11, No. 6, Nov. 2003), so a detailed description is omitted.

The 10-subband parameter analyzer 303 re-extracts the channel correlation coefficients Γ_b for the 10 subbands according to the subband intervals of Table 3 and transmits them to the multichannel parametric decoder 107 as cue parameters.

When the channel correlation coefficient Γ is between 0.5 and 0.8, the 19-subband parameter analyzer 305 extracts cue parameters for each of the 19 subbands of the input 19-subband signals. As in the case where Γ is between 0.8 and 1.0, ICLD is used as the cue parameter, and ICTD may additionally be used. In this case, the subband-level channel correlation coefficients transmitted to the multichannel parametric decoder 107 are the same as those computed by the channel correlation measurer 301.

When the channel correlation coefficient Γ is between 0 and 0.5, the 32-subband parameter analyzer 307 re-divides the frequency coefficients of the input 19-subband signals according to Table 4. That is, the signal divided into 19 subbands is converted by the 32-subband parameter analyzer 307 into a signal divided into 32 subbands.

The 32-subband parameter analyzer 307 then extracts, for each of the 32 subbands, the cue parameters to be transmitted to the multichannel parametric decoder 107. Instead of ICLD, a channel ID (identification) assigned to each subband is used as the cue parameter. Here, the channel ID I_b for a subband is the index of the channel having the maximum power in that subband b. This is expressed by Equation 6.
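The channel ID rule amounts to an argmax over channels. The sketch below is an illustration only: the function name `channel_id`, the 1-based indexing (chosen to match the range 1 ≤ c' ≤ C used elsewhere in the description), and the tie-breaking toward the lowest index are assumptions.

```python
def channel_id(channels_b):
    """Return the 1-based index of the channel with the most power in subband b.

    channels_b: one sequence of complex FFT coefficients per channel, all
    covering the same subband. Ties resolve to the lowest channel index.
    """
    if not channels_b:
        raise ValueError("at least one channel is required")
    powers = [sum(abs(s) ** 2 for s in ch) for ch in channels_b]
    return max(range(len(powers)), key=powers.__getitem__) + 1
```

A single small integer per subband then replaces the C-1 level differences that ICLD signalling would need.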

Using the channel ID instead of ICLD as the cue parameter transmitted to the multichannel parametric decoder 107 dramatically reduces the amount of cue parameter data. For example, for a five-channel signal, one cue parameter is needed per subband, 32 in total. If ICLD were used, four ICLD values would be needed per subband, 128 (= 4 x 32) in total. However, using the channel ID as the cue parameter when the number of subbands is small causes severe signal degradation, so it is preferable to use the channel ID only when the number of subbands is sufficiently large, such as 32.
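The arithmetic in this comparison can be checked with a short helper. The function name `cue_count` is hypothetical, and the count covers only the level or ID cues, not the correlation coefficients or any optional ICTD values.

```python
def cue_count(num_channels, num_subbands, use_channel_id):
    """Number of level/ID cue values per frame.

    Channel ID needs one value per subband; ICLD needs one value per
    non-reference channel (C - 1) per subband.
    """
    per_subband = 1 if use_channel_id else num_channels - 1
    return per_subband * num_subbands


ids_needed = cue_count(5, 32, use_channel_id=True)     # 32
iclds_needed = cue_count(5, 32, use_channel_id=False)  # 128
```

For five channels and 32 subbands this reproduces the 32 versus 128 figures given above, a fourfold reduction.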

물론, 채널간 시간차(ICTD)가 추가적으로 큐 파라메터로서 사용될 수 있다. Of course, the inter-channel time difference (ICTD) can additionally be used as a cue parameter.

The 32-subband parameter analyzer 307 re-extracts the channel correlation coefficient Γ_b for each of the 32 subbands according to the subband intervals of Table 4 and transmits it to the multichannel parametric decoder 107 as a cue parameter.

FIG. 4 is a detailed block diagram of the multichannel parametric decoder 107 of FIG. 1.

The single-channel audio signal in raw data form is FFT-transformed by the TF transformer 401 into a frequency-domain single-channel audio signal. The parameter-based multichannel generator 403 uses the cue parameter information transmitted from the parameter analyzer 209 to generate a frequency-domain multichannel audio signal from the frequency-domain single-channel audio signal. The frequency-domain multichannel audio signal is then inverse-FFT-transformed channel by channel by the inverse TF transformer 405, producing a time-domain multichannel audio signal.

FIG. 5 is a detailed block diagram of the parameter-based multichannel generator 403 of FIG. 4.

The correlation observer 501 compares the frame-level channel correlation coefficient Γ transmitted from the parameter analyzer 209 with Table 1 and selects the corresponding subband parameter-based generator 503, 505, or 507.
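The selection made by the correlation observer 501 can be sketched as a simple threshold test. This is an illustrative sketch only: the function name is hypothetical, the thresholds 0.5 and 0.8 are the values used in this description, and the handling of the exact boundary values (Γ = 0.5 goes to the 32-subband generator, Γ = 0.8 to the 19-subband generator) is an assumption based on the "less than or equal to a first predetermined value" wording of the claims.

```python
def select_generator(gamma, first=0.5, second=0.8):
    """Return (generator reference numeral, number of subbands) for a frame.

    gamma is the frame-level channel correlation coefficient.
    """
    if gamma > second:
        return (503, 10)   # high correlation: 10 subbands (Table 3)
    if gamma > first:
        return (505, 19)   # medium correlation: 19 subbands
    return (507, 32)       # low correlation: 32 subbands (Table 4)


print(select_generator(0.9))   # (503, 10)
print(select_generator(0.65))  # (505, 19)
print(select_generator(0.2))   # (507, 32)
```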

For example, when the channel correlation coefficient Γ is 0.9, the 10-subband parameter-based generator 503 divides the input single-channel audio signal into 10 subbands according to Table 3. It then generates a multichannel audio signal from the divided signal using the ICLD and the subband-level channel correlation coefficient Γ_b transmitted from the 10-subband parameter analyzer 303. Equations 7, 8, and 9 may be used for this purpose.

Here, $S^b_n$ is the n-point frequency coefficient of the b-th subband of the downmixed single-channel audio signal, and $S^b_{c,n}$ is the n-point frequency coefficient of the b-th subband of the c-th channel. The power gain factor appearing in Equation 7 is expressed by Equation 8.

r_{i,n} is a random variable that reflects the spatial impression of the sound source arising between channels. The restored channel signals should preserve the spatial impression among them, but because all channel signals are restored from the downmix signal, the spatial impression is reduced. To recover it, a random variable value is inserted into the calculation of the power gain factor, adding randomness to the restored channel signals and thereby yielding a wider spatial impression.

r_{i,n} may be expressed as in Equation 9.

Here, the random signal appearing in Equation 9 has a magnitude variation of ±5 dB. It is applied to all channels and subbands, with the same variance and zero mean.
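As an illustration only: Equation 9 is not reproduced in this text, so the sketch below does not implement it. It merely shows one way to draw a zero-mean random level offset bounded to ±5 dB and convert it to a linear amplitude factor. The uniform distribution, the function names, the fixed seed, and the dB-to-amplitude mapping 10^(dB/20) are all assumptions, not part of the disclosure.

```python
import random


def random_offset_db(rng, max_db=5.0):
    """Draw a level offset in dB, symmetric around zero (assumed uniform)."""
    return rng.uniform(-max_db, max_db)


def db_to_amplitude(db):
    """Convert a level offset in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


# Seeded generator so the sketch is repeatable; the seed is arbitrary.
rng = random.Random(0)
gains = [db_to_amplitude(random_offset_db(rng)) for _ in range(8)]

# Every factor lies between -5 dB and +5 dB in amplitude terms.
lower, upper = db_to_amplitude(-5.0), db_to_amplitude(5.0)
print(all(lower <= g <= upper for g in gains))  # True
```

Using the same generator settings for every channel and subband mirrors the statement that the random signal has the same variance and zero mean everywhere.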

The method of generating a multichannel audio signal using Equations 7, 8, and 9 above is well known, as described in the above-mentioned paper, and a further detailed description is therefore omitted.

The 19-subband parameter-based generator 505 operates when the frame-level channel correlation coefficient Γ is between 0.5 and 0.8, and generates a multichannel audio signal in the same manner as the 10-subband parameter-based generator 503.

The 32-subband parameter-based generator 507 operates when the frame-level channel correlation coefficient Γ is between 0 and 0.5. It divides the single-channel audio signal into 32 subbands according to Table 4, and then generates a multichannel audio signal from the divided signal using the channel ID and the subband-level channel correlation coefficient Γ_b transmitted from the 32-subband parameter analyzer 307.

When the channel ID for subband b is c' (1 ≤ c' ≤ C), the subband-b signal of channel c' is assigned by Equation 10.

$S^b_{c',n} = S^b_n$
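The assignment in Equation 10 amounts to copying the downmix coefficients of subband b into the channel named by the channel ID. A minimal sketch follows, assuming 1-based channel indices as in the text; the function name and the use of `None` as a placeholder for the channels that are generated by the other equations are hypothetical choices made for illustration.

```python
def assign_dominant_channel(downmix_subband, channel_id, num_channels):
    """Apply Equation 10 to one subband.

    downmix_subband: frequency coefficients S^b_n of the downmix in subband b.
    channel_id: c', the 1-based index of the dominant channel in subband b.
    Returns a dict mapping each channel index to its subband coefficients.
    Channels other than c' are left as None here, because they are produced
    by separate equations that this sketch does not implement.
    """
    if not 1 <= channel_id <= num_channels:
        raise ValueError("channel_id must satisfy 1 <= c' <= C")
    channels = {c: None for c in range(1, num_channels + 1)}
    # Dominant channel receives the downmix coefficients unchanged.
    channels[channel_id] = list(downmix_subband)
    return channels


result = assign_dominant_channel([0.5 + 0.1j, -0.2j], channel_id=3, num_channels=5)
print(result[3])  # [(0.5+0.1j), (-0-0.2j)]
print(result[1])  # None
```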

The other subband signals are generated by Equations 7, 11, and 12.

Here, r_{i,n} is a random variable that reflects the spatial impression of the sound source arising between channels.

Here, the random signal has a magnitude variation of ±5 dB. It is applied to all channels and subbands, with the same variance and zero mean.

Here, r_{i,n} is defined as follows.

The frequency-domain multichannel audio signal generated by the subband parameter-based generators 503, 505, and 507 is inverse-FFT-transformed channel by channel by the inverse TF transformer 405, thereby restoring the time-domain multichannel audio signal.

The multichannel parametric encoding and decoding process according to the present invention described above is summarized below with reference to FIGS. 6 and 7. FIG. 6 is a flowchart illustrating the multichannel parametric encoding procedure according to the present invention.

The multichannel audio signal is FFT-transformed and then divided into 19 subbands according to Table 1 (S601). The subband-level channel correlation coefficient Γ_b is measured for the 19-subband split signal, and the frame-level channel correlation coefficient Γ is measured on that basis (S603).

When the frame-level channel correlation coefficient Γ is greater than 0.8, the 19-subband split signal is re-divided into a 10-subband signal according to Table 3, and the ICLD and the subband-level channel correlation coefficient Γ_b are extracted for each of the 10 subbands (S605, S607, S609). The cue parameters, consisting of the ICLD and the subband-level channel correlation coefficient Γ_b extracted in step S609 and the frame-level channel correlation coefficient Γ measured in step S603, are then transmitted to the multichannel parametric decoder (S611).

When the frame-level channel correlation coefficient Γ is between 0.5 and 0.8, the ICLD is extracted for each of the 19 subbands of the 19-subband split signal (S605, S607, S613). The cue parameters, consisting of the ICLD extracted in step S613 together with the subband-level channel correlation coefficient Γ_b and the frame-level channel correlation coefficient Γ measured in step S603, are then transmitted to the multichannel parametric decoder (S615).

When the frame-level channel correlation coefficient Γ is between 0 and 0.5, the 19-subband split signal is re-divided into a 32-subband signal according to Table 4, and the channel ID and the subband-level channel correlation coefficient Γ_b are extracted for each of the 32 subbands (S605, S617). The cue parameters, consisting of the channel ID and the subband-level channel correlation coefficient Γ_b extracted in step S617 and the frame-level channel correlation coefficient Γ measured in step S603, are then transmitted to the multichannel parametric decoder (S619).
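The three encoder branches of FIG. 6 can be summarized in a small lookup, shown here as a sketch under stated assumptions. The function name, the dictionary keys, and the string labels are hypothetical; the thresholds, subband counts, table references, cue parameter sets, and step labels are taken from the description above. Treatment of the exact boundary values 0.5 and 0.8 is an assumption.

```python
def encoder_cue_plan(gamma, first=0.5, second=0.8):
    """Return the subband layout and cue parameters chosen for one frame.

    gamma is the frame-level channel correlation coefficient measured in S603.
    """
    if gamma > second:
        # High correlation: re-divide 19 -> 10 subbands, send ICLD (S609, S611).
        return {"subbands": 10, "redivide_by": "Table 3",
                "cues": ("ICLD", "Gamma_b", "Gamma"), "send_step": "S611"}
    if gamma > first:
        # Medium correlation: keep the 19 subbands, send ICLD (S613, S615).
        return {"subbands": 19, "redivide_by": None,
                "cues": ("ICLD", "Gamma_b", "Gamma"), "send_step": "S615"}
    # Low correlation: re-divide 19 -> 32 subbands, send channel ID (S617, S619).
    return {"subbands": 32, "redivide_by": "Table 4",
            "cues": ("channel ID", "Gamma_b", "Gamma"), "send_step": "S619"}


plan = encoder_cue_plan(0.3)
print(plan["subbands"], plan["cues"][0], plan["send_step"])  # 32 channel ID S619
```

Only the low-correlation branch replaces the ICLD with the channel ID; the two correlation coefficients are sent in every branch.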

FIG. 7 is a flowchart illustrating the multichannel parametric decoding procedure according to the present invention.

The single-channel audio signal is FFT-transformed (S701). The frame-level channel correlation coefficient Γ transmitted from the multichannel parametric encoder 101 is observed (S703).

When the frame-level channel correlation coefficient Γ is greater than 0.8, the FFT-transformed signal is divided into a 10-subband signal according to Table 3, and the signal is decoded subband by subband using the transmitted ICLD and subband-level channel correlation coefficient Γ_b (S705, S707, S709).

When the frame-level channel correlation coefficient Γ is between 0.5 and 0.8, the FFT-transformed signal is divided into a 19-subband signal according to Table 1, and the signal is decoded subband by subband using the transmitted ICLD and subband-level channel correlation coefficient Γ_b (S705, S707, S711).

When the frame-level channel correlation coefficient Γ is between 0 and 0.5, the FFT-transformed signal is divided into a 32-subband signal according to Table 4, and the signal is decoded subband by subband using the transmitted channel ID and subband-level channel correlation coefficient Γ_b (S705, S713).

The method of the present invention as described above may be implemented as a program and stored on a recording medium in a computer-readable form.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited thereto. Those of ordinary skill in the art to which the present invention pertains may make various modifications and variations within the technical spirit of the present invention and the scope of equivalents of the claims set forth below.

Claims

A multichannel audio signal encoding apparatus comprising:

signal dividing means for dividing a multichannel audio signal in the frequency domain;

downmixing means for downmixing the divided signal into a single channel and encoding the downmixed signal; and

parameter analysis means for measuring a frame-level channel correlation of the divided signal and re-dividing the divided signal according to the frame-level channel correlation,

wherein, when the frame-level channel correlation is less than or equal to a first predetermined value, the parameter analysis means extracts and encodes, as cue parameters for decoding, a channel ID and a subband-level channel correlation for each subband of the re-divided signal.

The apparatus of claim 1, wherein the parameter analysis means re-divides the divided signal into a larger number of subbands as the frame-level channel correlation decreases.

The apparatus of claim 1, wherein the frame-level channel correlation is obtained from the subband-level channel correlations of the divided signal.

The apparatus of claim 3, wherein the frame-level channel correlation is determined by the subband-level channel correlation of the low-frequency band rather than by the subband-level channel correlation of the high-frequency band.

The apparatus of claim 1, wherein the channel ID for a re-divided subband b is the index of the channel in which the power of the re-divided subband b is maximum.

The apparatus of claim 1, wherein, when the frame-level channel correlation is greater than or equal to the first predetermined value, the parameter analysis means extracts and encodes an inter-channel level difference and a subband-level channel correlation for each subband of the re-divided signal.

The apparatus of claim 1 or 6, wherein the parameter analysis means additionally extracts, as the cue parameter, an inter-channel level difference for each subband of the re-divided signal.

The apparatus of claim 1, wherein the parameter analysis means:

re-divides the divided signal into a number of subbands greater than the number of subbands of the divided signal when the frame-level channel correlation is less than or equal to the first predetermined value;

re-divides the divided signal into a number of subbands smaller than the number of subbands of the divided signal when the frame-level channel correlation is greater than or equal to a second predetermined value that is greater than the first predetermined value; and

performs no re-division when the frame-level channel correlation is between the first predetermined value and the second predetermined value.

The apparatus of claim 8, wherein, when the frame-level channel correlation is between the first predetermined value and the second predetermined value, the parameter analysis means extracts and encodes, as the cue parameters, an inter-channel level difference and a subband-level channel correlation for each subband of the divided signal.

A multichannel audio signal decoding apparatus for decoding an audio signal in which a multichannel audio signal divided in the frequency domain has been downmixed into a single channel and encoded, and in which the divided signal has been re-divided according to a frame-level channel correlation of the divided multichannel audio signal, the apparatus comprising:

single-channel decoding means for decoding the encoded single-channel audio signal; and

parameter-based multichannel generation means for dividing the decoded single-channel audio signal in the frequency domain, according to the frame-level channel correlation, using the same subband intervals as those used in the re-division,

wherein, when the frame-level channel correlation is less than or equal to a first predetermined value, the parameter-based multichannel generation means decodes the single-channel audio signal using a channel ID and a subband-level channel correlation for each subband of the re-divided signal.

The apparatus of claim 10, wherein the parameter-based multichannel generation means divides the decoded single-channel audio signal into a larger number of subbands as the frame-level channel correlation decreases.

The method of claim 10,

The channel ID for the subdivided subband b is

Multi-channel audio signal decoding device.

The apparatus of claim 10, wherein, when the frame-level channel correlation is greater than or equal to the first predetermined value, the parameter-based multichannel generation means decodes using an inter-channel level difference and a subband-level channel correlation for each subband of the re-divided signal.

The apparatus of claim 10 or 13, wherein the parameter-based multichannel generation means decodes by additionally using an inter-channel time delay difference for each subband of the re-divided signal.

The apparatus of claim 12, wherein, when the channel ID for subband b is c' (1 ≤ c' ≤ C, where C is the number of channels), the parameter-based multichannel generation means assigns the subband-b signal of the single-channel audio signal to the subband-b signal of channel c'.

The apparatus of claim 15, wherein the parameter-based multichannel generation means generates the remaining subband signals from the single-channel audio signal by reflecting a component representing the spatial impression of the sound source arising between channels.

A multichannel audio signal encoding method comprising:

dividing a multichannel audio signal in the frequency domain;

downmixing the divided signal into a single channel and encoding the downmixed signal;

measuring a frame-level channel correlation of the divided signal;

re-dividing the divided signal according to the frame-level channel correlation; and

extracting and encoding a channel ID and a subband-level channel correlation for each subband of the re-divided signal when the frame-level channel correlation is less than or equal to a first predetermined value.

The method of claim 17,

The repartition stage

Multi-channel audio signal coding method.

The method of claim 17,

Multi-channel audio signal coding method.

The method of claim 19,

Multi-channel audio signal coding method.

The method of claim 17,

The channel ID for the subdivided subband b is

Multi-channel audio signal coding method.

The method of claim 17, further comprising extracting and encoding an inter-channel level difference and a subband-level channel correlation for each subband of the re-divided signal when the frame-level channel correlation is greater than or equal to the first predetermined value.

The method of claim 17 or 23, further comprising extracting an inter-channel level difference for each subband of the re-divided signal.

The method of claim 17, wherein the re-dividing step comprises:

re-dividing the divided signal into a number of subbands greater than the number of subbands of the divided signal when the frame-level channel correlation is less than or equal to the first predetermined value;

re-dividing the divided signal into a number of subbands smaller than the number of subbands of the divided signal when the frame-level channel correlation is greater than or equal to a second predetermined value that is greater than the first predetermined value; and

performing no re-division when the frame-level channel correlation is between the first predetermined value and the second predetermined value.

The method of claim 24, further comprising extracting and encoding an inter-channel level difference and a subband-level channel correlation for each subband of the divided signal when the frame-level channel correlation is between the first predetermined value and the second predetermined value.

A multichannel audio signal decoding method for decoding an audio signal in which a multichannel audio signal divided in the frequency domain has been downmixed into a single channel and encoded, and in which the divided signal has been re-divided according to a frame-level channel correlation of the divided multichannel audio signal, the method comprising:

decoding the encoded single-channel audio signal;

dividing the decoded single-channel audio signal in the frequency domain, according to the frame-level channel correlation, using the same subband intervals as those used in the re-division; and

decoding the single-channel audio signal using a channel ID and a subband-level channel correlation for each subband of the re-divided signal when the frame-level channel correlation is less than or equal to a first predetermined value.

The method of claim 26,

The dividing step

Multi-channel audio signal decoding method.

The method of claim 26,

The channel ID for the subdivided subband b is

Multi-channel audio signal decoding method.

The method of claim 26, further comprising decoding using an inter-channel level difference and a subband-level channel correlation for each subband of the re-divided signal when the frame-level channel correlation is greater than or equal to the first predetermined value.

The method of claim 26 or 29, further comprising decoding by additionally using an inter-channel time delay difference for each subband of the re-divided signal.

The method of claim 28,

Multi-channel audio signal decoding method.

The method of claim 31, wherein the remaining subband signals are generated from the single-channel audio signal by reflecting a component representing the spatial impression of the sound source arising between channels.