JP2013050663A

JP2013050663A - Multi-channel sound coding device and program thereof

Info

Publication number: JP2013050663A
Application number: JP2011189741A
Authority: JP
Inventors: Akio Ando; 彰男安藤; Satoshi Oishi; 諭大石
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2011-08-31
Filing date: 2011-08-31
Publication date: 2013-03-14

Abstract

PROBLEM TO BE SOLVED: To provide a multi-channel sound coding device and a program thereof for performing bit allocation based on the relevance between channels of a transmission signal when encoding the transmission signal into which an input signal of a multi-channel sound method is converted by matrix conversion.

SOLUTION: A multi-channel sound coding device 1 according to the present invention comprises: a matrix converting section 20 that converts, by matrix conversion, an input signal of a multi-channel sound method into a transmission signal having a plurality of channels; a calculating section 32 that calculates a scale factor for each of the channels on the basis of an auditory model for each channel of the transmission signal; a correlation calculating section 34 that calculates a correlation coefficient between channels of the transmission signal; a selecting section 35 that selects, for a channel group of the transmission signal having the correlation coefficient equal to or greater than a threshold, an applied scale factor based on the scale factors of the channel group; and a bit allocating section 36 that performs bit allocation according to the applied scale factor for the transmission signals of the channel group.

Description

The present invention relates to a multi-channel acoustic encoding device and a program therefor.

The Association of Radio Industries and Businesses (ARIB) has standardized a scheme that uses existing AAC (Advanced Audio Coding) to transmit and encode 22.2-channel sound. AAC can encode audio at roughly 1/10 of the bit rate while maintaining CD (compact disc) quality (see, for example, Non-Patent Document 1).

When transmitting and encoding 22.2-channel sound, signal conversion such as matrix conversion is needed to ensure compatibility with conventional formats such as 2-channel stereo without increasing the number of transmission channels. Conventional AAC encoding using matrix conversion is described with reference to FIG. 3. FIG. 3 outlines processing in which an input signal, a 22.2-channel acoustic signal, is converted by matrix conversion into a 2-channel transmission signal and transmitted using AAC. First, in the transmission block, the 22.2-channel input signal is matrix-converted into a transmission signal that includes, for example, two channels: a basic signal and an auxiliary signal. Here, the basic signal is a signal of 8 to 10 channels representing the main spatial information of the 22.2-channel acoustic signal, and the auxiliary signal complements the basic signal so that the original 22.2-channel acoustic signal can be restored.

Next, in the transmission block, the basic signal and the auxiliary signal are AAC-encoded independently of each other. In AAC encoding, each transmission signal is frequency-analyzed, prominent frequency components are detected, and a masking curve is calculated that represents the upper limit of the frequency components rendered inaudible (masked) by those prominent components. Bit allocation to frequency components below the masking curve is reduced, and bits are allocated so as to tolerate quantization noise that stays below the masking curve.

When the AAC-encoded transmission signal is transmitted from the transmission block to the reception block, the reception block AAC-decodes the basic signal and the auxiliary signal independently and restores the 22.2-channel acoustic signal by inverse matrix conversion.

Non-Patent Document 1: Marina Bosi, Richard E. Goldberg, "Introduction to Digital Audio Coding and Standards", Springer, 2002-12-31

Conventional AAC encoding, however, has not considered the processing needed to encode signals that have undergone matrix conversion, as when a 22.2-channel acoustic input signal is downmixed to a transmission signal with fewer channels for transmission.

For example, in the conventional AAC encoding shown in FIG. 3, the transmission block processes each channel of the matrix-converted transmission signal (the basic signal and the auxiliary signal) independently. That is, for a transmission signal in which multiple multi-channel acoustic signals have been mixed by matrix conversion, bit allocation based on the masking curve is performed separately for each channel after matrix conversion. As a result, depending on the channel, a particular component may be retained in one channel and removed in another. After inverse matrix conversion, components of the original multi-channel acoustic signal then cannot be restored, or noise arises because a component that should have cancelled out is no longer present.
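The cancellation problem described above can be illustrated with a minimal sketch. It assumes a simple two-channel sum/difference matrix and a uniform quantizer, neither of which is taken from the source; the function names and step sizes are hypothetical and chosen only to make the effect visible.

```python
def quantize(value, step):
    """Uniform quantizer: snap a value to the nearest multiple of step."""
    return step * round(value / step)


def encode_decode(left, right, step_sum, step_diff):
    """Matrix-convert (sum/difference), quantize each transmission
    channel with its own step, then invert the matrix.

    The sum/difference matrix is an illustrative assumption, not the
    matrix used in the source.
    """
    s = (left + right) / 2   # transmission channel 1
    d = (left - right) / 2   # transmission channel 2
    s_q = quantize(s, step_sum)
    d_q = quantize(d, step_diff)
    # Inverse matrix conversion
    return s_q + d_q, s_q - d_q


# The right channel is silent, so s and d must cancel exactly on decoding.
independent = encode_decode(90, 0, step_sum=4, step_diff=20)
shared = encode_decode(90, 0, step_sum=4, step_diff=4)

print(independent)  # residual appears in the silent right channel
print(shared)       # identical quantization keeps the cancellation intact
```

With different step sizes the two transmission channels are rounded differently, so the component that should cancel leaves a residue in the silent channel. With a shared step size the rounding error is identical in both channels and cancels.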

Accordingly, an object of the present invention, made in view of these points, is to provide a multi-channel acoustic encoding device and a program therefor that can perform bit allocation based on the relationships between the channels of a transmission signal when encoding a transmission signal obtained by matrix conversion of a multi-channel acoustic input signal.

To solve the problems described above, a multi-channel acoustic encoding device according to the present invention comprises a matrix conversion unit that converts a multi-channel acoustic input signal into a transmission signal of a plurality of channels by matrix conversion, and an encoding unit that performs bit allocation to the transmission signal for each channel of the transmission signal. The encoding unit comprises: a calculation unit that calculates a scale factor for each channel based on an auditory model for each channel of the transmission signal; a correlation calculation unit that calculates correlation coefficients between the channels of the transmission signal; a selection unit that selects, for a channel group of the transmission signal whose correlation coefficient is equal to or greater than a threshold, an applied scale factor based on the scale factors of the channel group; and a bit allocation unit that performs bit allocation using the applied scale factor for the transmission signals of the channel group.

Preferably, the correlation calculation unit calculates a correlation coefficient for each subband of each channel of the transmission signal, and the selection unit changes, for each subband, the method of selecting the applied scale factor according to the correlation coefficient.

Preferably, the selection unit performs at least one of smoothing of the applied scale factor based on past applied scale factors and smoothing of the applied scale factor between channels in adjacent frequency bands.

Although the solution of the present invention has been described above as a device, the present invention can also be realized as a method, a program, or a storage medium storing the program that substantially corresponds to the device, and these should be understood to fall within the scope of the present invention.

For example, the invention realized as a program causes a computer to execute: a procedure for converting a multi-channel acoustic input signal into a transmission signal of a plurality of channels by matrix conversion; a procedure for calculating a scale factor for each channel based on an auditory model for each channel of the transmission signal; a procedure for calculating correlation coefficients between the channels of the transmission signal; a procedure for selecting, for a channel group of the transmission signal whose correlation coefficient is equal to or greater than a threshold, an applied scale factor based on the scale factors of the channel group; and a procedure for performing bit allocation using the applied scale factor for the transmission signals of the channel group.

According to the multi-channel acoustic encoding device and program of the present invention, bit allocation based on the relationships between the channels of a transmission signal can be performed when encoding a transmission signal obtained by matrix conversion of a multi-channel acoustic input signal.

FIG. 1 is a diagram showing the functional blocks of a multi-channel acoustic encoding device according to an embodiment of the present invention.
FIG. 2 is a diagram showing the detailed functional blocks of the encoding unit.
FIG. 3 is a diagram outlining processing in which a multi-channel input signal is converted into a 2-channel transmission signal by matrix conversion and transmitted using AAC.

Embodiments of the present invention are described in detail below with reference to the drawings. The following description uses 22.2-channel sound, the sound system for Super Hi-Vision, as an example of a multi-channel sound system, but note that the present invention is not limited to 22.2-channel sound.

FIG. 1 is a functional block diagram of a multi-channel acoustic encoding device 1 according to an embodiment of the present invention. The multi-channel acoustic encoding device 1 includes an acoustic signal input unit 10, a matrix conversion unit 20, an encoding unit 30, and a transmission unit 40.

A computer can suitably be used to function as the multi-channel acoustic encoding device 1. In that case, a program describing the processing that realizes each function of the multi-channel acoustic encoding device 1 is stored in a storage unit (not shown) of the computer, and the central processing unit (CPU) of the computer reads and executes the program.

The acoustic signal input unit 10 A/D-converts the incoming 22.2-channel acoustic signal and outputs the resulting digital acoustic signal to the matrix conversion unit 20 as the input signal.

The matrix conversion unit 20 downmixes the input signal, a 22.2-channel acoustic signal, into a transmission signal with fewer channels by matrix conversion, and outputs the matrix-converted transmission signal to the encoding unit 30.
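As a rough illustration of the downmix performed by the matrix conversion unit 20, the sketch below multiplies one frame of input samples by a downmix matrix. The 2x4 matrix, its coefficients, and the function name are hypothetical; the source does not give the actual 22.2-channel matrix.

```python
def matrix_convert(matrix, frame):
    """Multiply a (num_out x num_in) matrix by one frame of input
    samples, giving one sample per transmission channel."""
    return [sum(c * x for c, x in zip(row, frame)) for row in matrix]


# Hypothetical downmix of 4 input channels to 2 transmission channels.
# These coefficients are illustrative only, not taken from the source.
DOWNMIX = [
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, -0.5],
]

frame = [0.2, -0.4, 0.6, 0.2]          # one sample per input channel
transmission = matrix_convert(DOWNMIX, frame)
print(transmission)
```

Because each transmission channel is a weighted mix of several input channels, errors introduced separately in each transmission channel no longer line up when the matrix is inverted at the receiver.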

The encoding unit 30 performs bit allocation to the transmission signal for each channel of the transmission signal. FIG. 2 shows the detailed functional blocks of the encoding unit 30. The encoding unit 30 comprises a masking curve calculation unit 31, a subband scale factor calculation unit 32, a subband division unit 33, an inter-channel correlation calculation unit 34, a scale factor selection unit 35, and a bit allocation unit 36. The masking curve calculation unit 31, the subband scale factor calculation unit 32, the subband division unit 33, and the bit allocation unit 36 may be provided independently for each channel of the transmission signal; for convenience, the functional blocks for each channel in FIG. 2 are labeled with a channel suffix (for example, a to c) appended to the reference numeral. Although FIG. 2 shows the functional blocks for channels b and c only schematically, note that a masking curve calculation unit 31, a subband scale factor calculation unit 32, a subband division unit 33, and a bit allocation unit 36 can be configured for channels b and c just as for channel a.

The masking curve calculation unit 31 calculates an auditory model for each channel of the transmission signal. The auditory model for each channel is calculated from, for example, a masking curve, based on the auditory characteristic that a sound with relatively low energy cannot be heard in frequency bands near a high-energy signal, or a minimum audible curve, based on the auditory characteristic that at each frequency there is an energy level below which a sound cannot be heard. Because the calculation of masking curves and minimum audible curves is well known to those skilled in the art, it is not detailed here. For convenience, the following description uses a masking curve as the auditory model, but the present invention is not limited to this. The masking curve calculation unit 31 outputs the calculated masking curve of each channel to the subband scale factor calculation unit 32.

The subband scale factor calculation unit 32 calculates a scale factor for each subband of each channel from the masking curve of each channel. Specifically, it finds the sample with the largest absolute value in the subband signal of each channel, converts that value to a logarithm, and quantizes it to obtain the scale factor. Because the calculation of per-subband scale factors is well known to those skilled in the art, it is not detailed here. The subband scale factor calculation unit 32 outputs the calculated scale factor for each subband of each channel to the scale factor selection unit 35.
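The peak-to-logarithm step described above might be sketched as follows. The decibel conversion, the 1.5 dB quantization step, and the silence floor are assumptions made for illustration; the source states only that the peak value is converted to a logarithm and quantized, and this sketch omits the masking curve entirely.

```python
import math


def subband_scale_factor(samples, step_db=1.5, floor=1e-9):
    """Return a quantized log-domain scale factor for one subband.

    step_db and floor are hypothetical parameters, not values from
    the source.
    """
    # Sample with the largest absolute value in the subband
    peak = max(abs(s) for s in samples)
    # Convert to a logarithmic level; floor avoids log10(0) on silence
    level_db = 20.0 * math.log10(max(peak, floor))
    # Quantize the level to an integer index
    return round(level_db / step_db)


print(subband_scale_factor([0.1, -0.5, 0.25]))
print(subband_scale_factor([1.0, 0.3]))
```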

The subband division unit 33 divides the transmission signal of each channel into subbands and outputs the subband signals to the inter-channel correlation calculation unit 34.

The inter-channel correlation calculation unit 34 calculates the correlation coefficients between the channels of the transmission signal. Specifically, for each subband, it calculates the correlation coefficient between channels from the variation of each channel's transmission signal over a fixed time interval. Channels whose transmission signals vary in a similar way over that interval yield a high correlation coefficient, and channels whose signals vary differently yield a low one. The inter-channel correlation calculation unit 34 supplies the calculated correlation coefficients to the scale factor selection unit 35.
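One common way to obtain such a coefficient is the Pearson correlation over the samples of a time interval. The source does not name a specific formula, so the choice of Pearson correlation here is an assumption, as is the function name.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation between two equal-length subband signals
    taken from the same time interval (assumed formula)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A constant signal has no variation; treat it as uncorrelated
    return cov / denom if denom > 0 else 0.0


ch_a = [1.0, 2.0, 3.0, 4.0]
ch_b = [2.0, 4.0, 6.0, 8.0]   # varies the same way as ch_a
ch_c = [4.0, 3.0, 2.0, 1.0]   # varies the opposite way

print(correlation_coefficient(ch_a, ch_b))
print(correlation_coefficient(ch_a, ch_c))
```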

The scale factor selection unit 35 selects, for the transmission signals of a channel group whose correlation coefficient is equal to or greater than a fixed value, an applied scale factor based on the per-subband scale factors of those transmission signals. Here, the applied scale factor means the scale factor used for bit allocation of the transmission signals of a channel group whose correlation coefficient is equal to or greater than the fixed value.

For example, as in Equation 1, the scale factor selection unit 35 can select, for each subband, the maximum of the scale factors of the channels belonging to the channel group as the applied scale factor. Here, G(k) denotes the applied scale factor for the k-th subband, and F_j(k) denotes the scale factor of the k-th subband of the j-th channel belonging to the channel group. In this case, the applied scale factor is common to the transmission signals of all channels in the channel group. Using the maximum of the channels' scale factors as the applied scale factor gives the finest quantization granularity, so quantization accuracy improves but quantization efficiency decreases.

Alternatively, as in Equation 2, the scale factor selection unit 35 can select, for each subband, the minimum of the scale factors of the channels belonging to the channel group as the applied scale factor. In this case too, the applied scale factor is common to the transmission signals of all channels in the channel group. Using the minimum of the channels' scale factors as the applied scale factor gives the coarsest quantization granularity, so quantization accuracy decreases but quantization efficiency improves.
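The formula images for Equations 1 and 2 are not reproduced in this text. Based only on the prose definitions of G(k) and F_j(k), they can plausibly be written as below. This is a reconstruction, and the assumption is that j ranges over the channels belonging to the channel group.

```latex
\begin{align}
G(k) &= \max_{j} F_j(k) \tag{1} \\
G(k) &= \min_{j} F_j(k) \tag{2}
\end{align}
```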

Furthermore, as in Equation 3, the scale factor selection unit 35 can select the applied scale factor of each channel based on the correlation coefficients between channels. In Equation 3, G_i(k) denotes the applied scale factor for the transmission signal of the i-th channel in the k-th subband, and r_ij(k) denotes the correlation coefficient between the i-th and j-th channels in the k-th subband. In this case, the applied scale factor takes a different value for each channel of the transmission signal. Combining the scale factors according to the correlation coefficients between channels makes it possible to quantize each channel in a way suited to that channel while limiting the differences in quantization accuracy between channels, which improves quantization efficiency.

Furthermore, the scale factor selection unit 35 can change the method of selecting the applied scale factor for each subband according to the correlation coefficient. Specifically, thresholds Th_1 and Th_2 (Th_1 > Th_2) are set in advance and, as shown in Equation 4, when the correlation coefficient in a subband k is higher than Th_1, an applied scale factor common to the channels is used and is selected as in Equation (1); when the correlation coefficient is higher than Th_2 and equal to or lower than Th_1, an applied scale factor that takes the degree of correlation between channels into account is used and is selected as in Equation (3). When the correlation coefficient is less than Th_2, the correlation between channels is regarded as low, and the scale factor of each channel is used as is for that channel's bit allocation.
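The threshold-based switching can be sketched for a single subband and a pair of channels as follows. The exact form of Equation (3) is not reproduced in this text, so the correlation-weighted blend in the middle branch is a hypothetical stand-in, not the source's formula. The threshold values and function name are likewise assumptions.

```python
def select_applied_scale_factors(f_i, f_j, r, th1=0.9, th2=0.5):
    """Choose applied scale factors for channels i and j in one subband.

    f_i, f_j : per-channel scale factors for this subband
    r        : correlation coefficient between the two channels
    th1, th2 : hypothetical thresholds with th1 > th2
    """
    if r > th1:
        # High correlation: one common value, as in Equation (1)
        g = max(f_i, f_j)
        return g, g
    if r > th2:
        # Medium correlation: per-channel values that lean toward the
        # other channel in proportion to r. Stand-in for Equation (3),
        # whose actual form is not given in this text.
        g_i = (f_i + r * f_j) / (1 + r)
        g_j = (f_j + r * f_i) / (1 + r)
        return g_i, g_j
    # Low correlation: keep each channel's own scale factor.
    # The source does not state what happens at r == th2 exactly;
    # this sketch treats it as low correlation.
    return f_i, f_j


print(select_applied_scale_factors(10, 6, 0.95))
print(select_applied_scale_factors(10, 6, 0.6))
print(select_applied_scale_factors(10, 6, 0.2))
```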

The selection of the applied scale factor is not limited to Equations (1) to (4); for example, the scale factor selection unit 35 can select the average of the channels' scale factors as the applied scale factor.

Furthermore, the scale factor selection unit 35 can apply various kinds of smoothing to the scale factors, such as smoothing of the applied scale factor based on past applied scale factors and smoothing of the applied scale factor between channels in adjacent frequency bands.
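Both kinds of smoothing can be sketched with very simple filters. The source does not specify the smoothing method, so the first-order recursive filter over time, the three-tap average over adjacent subbands, and the parameter alpha are all assumptions.

```python
def smooth_over_time(previous, current, alpha=0.5):
    """Blend the previous frame's applied scale factors into the
    current ones, subband by subband (alpha is hypothetical)."""
    return [alpha * p + (1 - alpha) * c for p, c in zip(previous, current)]


def smooth_over_frequency(values):
    """Average each subband's applied scale factor with its immediate
    neighbours; edge subbands use the neighbours that exist."""
    smoothed = []
    for k in range(len(values)):
        window = values[max(0, k - 1):min(len(values), k + 2)]
        smoothed.append(sum(window) / len(window))
    return smoothed


print(smooth_over_time([4.0, 8.0], [8.0, 8.0]))
print(smooth_over_frequency([0.0, 6.0, 0.0]))
```

Either filter narrows the jump between neighbouring values, which is the effect the text attributes to smoothing.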

For the transmission signals of a channel group whose correlation coefficient is equal to or greater than the fixed value, the scale factor selection unit 35 outputs the applied scale factor obtained by Equations (1) to (4) or the like to the bit allocation unit 36; for channels whose correlation coefficient is less than the fixed value, it outputs each channel's scale factor to the bit allocation unit 36 unchanged.

The bit allocation unit 36 performs bit allocation to the transmission signal for each channel based on the applied scale factor or scale factor received from the scale factor selection unit 35, and outputs the encoded transmission signal to the transmission unit 40.

The transmission unit 40 transmits the transmission signal encoded by the encoding unit 30 to the receiving side.

As described above, according to this embodiment, the inter-channel correlation calculation unit 34 calculates the correlation coefficients between the channels of the transmission signal; the scale factor selection unit 35 selects, for a channel group of the transmission signal whose correlation coefficient is equal to or greater than a threshold, an applied scale factor based on the scale factors of that channel group; and the bit allocation unit 36 performs bit allocation using the applied scale factor for the transmission signals of that channel group. This makes it possible to perform bit allocation based on the relationships between the channels of the transmission signal when encoding a transmission signal obtained by matrix conversion of a multi-channel acoustic input signal. For example, for the transmission signals of a channel group whose correlation coefficient is equal to or greater than a fixed value, bit allocation can use an applied scale factor common to the channels, taken as the minimum or maximum of the channels' scale factors, so that quantization accuracy and quantization efficiency can be adjusted as required. Consequently, when quantization accuracy is raised, for instance, the signal loss and noise caused by inverse matrix conversion, which were problems in the conventional approach, can be reduced. In particular, encoding and transmission with little quantization noise become possible even when, for example, a 22.2-channel acoustic signal is separated into a basic signal of 8 to 10 channels representing the main spatial information and an auxiliary signal for restoring the original signal, with a high bit rate assigned to the main spatial information and the bit rate of the auxiliary signal held down. Likewise, even when a 22.2-channel acoustic signal for Super Hi-Vision is transmitted as, for example, a 2-channel stereo signal or a 5.1-channel signal, it can be encoded and transmitted at high quality and at roughly 1/10 of the bit rate.

Also according to this embodiment, the inter-channel correlation calculation unit 34 calculates a correlation coefficient for each subband of each channel of the transmission signal, and the scale factor selection unit 35 changes, for each subband, the method of selecting the applied scale factor according to the correlation coefficient. As a result, a common scale factor can be used for subbands with a high correlation coefficient and a correlation-based scale factor can be set for subbands with a medium correlation coefficient, so that quantization accuracy and quantization efficiency can be controlled at a finer granularity.

Also according to this embodiment, the scale factor selection unit 35 can apply various kinds of smoothing to the scale factors, such as smoothing of the applied scale factor based on past applied scale factors and smoothing of the applied scale factor between channels in adjacent frequency bands. This reduces the differences in scale factor between adjacent time periods and adjacent frequency bands, and so reduces the signal loss and noise caused by abrupt changes in the scale factor.

Although the present invention has been described with reference to the drawings and embodiments, note that those skilled in the art can readily make various variations and modifications based on this disclosure, and that such variations and modifications fall within the scope of the present invention. For example, the functions included in each member, means, or step can be rearranged so long as no logical contradiction results, and multiple means or steps can be combined into one or divided.

For example, the above embodiment was described with 22.2-channel sound as the multi-channel acoustic input signal and a 2-channel stereo signal as the transmission signal after matrix conversion, but the present invention is of course applicable to any multi-channel sound system, such as 5.1-channel or 7.1-channel sound, and to any processing in which encoding and transmission involve matrix conversion of the signal. It is also applicable to acoustic signals such as Ambisonics, as signals that involve linear processing such as matrix conversion.

The above embodiment was described using AAC as the example encoding scheme, and AAC encoding in the present invention encompasses every version of AAC, including MPEG-2 AAC, MPEG-4 AAC, and HE-AAC. The encoding to which the present invention can be applied is not limited to AAC; any encoding scheme that encodes at high quality based on human auditory characteristics can be used.

The present invention is useful in that it makes it possible to perform bit allocation based on the relationships between the channels of a transmission signal when encoding a transmission signal obtained by matrix conversion of a multi-channel acoustic input signal.

DESCRIPTION OF SYMBOLS
1 Multi-channel acoustic encoding device
10 Acoustic signal input unit
20 Matrix conversion unit
30 Encoding unit
31 Masking curve calculation unit
32 Subband scale factor calculation unit (calculation unit)
33 Subband division unit
34 Inter-channel correlation calculation unit (correlation calculation unit)
35 Scale factor selection unit (selection unit)
36 Bit allocation unit
40 Transmission unit

Claims

1. A multi-channel acoustic encoding device comprising:
a matrix conversion unit that converts a multi-channel acoustic input signal into a transmission signal of a plurality of channels by matrix conversion; and
an encoding unit that performs bit allocation to the transmission signal for each channel of the transmission signal,
wherein the encoding unit includes:
a calculation unit that calculates a scale factor for each channel based on an auditory model for each channel of the transmission signal;
a correlation calculation unit that calculates correlation coefficients between the channels of the transmission signal;
a selection unit that selects, for a channel group of the transmission signal whose correlation coefficient is equal to or greater than a threshold, an applied scale factor based on the scale factors of the channel group; and
a bit allocation unit that performs bit allocation using the applied scale factor for the transmission signals of the channel group.

2. The multi-channel acoustic encoding device according to claim 1, wherein
the correlation calculation unit calculates a correlation coefficient for each subband of each channel of the transmission signal, and
the selection unit changes, for each subband, the method of selecting the applied scale factor according to the correlation coefficient.

3. The multi-channel acoustic encoding device according to claim 1 or 2, wherein the selection unit performs at least one of smoothing of the applied scale factor based on past applied scale factors and smoothing of the applied scale factor between channels in adjacent frequency bands.

4. A program for causing a computer to execute:
a procedure for converting a multi-channel acoustic input signal into a transmission signal of a plurality of channels by matrix conversion;
a procedure for calculating a scale factor for each channel based on an auditory model for each channel of the transmission signal;
a procedure for calculating correlation coefficients between the channels of the transmission signal;
a procedure for selecting, for a channel group of the transmission signal whose correlation coefficient is equal to or greater than a threshold, an applied scale factor based on the scale factors of the channel group; and
a procedure for performing bit allocation using the applied scale factor for the transmission signals of the channel group.