JP2007221216A

JP2007221216A - Mix-down method and apparatus

Info

Publication number: JP2007221216A
Application number: JP2006036443A
Authority: JP
Inventor: Hitoshi Ishida (石田 斉)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2006-02-14
Filing date: 2006-02-14
Publication date: 2007-08-30
Anticipated expiration: 2026-02-14
Also published as: JP4997781B2

Abstract

PROBLEM TO BE SOLVED: To provide a mix-down method and a mix-down apparatus capable of executing mix-down with an excellent sound volume balance.
SOLUTION: The mix-down apparatus multiplies the scale factor of each received acoustic signal by a preset static gain value and outputs the multiplication result corresponding to each acoustic signal. It then normalizes these multiplication results to calculate a dynamic gain value for each acoustic signal, multiplies each of the received acoustic signals by its dynamic gain value, and sums the products to calculate the output signal of an output system.
COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a mixdown method and a mixdown apparatus that dynamically determine the gain of each channel for mixdown in digital audio.

In recent years, audio reproduction in the AV (audio-visual) field has adopted multi-channel reproduction systems in which a center channel and surround channels are added to the conventional L/R-channel stereo signal. Reproducing the original sound field with such a multi-channel system requires at least one speaker placed behind or beside the viewer.

However, to meet viewers' demand to play back multi-channel acoustic signals through only two L/R-channel speakers, the acoustic signal transmitted in multiple channels must be converted into two channels.

This calls for mixdown, in which the acoustic signals of a multi-channel input source are converted into, and output as, fewer channels than there are on the input side.

As one example of such a mixdown technique, Patent Document 1 reports a downmixing apparatus for multi-channel stereo.

According to Patent Document 1, the input L/R-channel signals are added together, and whether the multi-channel acoustic signal to be mixed down is music or speech is determined from the summed L/R-channel signal. Based on the result, the gain coefficient of the volume adjustment unit is changed to a value that gives a sense of presence for music, or to a value that ensures clarity for speech. This has the advantage that sound reproduction suited to each program source can be achieved even with two-channel speaker playback.
Patent Document 1: JP-A-6-165079

Thus, Patent Document 1 determines from the input L/R-channel signals whether the content is music or speech and changes the gain coefficient for volume adjustment accordingly, achieving sound field reproduction appropriate to music playback and to speech playback.

However, because Patent Document 1 prepares only two gain coefficients for mixdown in advance and switches between them for music and speech, it cannot adapt optimally to the wide variety of acoustic signals. As a result, the moment at which the sound field reproduction switches becomes too noticeable and the volume balance between channels is poor, which causes a sense of incongruity and listener fatigue.

The present invention has been made in view of the above, and its object is to provide a mixdown method and a mixdown apparatus capable of performing mixdown with a good volume balance.

To solve the above problem, the invention according to claim 1 is a mixdown apparatus that mixes a plurality of input acoustic signals and converts them into a number of output signals smaller than the number of the acoustic signals, comprising: gain determination means for determining a dynamic gain value for each acoustic signal based on the scale factor of that acoustic signal; and mixdown means for multiplying each acoustic signal by the dynamic gain value determined for it and outputting, as an output signal, the sum of the multiplication results corresponding to the plurality of acoustic signals.

To solve the above problem, the invention according to claim 3 is a mixdown method of mixing a plurality of input acoustic signals and converting them into a number of output signals smaller than the number of the acoustic signals, comprising: a gain determination step of determining a dynamic gain value for each acoustic signal based on the scale factor of that acoustic signal; and a mixdown step of multiplying each acoustic signal by the dynamic gain value determined for it and outputting, as an output signal, the sum of the multiplication results corresponding to the plurality of acoustic signals.

According to the mixdown method and the mixdown apparatus of the present invention, the dynamic gain values are determined from the scale factors of the input acoustic signals, so mixdown with a good volume balance can be performed.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

(First embodiment)
MPEG-2/4 AAC is a multi-channel audio coding scheme that achieves high sound quality and a high compression ratio by dropping compatibility with MPEG-1 audio. It supports a very wide range of input sampling frequencies, from 8 kHz to 96 kHz. It can also carry up to 48 channels of audio signals, 15 LFE (Low Frequency Enhancement) channels, coupling channels, and general-purpose data streams.

The audio decoding apparatus is the counterpart of an audio encoding apparatus. It decodes an encoded bit stream bs, whether output from the audio encoding apparatus, stored on a medium such as a DVD, or received via the Internet, and reproduces an audio signal from it.

FIG. 1 is a block diagram showing the configuration of an audio decoding apparatus to which the mixdown apparatus of the first embodiment of the present invention is applicable. FIG. 2 is a diagram showing the interconnection of the gain determination unit 19, the time-frequency inverse transform unit 21, and the mixdown unit 23. The mixdown apparatus referred to here is an apparatus that mixes a plurality of input acoustic signals and converts them into a number of output signals smaller than the number of the acoustic signals.

The audio decoding apparatus 11 comprises a syntax decoding unit 13, a Huffman code decoder unit 15, an inverse quantization unit 17, a gain determination unit 19, a time-frequency inverse transform unit 21, and a mixdown unit 23. Each unit is implemented by a DSP (Digital Signal Processor) or by software processing.
The syntax decoding unit 13 separates the input encoded data string bs (bit stream) into a high-efficiency encoded acoustic stream, band signals, normalization information, quantization accuracy information, and the like, according to a prescribed syntax.

The band signals are encoded in the audio encoding apparatus after the original data has been divided into a plurality of frequency bands and each band has been weighted according to human hearing. The normalization information is used to align coefficient positions in the inverse quantization process described later. The quantization accuracy information gives the quantization accuracy (level) of the hierarchical DCT coefficients used in that inverse quantization process.
The Huffman code decoder unit 15 Huffman-decodes the input high-efficiency encoded acoustic stream h according to the encoding format obtained from the side information, and outputs quantized spectra q1 to q6 and scale factors.

The inverse quantization unit 17 reconstructs the original frequency signals f1 to f6 from the decoded quantized spectra q1 to q6 and the scale factors.

The scale factor indicates the quantization step width used in quantization. For example, in MPEG-2/4 AAC, the quantized spectrum q for a given frequency signal f is calculated as
q = INT(f^(3/4) / 2^(sf/4) + 0.4054) (Equation 1)
and the integer sf is called the scale factor. One scale factor is shared by a group of consecutive frequency signals, and the scale factors of channel 1 are written, in order from the low-frequency side, as sf1(1), sf1(2), ..., sf1(N). This embodiment is described using the lowest-band scale factor sfi(1) of each channel i.
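As a rough illustration of Equation 1, the following Python sketch quantizes a single non-negative spectral value exactly as the formula is printed here and shows that a larger scale factor yields a coarser (smaller) quantized value. The function names `quantize` and `dequantize` are hypothetical, sign handling is omitted, and the AAC standard remains the authority on the precise quantizer.

```python
import math


def quantize(f: float, sf: int) -> int:
    """Equation 1 as printed in this document: q = INT(f^(3/4) / 2^(sf/4) + 0.4054).

    Assumption: f is a non-negative magnitude; sign handling is left out.
    """
    return int(math.pow(f, 0.75) / math.pow(2.0, sf / 4.0) + 0.4054)


def dequantize(q: int, sf: int) -> float:
    """Approximate inverse of quantize(): undo the 2^(sf/4) scaling, then the 3/4 power."""
    return math.pow(q * math.pow(2.0, sf / 4.0), 4.0 / 3.0)


if __name__ == "__main__":
    for sf in (0, 4, 8):
        q = quantize(16.0, sf)
        print(f"sf={sf}: q={q}, reconstructed f={dequantize(q, sf):.3f}")
```

In this sketch each increase of sf by 4 halves the value that is rounded, which is why the scale factor can serve as a cheap indicator of signal level.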

The scale factors sf1(1) to sf6(1) take independent values for each channel. In constant-bit-rate encoding, the scale factor is generally small for a low volume and large for a high volume. Here, for example, sf1(1) is the scale factor of channel 1 and sf6(1) is that of channel 6. The channel number represents the position of the sound image: for example, 1 is the center channel, 2 the front left channel, 3 the front right channel, 4 the rear left channel, 5 the rear right channel, and 6 the bass channel.

The gain determination unit 19 determines the dynamic gains g11 to g61 and g12 to g62 based on the scale factors sf1(1) to sf6(1).

The time-frequency inverse transform unit 21 performs a time-frequency inverse transform such as the inverse modified discrete cosine transform (IMDCT). That is, it inverse-transforms the frequency signals f1 to f6 of each band, synthesizes the bands, and obtains the time-series acoustic signals i1 to i6.

The mixdown unit 23 calculates the output signals o1 and o2 from the acoustic signals i1 to i6 and the dynamic gains g11 to g61 and g12 to g62.

FIG. 3 is a diagram showing the configuration of the gain determination unit 19. In FIG. 3, the gain determination unit 19 comprises a weighter 31 and normalizers 33 and 35.

The scale factors sf1(1) to sf6(1) output from the Huffman code decoder unit 15 are input to the multipliers MP11 to MP61 and MP12 to MP62, respectively. The static gains s11 to s61 and s12 to s62 are fixed values stored in memory in advance, and are likewise input to the multipliers MP11 to MP61 and MP12 to MP62. Each multiplier multiplies its scale factor by its static gain, and the resulting products g'11 to g'61 and g'12 to g'62 are output to the normalizers 33 and 35, respectively.

The normalizers 33 and 35 receive the products g'11 to g'61 and g'12 to g'62 from the multipliers MP11 to MP61 and MP12 to MP62, normalize them to calculate the dynamic gains g11 to g61 and g12 to g62, and output these to the mixdown unit 23.

FIG. 4 is a diagram showing the configuration of the mixdown unit 23.

The dynamic gains g11 to g61 output from the gain determination unit 19 are input to the multipliers IMP11 to IMP61, which also receive the acoustic signals i1 to i6. Each multiplier multiplies a dynamic gain by the corresponding acoustic signal, the products are fed in parallel to the adders ADD11 to ADD51, and these series-connected adders add them in sequence to calculate the output signal o1.

Similarly, the dynamic gains g12 to g62 output from the gain determination unit 19 are input to the multipliers IMP12 to IMP62, which also receive the acoustic signals i1 to i6. Each multiplier multiplies a dynamic gain by the corresponding acoustic signal, the products are fed in parallel to the adders ADD12 to ADD52, and these series-connected adders add them in sequence to calculate the output signal o2.

Next, the operation of the audio decoding apparatus 11 of the first embodiment is described with reference to FIGS. 1 to 5. FIG. 5 is a flowchart for explaining the operation of the normalizers 33 and 35.

The syntax decoding unit 13 separates the input encoded data string bs into the high-efficiency encoded acoustic stream h, the quantization accuracy information, and the normalization information according to the prescribed syntax.

Next, the Huffman code decoder unit 15 Huffman-decodes the high-efficiency encoded acoustic stream h from the syntax decoding unit 13 according to the encoding format obtained from the side information, and outputs the quantized spectra q1 to q6 together with the scale factors sf1(1) to sf6(1), which indicate the quantization step width of each quantized spectrum.

Next, the inverse quantization unit 17 reconstructs the original frequency signals f1 to f6 of each band from the quantized spectra q1 to q6 and the scale factors sf1(1) to sf6(1) supplied by the Huffman code decoder unit 15.

Next, the gain determination unit 19 determines the dynamic gain values g11 to g61 and g12 to g62 for each output system from the scale factors sf1(1) to sf6(1) supplied by the Huffman code decoder unit 15.

Specifically, as shown in FIG. 3, the gain determination unit 19 multiplies the scale factors sf1(1) to sf6(1) from the Huffman code decoder unit 15 by the preset static gain values s11 to s61 and s12 to s62 in the multipliers MP11 to MP61 and MP12 to MP62, and outputs the products g'11 to g'61 and g'12 to g'62, which represent gains weighted by each scale factor.

The normalizers 33 and 35 of the gain determination unit 19 then normalize the products g'11 to g'61 and g'12 to g'62 to calculate the dynamic gain values g11 to g61 and g12 to g62 for each output system, and output them to the mixdown unit 23.

In more detail, as shown in FIG. 5, in step S10 the weighter 31 weights the static gains s11 to s61 and s12 to s62 by multiplying them by the lowest-band scale factors sf1(1) to sf6(1). As Equation 2 in step S10 indicates, the weighted gain g'ij is obtained by multiplying the static gain sij by sfi(1), that is, g'ij = sij * sfi(1).

An example of the static gain settings sij is as follows. Here p and q are the static gains of individual channels, and o1 and o2 are the L-channel and R-channel stereo output signals, respectively. With

p = (8 - 2 * sqrt(2)) / 14 = 0.3694 (Equation 3)
q = (4 * sqrt(2) - 2) / 14 = 0.2612 (Equation 4)

the outputs are

o1 = p * i1 + p * i2 + q * i4 (Equation 5)
o2 = p * i1 + p * i3 + q * i5 (Equation 6)

where sqrt(x) denotes the non-negative square root of x. In terms of the static gains sij, Equations 5 and 6 correspond to s11 = s21 = p and s41 = q for output o1, and s12 = s32 = p and s52 = q for output o2, with all remaining static gains equal to 0.
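The static-gain example can be checked numerically. The sketch below is only an illustration: the table `STATIC_GAINS` and the helper `static_mixdown` are hypothetical names, and the zero entries are inferred from the channels that do not appear in Equations 5 and 6. It evaluates p and q and applies the fixed gains to one sample per channel.

```python
import math

P = (8 - 2 * math.sqrt(2)) / 14   # Equation 3, approximately 0.3694
Q = (4 * math.sqrt(2) - 2) / 14   # Equation 4, approximately 0.2612

# STATIC_GAINS[i][j]: static gain from input channel i+1 to output j+1.
# Zero entries are an inference from Equations 5 and 6.
STATIC_GAINS = [
    [P, P],      # channel 1 (center) feeds both outputs
    [P, 0.0],    # channel 2 (front left) feeds o1
    [0.0, P],    # channel 3 (front right) feeds o2
    [Q, 0.0],    # channel 4 (rear left) feeds o1
    [0.0, Q],    # channel 5 (rear right) feeds o2
    [0.0, 0.0],  # channel 6 (bass) does not appear in Equations 5 and 6
]


def static_mixdown(samples):
    """Apply Equations 5 and 6 to one sample per input channel (i1..i6)."""
    o1 = sum(row[0] * x for row, x in zip(STATIC_GAINS, samples))
    o2 = sum(row[1] * x for row, x in zip(STATIC_GAINS, samples))
    return o1, o2


if __name__ == "__main__":
    print(round(P, 4), round(Q, 4))
    print(static_mixdown([1.0, 1.0, 1.0, 1.0, 1.0, 1.0]))
```

With these values the gains feeding each output sum to 1 (2p + q = 1), so equal-level inputs produce an output at the same level.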

Next, in step S20, the normalizers 33 and 35 normalize the weighted gains g' to obtain the dynamic gains g11 to g61 and g12 to g62. As shown in step S20, the denominator is first obtained as the sum of the gains g'1j to g'6j, and each gain g'ij is then divided by this sum to give the dynamic gain gij, so the normalization is performed separately for each output channel j. This normalization can be written as Equation 8: gij = g'ij / (g'1j + g'2j + ... + g'6j). Consequently, the dynamic gains obtained in step S20 for one output, for example g11 to g61, sum to 1.
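Steps S10 and S20 can be sketched together in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `determine_dynamic_gains`, the list-of-lists layout, and the fallback to zero gains when a column sums to zero are all assumptions that the text does not specify.

```python
def determine_dynamic_gains(static_gains, scale_factors):
    """Weight static gains by scale factors (S10), then normalize per output (S20).

    static_gains[i][j] is sij for input channel i and output j;
    scale_factors[i] is sfi(1). Returns dynamic gains gij in the same layout.
    """
    n_out = len(static_gains[0])
    # Step S10: g'ij = sij * sfi(1)
    weighted = [[s * sf for s in row] for row, sf in zip(static_gains, scale_factors)]
    dynamic = [[0.0] * n_out for _ in static_gains]
    # Step S20: gij = g'ij / sum over i of g'ij, separately for each output j
    for j in range(n_out):
        total = sum(row[j] for row in weighted)
        for i, row in enumerate(weighted):
            # Assumption: a zero total yields zero gains instead of dividing by zero.
            dynamic[i][j] = row[j] / total if total != 0 else 0.0
    return dynamic


if __name__ == "__main__":
    static = [[0.5, 0.5], [0.5, 0.0], [0.0, 0.5]]   # toy 3-in, 2-out example
    print(determine_dynamic_gains(static, [10, 20, 30]))
```

In the toy example the second channel has twice the scale factor of the first, so it receives twice the gain in output 1 even though both static gains are 0.5.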

Next, the time-frequency inverse transform unit 21 inverse-transforms the frequency signals f1 to f6 of each band from the inverse quantization unit 17, synthesizes the bands, and obtains the original time-series acoustic signals i1 to i6.

Next, the output signals o1 and o2 of each output system are calculated from the acoustic signals i1 to i6 supplied by the time-frequency inverse transform unit 21 and the dynamic gains g11 to g61 and g12 to g62 of each output system supplied by the gain determination unit 19.

In the mixdown unit 23, the acoustic signals i1 to i6 from the time-frequency inverse transform unit 21 are weighted by multiplying them, in the multipliers IMP11 to IMP61, by the dynamic gains g11 to g61 of one output system from the normalizer 33. The products from the multipliers IMP11 to IMP61 are then added by the adders ADD11 to ADD51 to calculate the output signal o1 of that output system.

Similarly, the acoustic signals i1 to i6 from the time-frequency inverse transform unit 21 are weighted by multiplying them, in the multipliers IMP12 to IMP62, by the dynamic gains g12 to g62 of the other output system from the normalizer 35. The products from the multipliers IMP12 to IMP62 are then added by the adders ADD12 to ADD52 to calculate the output signal o2 of that output system.

In this way, the mixdown unit 23 multiplies the acoustic signals i1 to i6 by the dynamic gains g11 to g62 and adds the products into two groups, yielding the L/R-channel output signals o1 and o2 as linear combinations of the input acoustic signals i1 to i6.

Written in matrix form, the calculation of the output signals o1 and o2 from the acoustic signals i1 to i6 and the gains g11 to g62 is given by Expression 8. Element by element, it reads o1 = g11 * i1 + g21 * i2 + ... + g61 * i6 and o2 = g12 * i1 + g22 * i2 + ... + g62 * i6.
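The multiply-and-add structure of the mixdown unit can be sketched as follows. This is an illustrative reading of the text, assuming each channel is a list of time samples and that the gains are held constant over the samples passed in; the name `mixdown` and the data layout are hypothetical.

```python
def mixdown(signals, gains):
    """Return one output sample list per output system.

    signals[i] is the sample list of input channel i (i1..i6 in the text);
    gains[i][j] is the dynamic gain gij from input i to output j.
    Each output is the sum over i of gij * signals[i], sample by sample.
    """
    n_out = len(gains[0])
    n_samples = len(signals[0])
    return [
        [sum(gains[i][j] * signals[i][t] for i in range(len(signals)))
         for t in range(n_samples)]
        for j in range(n_out)
    ]


if __name__ == "__main__":
    sigs = [[1.0, 2.0], [3.0, 4.0]]          # two input channels, two samples each
    g = [[0.25, 1.0], [0.75, 0.0]]           # toy gains for two outputs
    print(mixdown(sigs, g))
```

The inner sum plays the role of the multipliers and the chain of adders: one product per input channel, accumulated into a single output sample.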

According to this embodiment, the gains used for mixdown are determined dynamically by the gain determination unit. When the scale factors of the channels (1 to 6) are nearly equal, the dynamic gains converge to the static gains. When the scale factors differ between channels, a larger gain is assigned to a channel with a larger volume. As a result, mixdown with a good volume balance can be performed.
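The two cases described above can be illustrated for output o1 using the example static gains p and q. The sketch below is an assumption-laden illustration: `gains_for_output` is a hypothetical helper, and the scale factor values are invented purely for demonstration.

```python
import math

P = (8 - 2 * math.sqrt(2)) / 14
Q = (4 * math.sqrt(2) - 2) / 14
STATIC_O1 = [P, P, 0.0, Q, 0.0, 0.0]   # static gains s11..s61 feeding output o1


def gains_for_output(static_col, scale_factors):
    """Dynamic gains for a single output: weight by scale factor, then normalize."""
    weighted = [s * sf for s, sf in zip(static_col, scale_factors)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Case 1: all channels have the same scale factor (illustrative value 50).
equal = gains_for_output(STATIC_O1, [50] * 6)

# Case 2: channel 2 (front left) is louder than the others (illustrative values).
biased = gains_for_output(STATIC_O1, [40, 80, 40, 40, 40, 40])

if __name__ == "__main__":
    print([round(g, 4) for g in equal])
    print([round(g, 4) for g in biased])
```

In the first case the result matches the static gains because they already sum to 1. In the second case the louder front left channel receives a gain above p while the center and rear left gains drop below their static values.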

In addition, because the process of determining the dynamic gains reuses parameters already used for decoding the digital audio signal, the processing load on, for example, a DSP can be kept extremely small.

(Second embodiment)
FIG. 6 is a block diagram showing the configuration of an audio decoding apparatus to which the mixdown apparatus of the second embodiment of the present invention is applicable. FIG. 7 is a diagram showing the interconnection of the gain determination unit 19, the mixdown unit 53, and the time-frequency inverse transform unit 55.

The audio decoding apparatus 51 comprises a syntax decoding unit 13, a Huffman code decoder unit 15, an inverse quantization unit 17, a gain determination unit 19, a mixdown unit 53, and a time-frequency inverse transform unit 55. Blocks of the audio decoding apparatus 51 that are the same as those of the audio decoding apparatus 11 of the first embodiment are given the same reference numerals, and their description is omitted.

The audio decoding apparatus 51 of the second embodiment is characterized by its mixdown unit 53 and time-frequency inverse transform unit 55, which place the mixdown before the inverse transform. Because the time-frequency inverse transform is linear, mixing down the frequency signals first and then performing the inverse transform gives the same result as in the first embodiment.
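The linearity argument can be checked numerically. The sketch below uses a naive, unwindowed inverse MDCT as a stand-in for the decoder's actual inverse transform (an assumption; windowing and overlap-add are omitted, and they are linear as well) and compares the two processing orders for invented spectra and gains.

```python
import math


def imdct(spectrum):
    """Naive inverse MDCT: N coefficients in, 2N time samples out (no window, no overlap-add)."""
    n = len(spectrum)
    return [
        sum(x * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for k, x in enumerate(spectrum))
        for t in range(2 * n)
    ]


def weighted_sum(vectors, gains):
    """Element-wise sum of gain-weighted vectors, i.e. one mixdown output."""
    return [sum(g * v[t] for g, v in zip(gains, vectors)) for t in range(len(vectors[0]))]


# Illustrative spectra for three channels and one set of gains (made-up values).
spectra = [[1.0, 0.0, -2.0, 0.5], [0.25, 3.0, 0.0, -1.0], [0.0, 1.5, 1.0, 2.0]]
gains = [0.5, 0.3, 0.2]

# First embodiment order: inverse-transform every channel, then mix down.
first = weighted_sum([imdct(f) for f in spectra], gains)
# Second embodiment order: mix down the spectra, then inverse-transform once.
second = imdct(weighted_sum(spectra, gains))

max_diff = max(abs(a - b) for a, b in zip(first, second))

if __name__ == "__main__":
    print(max_diff)
```

The two results agree up to floating-point rounding, provided the gains stay constant over the transform block. The second order needs one inverse transform per output rather than one per input channel.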

The mixdown unit 53 calculates the frequency signals F1 and F2 of each output system from the frequency signals f1 to f6 of each band supplied by the inverse quantization unit 17 and the dynamic gains g11 to g62 of each output system supplied by the gain determination unit 19.

The time-frequency inverse transform unit 55 inverse-transforms the frequency signals F1 and F2 from the mixdown unit 53, synthesizes the bands, and obtains the time-series acoustic signals o1 and o2.

FIG. 8 is a diagram showing the configuration of the mixdown unit 53.

In the mixdown unit 53, the dynamic gains g11 to g61 output from the gain determination unit 19 are input to the multipliers IMP11 to IMP61, which also receive the frequency signals f1 to f6. The multipliers IMP11 to IMP61 multiply the dynamic gains g11 to g61 by the frequency signals f1 to f6, the weighted products are fed in parallel to the adders ADD11 to ADD51, and these series-connected adders add them in sequence to calculate the frequency signal F1.

Similarly, the dynamic gains g12 to g62 output from the gain determination unit 19 are input to the multipliers IMP12 to IMP62, which also receive the frequency signals f1 to f6. The multipliers IMP12 to IMP62 multiply the dynamic gains g12 to g62 by the frequency signals f1 to f6, the weighted products are fed in parallel to the adders ADD12 to ADD52, and these series-connected adders add them in sequence to calculate the frequency signal F2.

Next, the operation of the audio decoding apparatus 51 of the second embodiment is described with reference to FIGS. 6 to 8. The processing in the syntax decoding unit 13, the Huffman code decoder unit 15, the inverse quantization unit 17, and the gain determination unit 19 is the same as in the audio decoding apparatus 11 of the first embodiment, so its description is omitted.

The mixdown unit 53 calculates the frequency signals F1 and F2 of each output system from the frequency signals f1 to f6 of each band supplied by the inverse quantization unit 17 and the dynamic gains g11 to g62 of each output system supplied by the gain determination unit 19.

In this way, the mixdown unit 53 multiplies the frequency signals f1 to f6 by the dynamic gains g11 to g62 and adds the products into two groups, yielding the L/R-channel frequency signals F1 and F2 as linear combinations of the input frequency signals f1 to f6.

Written in matrix form, the calculation of the frequency signals F1 and F2 from the frequency signals f1 to f6 and the gains g11 to g62 is given by Expression 9. Element by element, it reads F1 = g11 * f1 + g21 * f2 + ... + g61 * f6 and F2 = g12 * f1 + g22 * f2 + ... + g62 * f6.

In addition, because the process of determining the dynamic gains reuses parameters already used for decoding the digital audio signal, the processing load on, for example, a DSP (Digital Signal Processor) can be kept extremely small.

The embodiments of the present invention use the example of converting a 5.1-channel signal into a stereo signal, but the invention applies equally when mixdown is performed on any number of channels n (n ≥ 2). Likewise, the embodiments use the lowest-band scale factor to determine the dynamic gains, but the scale factor of another band, or a plurality of scale factors, may be used instead.

Although the mixdown apparatus has been described in these embodiments as applied to an audio decoding apparatus for MPEG-2/4 AAC, the mixdown apparatus of the present invention is not limited to MPEG-2/4 AAC and can also be applied to other multi-channel reproduction schemes.

FIG. 1 is a block diagram showing the configuration of an audio decoding apparatus to which the mixdown apparatus of the first embodiment of the present invention is applicable.
FIG. 2 is a diagram showing the interconnection of the gain determination unit 19, the time-frequency inverse transform unit 21, and the mixdown unit 23.
FIG. 3 is a diagram showing the configuration of the gain determination unit 19.
FIG. 4 is a diagram showing the configuration of the mixdown unit 23.
FIG. 5 is a flowchart for explaining the operation of the normalizers 33 and 35.
FIG. 6 is a block diagram showing the configuration of an audio decoding apparatus to which the mixdown apparatus of the second embodiment of the present invention is applicable.
FIG. 7 is a diagram showing the interconnection of the gain determination unit 19, the mixdown unit 53, and the time-frequency inverse transform unit 55.
FIG. 8 is a diagram showing the configuration of the mixdown unit 53.

Explanation of symbols

11, 51: audio decoding apparatus; 13: syntax decoding unit; 15: Huffman code decoder unit; 17: inverse quantization unit; 19: gain determination unit; 21: time-frequency inverse transform unit; 23: mixdown unit; 31: weighter; 33, 35: normalizers; IMP11 to IMP62: multipliers; ADD11 to ADD52: adders; 53: mixdown unit; 55: time-frequency inverse transform unit

Claims

1. A mixdown apparatus that mixes a plurality of input acoustic signals and converts them into a number of output signals smaller than the number of the acoustic signals, comprising:
gain determination means for determining a dynamic gain value for each acoustic signal based on the scale factor of that acoustic signal; and
mixdown means for multiplying each acoustic signal by the dynamic gain value determined for it and outputting, as an output signal, the sum of the multiplication results corresponding to the plurality of acoustic signals.

2. The mixdown apparatus according to claim 1, wherein the gain determination means comprises:
weighting means for multiplying each scale factor by a corresponding preset static gain value and outputting a multiplication result for each acoustic signal; and
normalization means for normalizing the plurality of multiplication results from the weighting means to calculate the dynamic gain value corresponding to each acoustic signal.

3. A mixdown method of mixing a plurality of input acoustic signals and converting them into a number of output signals smaller than the number of the acoustic signals, comprising:
a gain determination step of determining a dynamic gain value for each acoustic signal based on the scale factor of that acoustic signal; and
a mixdown step of multiplying each acoustic signal by the dynamic gain value determined for it and outputting, as an output signal, the sum of the multiplication results corresponding to the plurality of acoustic signals.