JP2011081316A

JP2011081316A - Sound volume control device and electronic equipment

Info

Publication number: JP2011081316A
Application number: JP2009235502A
Authority: JP
Inventors: Masahiro Yoshida; 昌弘吉田; Tomoki Oku; 智岐奥; Makoto Yamanaka; 誠山中
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2009-10-09
Filing date: 2009-10-09
Publication date: 2011-04-21

Abstract

PROBLEM TO BE SOLVED: To perform band-by-band sound volume control with a small processing load or a small-scale circuit.

SOLUTION: A sound volume control device 10 includes: a normalization section 11 that normalizes the signal level of a time-domain input sound signal (pcm[i]) by amplifying it with an amplification factor gain_A; a modified discrete cosine transform (MDCT) section 12 that divides the input sound signal into bands and generates a frequency-domain sound signal (mdct_OUT[f]) by applying the modified discrete cosine transform to the output signal (mdct_IN[i]) of the normalization section 11; a sound volume analysis section 14 that analyzes the sound volume of the input sound signal for each divided band and derives a sound volume control amount (gain_B[j]/gain_A) for each divided band; and a sound volume control section 15 that amplifies the output sound signal of the MDCT section 12 by the sound volume control amount for each divided band.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a volume control device that controls the volume of an acoustic signal, and to an electronic apparatus using the volume control device.

A function for adjusting the volume of an acoustic signal to an appropriate level is built into typical acoustic-signal LSIs. This function is usually realized by adjusting the amplitude level of the time-domain acoustic signal, on either an analog or a digital signal.

FIG. 14 shows a schematic block diagram of a recording apparatus that has an encoder and to which this function is applied. In the recording apparatus of FIG. 14, the volume of the input acoustic signal is analyzed from the time-domain input acoustic signal, and the volume of the input acoustic signal is adjusted in the time domain based on the analysis result. With this method, however, the volume control unit tries to lower the volume of the entire acoustic signal in any section that contains a loud sound such as a firework burst or a drum hit. As a result, the firework and drum sounds are recorded at an appropriate volume, but human voices and the sounds of instruments other than drums may become excessively quiet.

In view of this, a method of controlling the volume separately for each frequency band has been proposed (see, for example, Patent Document 1 below). FIG. 15 shows a block diagram of a conventional configuration for controlling the volume for each frequency band. In the configuration of FIG. 15, the time-domain input acoustic signal is divided into bands by passing it through a plurality of bandpass filters with mutually different pass bands, and the volume of the input acoustic signal is adjusted individually for each band. This method makes it possible to control the volume of low-pitched sounds such as fireworks independently of human voices and the like, which stabilizes the volume.

However, because the configuration of FIG. 15 requires a plurality of bandpass filters, the software processing load or the circuit scale becomes large.

Separately, it is important to suppress the influence of calculation errors (rounding errors, loss of significance, and the like) that arise in processing that includes volume control.

Patent Document 1: JP 2000-278786 A

Accordingly, an object of the present invention is to provide a volume control device that realizes band-by-band volume control with a small processing load or a small circuit scale. Another object is to provide a volume control device that can suppress the influence of calculation errors arising in processing that includes volume control. A further object is to provide an electronic apparatus using these volume control devices.

A first volume control device according to the present invention includes: a volume analysis unit that analyzes, for each of a plurality of divided bands, the volume of a time-domain input acoustic signal supplied to an encoder having a filter bank; and a volume control unit that controls, for each divided band and based on the analysis result of the volume analysis unit, the volume of the frequency-domain acoustic signal output from the filter bank based on the input acoustic signal.

Because the output acoustic signal of the encoder's filter bank is the target of band-by-band volume control, band-by-band volume control can be performed merely by adding a small amount of processing or circuitry.

For example, the first volume control device may further include a normalization unit that normalizes the signal level of the input acoustic signal. In that case, the output acoustic signal of the filter bank is generated from the normalized input acoustic signal, and the volume control unit controls the volume of the output acoustic signal of the filter bank for each divided band based on the content of the normalization in the normalization unit and the analysis result of the volume analysis unit.

When volume control is performed downstream of the filter bank, the influence of calculation errors arising in the filter bank processing is a concern. With the configuration above, however, normalizing the signal level of the input acoustic signal upstream of the filter bank reduces the influence of those calculation errors and suppresses the resulting degradation in sound quality.

A second volume control device according to the present invention includes: a volume control unit that controls the volume of a frequency-domain input acoustic signal supplied to a decoder having a filter bank; and a volume analysis unit that analyzes, for each of a plurality of divided bands, the volume of the time-domain acoustic signal output from the filter bank based on the input acoustic signal after volume control by the volume control unit. The volume control unit controls the volume of the input acoustic signal for each divided band based on the analysis result of the volume analysis unit.

Because the input acoustic signal of the decoder's filter bank is the target of band-by-band volume control, band-by-band volume control can be performed merely by adding a small amount of processing or circuitry.

A third volume control device according to the present invention includes: a normalization unit that normalizes the signal level of a frequency-domain input acoustic signal supplied to a decoder having a filter bank; a volume analysis unit that analyzes the volume of the time-domain acoustic signal output from the filter bank based on the input acoustic signal normalized by the normalization unit; and a volume control unit that controls the volume of the output acoustic signal of the filter bank based on the content of the normalization in the normalization unit and the analysis result of the volume analysis unit.

An electronic apparatus according to the present invention includes any one of the first to third volume control devices described above.

According to the present invention, it is possible to provide a volume control device that realizes band-by-band volume control with a small processing load or a small circuit scale. It is also possible to provide a volume control device that can suppress the influence of calculation errors arising in processing that includes volume control, and to provide an electronic apparatus using these volume control devices.

The significance and effects of the present invention will become clearer from the description of the embodiments below. The following embodiments are, however, merely embodiments of the present invention, and the meanings of the terms of the present invention and of its constituent elements are not limited to those described in the following embodiments.

FIG. 1 is a block diagram showing the configuration of the volume control device according to the first embodiment of the present invention.
FIG. 2 is a diagram, relating to the first embodiment of the present invention, showing the relationship between a plurality of temporally adjacent frames.
FIG. 3 is a diagram showing how 1024 subbands are classified into eight divided bands.
FIG. 4 is a schematic block diagram of an AAC encoder used in combination with the volume control device of FIG. 1.
FIG. 5 is a diagram showing how the amplification factor for each divided band calculated by the volume control unit of FIG. 1 is smoothed in the frequency direction.
FIG. 6 is a block diagram showing the configuration of the volume control device according to the second embodiment of the present invention.
FIG. 7 is a diagram for explaining the operation of the IMDCT unit of FIG. 6.
FIG. 8 is a schematic block diagram of an AAC decoder used in combination with the volume control device of FIG. 6.
FIG. 9 is a block diagram showing the configuration of the volume control device according to the third embodiment of the present invention.
FIG. 10 is a schematic block diagram of an AAC decoder used in combination with the volume control device of FIG. 9.
FIG. 11 is a schematic configuration diagram of a recording apparatus according to the fourth embodiment of the present invention.
FIG. 12 is a schematic configuration diagram of an acoustic signal reproducing apparatus according to the fourth embodiment of the present invention.
FIG. 13 is a schematic configuration diagram of an imaging apparatus according to the fourth embodiment of the present invention.
FIG. 14 is a schematic configuration diagram of a conventional recording apparatus.
FIG. 15 is a schematic configuration diagram of a conventional apparatus that performs band-by-band volume control.

Embodiments of the present invention are described concretely below with reference to the drawings. In the drawings referred to, identical parts are given identical reference numerals, and duplicate descriptions of identical parts are omitted in principle.

<< First Embodiment >>
A first embodiment of the present invention will be described. FIG. 1 is a block diagram showing the configuration of a volume control device 10 according to the first embodiment. The first embodiment assumes that the band-by-band volume control according to the present invention is applied to an AAC (Advanced Audio Coding) encoder. An AAC encoder, which is one type of encoder, encodes a given acoustic signal using a predetermined coding scheme standardized by MPEG (Moving Picture Experts Group).

This embodiment assumes that the signal to be encoded is a monaural acoustic signal. The AAC encoder uses the modified discrete cosine transform (hereinafter also called MDCT) as the processing in its filter bank. In the AAC encoder, the MDCT processing unit is either 2048 samples or 256 samples of the time-domain signal, and the processing unit is switched according to the signal being encoded. To simplify the description, this embodiment assumes that the processing unit is fixed at 2048 samples.

Accordingly, the time-domain input acoustic signal supplied to the AAC encoder is divided into frames, each containing 2048 samples of signal values. A frame contains one or more blocks; here, each frame is assumed to consist of a single block. FIG. 2 shows the relationship between a plurality of temporally adjacent frames. Time advances in the order of the first frame, the second frame, the third frame, and so on. Each block overlaps the immediately preceding block by half the block length. In this example, because each frame consists of a single block, each frame likewise overlaps the immediately preceding frame by half the frame length.
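The half-overlapping framing described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `split_into_frames` is hypothetical, and dropping trailing samples that do not fill a whole frame is a simplification not taken from the text.

```python
from typing import List, Sequence

FRAME_LEN = 2048        # samples per frame (one block per frame, as assumed in the text)
HOP = FRAME_LEN // 2    # each frame overlaps the previous one by half its length


def split_into_frames(samples: Sequence[int]) -> List[List[int]]:
    """Return consecutive 2048-sample frames that overlap by 1024 samples.

    Assumption of this sketch: trailing samples that do not fill a whole
    frame are dropped.
    """
    frames = []
    start = 0
    while start + FRAME_LEN <= len(samples):
        frames.append(list(samples[start:start + FRAME_LEN]))
        start += HOP
    return frames
```

With this framing, the second half of one frame is identical to the first half of the next frame.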

The following description focuses on one frame and explains the operation of the volume control device 10 for that frame of interest. The signal value (signal level) of the input acoustic signal in the frame of interest is denoted pcm[i], where pcm[i] is the signal value of the i-th input acoustic signal in the frame of interest and i is an integer satisfying 0 ≦ i ≦ 2047. The input acoustic signal pcm[i] is a 16-bit digital acoustic signal in the time domain and takes a digital value from 0 to 65535. The larger pcm[i] is, the greater the volume and intensity of the i-th input acoustic signal. Unless otherwise stated, each acoustic signal mentioned in the following description of the first embodiment is the acoustic signal for the frame of interest.

The normalization unit 11 in FIG. 1 detects the maximum value among pcm[0] to pcm[2047], calculates the amplification factor gain_A needed to make that detected maximum equal to the largest digital value representable in 16 bits (that is, 65535), and amplifies every input acoustic signal pcm[i] in the frame of interest by gain_A. The result of amplifying pcm[i] by gain_A is denoted mdct_IN[i]. Accordingly, the following equation (A1) holds.
mdct_IN[i] = pcm[i] × gain_A ... (A1)

For example, if the maximum value among pcm[0] to pcm[2047] is pcm[50], the amplification factor is calculated as gain_A = 65535 / pcm[50], and mdct_IN[i] is derived by multiplying pcm[i] by gain_A. In this way, the normalization unit 11 normalizes the signal level of the input acoustic signal supplied to the AAC encoder (more precisely, the signal level of the input acoustic signal supplied to the MDCT unit 12, which corresponds to the filter bank of the AAC encoder). As is clear from the description above, this normalization makes the maximum signal value in the frame of interest equal to a predetermined target value. The target value is given here as the largest digital value representable in 16 bits (that is, 65535), but the way the target value is set is not limited to this. The output signal of the normalization unit 11, that is, the input acoustic signal mdct_IN[i] after normalization by the normalization unit 11, is supplied to the MDCT unit 12.
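A minimal sketch of this per-frame normalization follows. The function name `normalize_frame` is hypothetical, and passing an all-zero frame through with a gain of 1 is an assumption of the sketch; the text does not say how such a frame is handled.

```python
from typing import List, Sequence, Tuple

FULL_SCALE = 65535  # largest value representable in 16 bits, the target value used in the text


def normalize_frame(pcm: Sequence[int], target: float = FULL_SCALE) -> Tuple[List[float], float]:
    """Scale a frame so that its largest sample equals `target`.

    Returns (mdct_in, gain_a), where mdct_in[i] = pcm[i] * gain_a as in equation (A1).
    """
    peak = max(pcm)
    if peak <= 0:
        # Assumption: an all-zero frame is passed through unchanged with gain 1.
        return [float(v) for v in pcm], 1.0
    gain_a = target / peak
    mdct_in = [v * gain_a for v in pcm]
    return mdct_in, gain_a
```

The returned `gain_a` has to be kept, because the later volume control stage divides by it to undo the normalization.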

The MDCT unit 12 functions as the band-division filter bank of the AAC encoder and divides the band of the input acoustic signal mdct_IN[i] into a plurality of subbands. Specifically, the MDCT unit 12 applies the modified discrete cosine transform to the normalized input acoustic signal mdct_IN[i], thereby converting the time-domain acoustic signal represented by mdct_IN[i] into a frequency-domain acoustic signal mdct_OUT[f]. The values mdct_OUT[f] are also called MDCT coefficients.

The modified discrete cosine transform in the MDCT unit 12 subdivides the full frequency band of the normalized input acoustic signal into 1024 subbands. mdct_OUT[f] represents the signal intensity of the normalized input acoustic signal in the f-th subband, where f is an integer satisfying 0 ≦ f ≦ 1023. Thus, 1024 MDCT coefficients mdct_OUT[0] to mdct_OUT[1023] are obtained from the 2048 samples of the time-domain acoustic signal mdct_IN[0] to mdct_IN[2047].
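The mapping from 2N time-domain samples to N coefficients can be sketched with the textbook MDCT definition, X[k] = Σ x[n] cos((π/N)(n + 1/2 + N/2)(k + 1/2)). This is an assumption-laden illustration, not the encoder's actual filter bank: it omits the analysis window and any scaling that a real AAC encoder applies, and it uses a direct O(N²) sum rather than a fast algorithm.

```python
import math
from typing import List, Sequence


def mdct(x: Sequence[float]) -> List[float]:
    """Direct (unwindowed, unscaled) MDCT: 2N input samples -> N coefficients."""
    if len(x) % 2 != 0:
        raise ValueError("MDCT input length must be even")
    n_half = len(x) // 2  # N: number of output coefficients
    out = []
    for k in range(n_half):
        acc = 0.0
        for n, v in enumerate(x):
            acc += v * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
        out.append(acc)
    return out
```

Called on a 2048-sample frame, this returns 1024 coefficients, matching the count of mdct_OUT[0] to mdct_OUT[1023] in the text.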

The frequency of a subband increases as its frequency number f increases. For example, if the sampling frequency of the input acoustic signal pcm[i] is 48 kHz, the f-th subband is the frequency band from (f × ((48000 ÷ 2) / 1024)) Hz inclusive to ((f + 1) × ((48000 ÷ 2) / 1024)) Hz exclusive.
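The subband boundaries follow directly from the expression above. The helper below is a small sketch; the name `subband_edges_hz` and its default arguments are chosen for illustration only.

```python
from typing import Tuple


def subband_edges_hz(f: int, sample_rate_hz: float = 48000.0,
                     num_subbands: int = 1024) -> Tuple[float, float]:
    """Return (lower edge inclusive, upper edge exclusive) of the f-th subband in Hz."""
    width = (sample_rate_hz / 2) / num_subbands  # 23.4375 Hz at 48 kHz
    return f * width, (f + 1) * width
```

At 48 kHz each subband is 23.4375 Hz wide, so subband 0 spans 0 Hz to 23.4375 Hz and subband 1023 ends at 24000 Hz.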

The FFT unit 13 contained in the volume analysis unit 14 calculates a frequency-domain acoustic signal by applying a Fourier transform to the input acoustic signal pcm[i] of the frame of interest. The fast Fourier transform (FFT) can be used as the Fourier transform in the FFT unit 13. The acoustic signal calculated by the FFT unit 13 consists of a real part fft_r[f] and an imaginary part fft_i[f], which are respectively the real and imaginary parts of the Fourier transform of the 2048 samples of the input acoustic signal pcm[i]. The fast Fourier transform, which is a form of discrete Fourier transform, subdivides the full frequency band of the input acoustic signal into 1024 subbands. The subdivision is assumed to be the same as that in the MDCT unit 12. Accordingly, f in fft_r[f] and fft_i[f] is also an integer from 0 to 1023.

Let fft_pw[f] denote the power of the input acoustic signal in the f-th subband. The power fft_pw[f] is calculated by the following equation (A2). The unit of fft_pw[f] is dB (decibels) relative to a predetermined reference power.
fft_pw[f] = 20 × log(fft_r[f]² + fft_i[f]²) ... (A2)
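Equation (A2) can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the logarithm is taken as base 10 (the text writes only "log" but gives the unit as dB), and the hypothetical `floor` parameter exists only to avoid taking the logarithm of zero for a silent subband.

```python
import math
from typing import List, Sequence


def fft_power_db(fft_r: Sequence[float], fft_i: Sequence[float],
                 floor: float = 1e-12) -> List[float]:
    """Per-subband power following equation (A2): 20 * log10(re^2 + im^2)."""
    return [
        20.0 * math.log10(max(r * r + i * i, floor))  # floor guards against log10(0)
        for r, i in zip(fft_r, fft_i)
    ]
```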

As shown in FIG. 3, the volume analysis unit 14 divides the full frequency band into the 0th to 7th divided bands by grouping the 1024 subbands into groups of 128. The j-th divided band is the band formed by combining the (j × 128)-th through (j × 128 + 127)-th subbands. For each divided band, the volume analysis unit 14 detects the maximum value of the power fft_pw[f] within that band and calculates an amplification factor gain_B[j] such that this maximum becomes the target level T_lev in the output signal of the volume control unit 15. Here j denotes the number of the divided band and takes an integer value from 0 to 7.

A concrete example is given for the 0th divided band, which is the combination of the 0th to 127th subbands. Because the powers fft_pw[0] to fft_pw[127] belong to the 0th divided band, the maximum value among fft_pw[0] to fft_pw[127] is detected for that band. If fft_pw[100] is the largest of fft_pw[0] to fft_pw[127], the amplification factor gain_B[0] is obtained as gain_B[0] = (T_lev × gain_A²) / fft_pw[100]. The amplification factors gain_B[1] to gain_B[7] for the 1st to 7th divided bands are obtained in the same way. For example, if fft_pw[200] is the largest of fft_pw[128] to fft_pw[255], the amplification factor gain_B[1] is obtained as gain_B[1] = (T_lev × gain_A²) / fft_pw[200].
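The per-band peak search and the gain formula can be sketched as below. The sketch applies the formula gain_B[j] = (T_lev × gain_A²) / (band peak) literally, and therefore assumes that T_lev and the fft_pw values are positive numbers on a common scale; the function name `band_gains` and the error raised for a non-positive peak are choices made for this illustration.

```python
from typing import List, Sequence


def band_gains(fft_pw: Sequence[float], gain_a: float, t_lev: float,
               num_bands: int = 8) -> List[float]:
    """Compute gain_B[j] for each divided band from the band's peak power."""
    band_size = len(fft_pw) // num_bands  # 128 subbands per band for 1024 subbands
    gains = []
    for j in range(num_bands):
        peak = max(fft_pw[j * band_size:(j + 1) * band_size])
        if peak <= 0:
            # Assumption of this sketch: the literal formula needs a positive peak.
            raise ValueError("band peak must be positive for the literal formula")
        gains.append(t_lev * gain_a ** 2 / peak)
    return gains
```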

The target level T_lev is a value chosen so that the volume of the acoustic signal output from the volume control unit 15 is constant for each divided band, and it can be set in advance to any desired value. For example, a level 20 dB below 16-bit full scale can be set as the target level T_lev. Because the volume of an acoustic signal depends on its power, the volume analysis unit 14 can be said to analyze the volume of the input acoustic signal for each divided band.

The volume control unit 15 controls the volume of the output acoustic signal mdct_OUT[f] of the MDCT unit 12 for each divided band, using the amplification factor (gain_B[j] / gain_A) derived from the amplification factor gain_A, which serves as normalization information, and the amplification factors gain_B[0] to gain_B[7]. The signal mdct_OUT[f] after volume control by the volume control unit 15 is denoted mdct_OUT[f]'.

Of the output acoustic signals mdct_OUT[0] to mdct_OUT[1023] of the MDCT unit 12, the signals mdct_OUT[j × 128] to mdct_OUT[j × 128 + 127] belonging to the j-th divided band are volume-controlled using the amplification factor (gain_B[j] / gain_A). Accordingly, the volume control unit 15 generates mdct_OUT[0]' to mdct_OUT[127]' by amplifying the signals mdct_OUT[0] to mdct_OUT[127] belonging to the 0th divided band by (gain_B[0] / gain_A); for example, mdct_OUT[0]' = mdct_OUT[0] × gain_B[0] / gain_A. Similarly, it generates mdct_OUT[128]' to mdct_OUT[255]' by amplifying the signals mdct_OUT[128] to mdct_OUT[255] belonging to the 1st divided band by (gain_B[1] / gain_A); for example, mdct_OUT[128]' = mdct_OUT[128] × gain_B[1] / gain_A. The same applies to the signals mdct_OUT[256] to mdct_OUT[1023].
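The band-wise scaling of the MDCT coefficients can be sketched in a few lines. The function name `apply_band_gains` is hypothetical, and the sketch assumes that the number of coefficients is an exact multiple of the number of bands (1024 coefficients and 8 bands in the text).

```python
from typing import List, Sequence


def apply_band_gains(mdct_out: Sequence[float], gain_b: Sequence[float],
                     gain_a: float) -> List[float]:
    """Return mdct_OUT[f]' = mdct_OUT[f] * gain_B[j] / gain_A, with j = f // band_size."""
    band_size = len(mdct_out) // len(gain_b)  # 128 when 1024 coefficients form 8 bands
    return [c * gain_b[f // band_size] / gain_a for f, c in enumerate(mdct_out)]
```

Dividing by `gain_a` cancels the amplification applied by the normalization stage, so only the band-specific factor gain_B[j] remains in the output level.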

In the configuration of FIG. 1, the amplification factor (gain_B[j] / gain_A) is calculated in the volume analysis unit 14 and supplied from the volume analysis unit 14 to the volume control unit 15. Alternatively, the volume control unit 15 may calculate (gain_B[j] / gain_A) itself from gain_A output by the normalization unit 11 and gain_B[j] output by the volume analysis unit 14.

The post-encoding processing unit 16 encodes the acoustic signal mdct_OUT[f]' output from the volume control unit 15 in accordance with the AAC coding scheme, thereby converting mdct_OUT[f]' into a bitstream that serves as the encoded acoustic signal. The encoded acoustic signal can be recorded on a recording medium 17. The recording medium 17 is formed of a semiconductor memory, a magnetic disk, or the like.

The operation of each part of FIG. 1 has been described for the frame of interest. The same operation is performed for every other frame, and the encoded acoustic signal for each frame is recorded on the recording medium 17 one after another.

The components of the volume control device 10 include the normalization unit 11, the volume analysis unit 14, and the volume control unit 15. The MDCT unit 12 and/or the post-encoding processing unit 16 may be regarded either as included in or as not included in the components of the volume control device 10.

FIG. 4 shows a schematic configuration diagram of an AAC encoder 20 to which the volume control device 10 can be applied. The AAC encoder 20 includes an MDCT unit 12a having the same function as the MDCT unit 12 and a post-encoding processing unit 16a having the same function as the post-encoding processing unit 16. When the AAC encoder 20 is used in combination with the volume control device 10, the output acoustic signal mdct_IN[i] of the normalization unit 11 is input to the MDCT unit 12a, the volume control unit 15 performs volume control on the output acoustic signal mdct_OUT[f] of the MDCT unit 12a, and the resulting acoustic signal mdct_OUT[f]' is supplied to the post-encoding processing unit 16a.

In this embodiment, the frequency-domain acoustic signal output from the encoder's filter bank is the target of band-by-band volume control. The plurality of bandpass filters required in the conventional configuration of FIG. 15 therefore becomes unnecessary, and band-by-band volume control can be realized by adding only a small amount of software processing or circuitry. When band-by-band volume control is performed downstream of the filter bank, however, the filter bank processing (that is, the MDCT) may be carried out while the sound is still quiet. Performing the filter bank processing on a quiet signal increases the influence of calculation errors (rounding errors, loss of significance, and the like) in that processing, and amplifying a signal that contains many calculation errors downstream of the filter bank degrades the sound quality considerably. In view of this, the volume control device 10 normalizes, that is, amplifies, the signal level of the input acoustic signal to the encoder's filter bank (MDCT unit 12) in the normalization unit 11 and inputs the amplified acoustic signal to the filter bank. This reduces the influence of the calculation errors and suppresses degradation of the audio.

In the above description, the control amount (gain_B[j] / gain_A) used in the volume control unit 15 is determined independently for each frame. However, the control amount may be given a transient characteristic so that the volume changes gradually between temporally adjacent frames, and a constraint may be imposed on the control amount so that (gain_B[j] / gain_A) stays within a predetermined range.
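The two options just mentioned (a transient characteristic over time and a limit on the range) can be sketched as follows. This is only an illustration under stated assumptions: the first-order recursive smoother, the function name `smooth_and_limit`, and the values of `alpha`, `lo` and `hi` are hypothetical and are not specified in the source.

```python
def smooth_and_limit(raw_gain, prev_gain, alpha=0.25, lo=0.1, hi=10.0):
    """Return the control amount to apply in the current frame.

    raw_gain  : gain_B[j] / gain_A computed independently for this frame
    prev_gain : control amount actually applied in the previous frame
    alpha, lo, hi : illustrative tuning values (assumptions, not from the source)
    """
    # Transient characteristic: move only part of the way toward the new value
    smoothed = prev_gain + alpha * (raw_gain - prev_gain)
    # Constraint: keep the control amount inside the predetermined range
    return min(max(smoothed, lo), hi)


applied = 1.0
history = []
for raw in [1.0, 8.0, 8.0, 8.0, 40.0]:
    applied = smooth_and_limit(raw, applied)
    history.append(applied)
```

With these illustrative settings, a sudden jump in the per-frame value is approached gradually over several frames, and the final, very large request is held at the upper limit.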

Also, in the above description, a common control amount (gain_B[j] / gain_A) is applied to all acoustic signals belonging to one divided band. However, the control amount may be smoothed in the frequency direction so that it is not discontinuous at the boundary between adjacent divided bands. For example, the amplification factors gain_B[0], gain_B[1], gain_B[2], ..., which change stepwise every 128 sub-bands, may be smoothed in the frequency direction to set a function g[f] of f such as that represented by the broken-line curve 210 in FIG. 5, and the volume control unit 15 may perform volume control using this function and gain_A. In this case, mdct_OUT[f]' is obtained by multiplying mdct_OUT[f] by g[f] / gain_A.
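One possible way to obtain such a function g[f] is sketched below. The source does not say how the smoothing is done; linear interpolation between band centres is an assumption chosen for brevity, and the helper names `smooth_gains` and `apply_smoothed` are hypothetical.

```python
BAND_WIDTH = 128  # sub-bands per divided band


def smooth_gains(gain_b):
    """Expand stepwise per-band gains into a per-sub-band curve g[f]
    by linear interpolation between band centres (an assumed method)."""
    n_bands = len(gain_b)
    centres = [j * BAND_WIDTH + (BAND_WIDTH - 1) / 2 for j in range(n_bands)]
    g = []
    for f in range(n_bands * BAND_WIDTH):
        if f <= centres[0]:
            g.append(gain_b[0])          # flat below the first centre
        elif f >= centres[-1]:
            g.append(gain_b[-1])         # flat above the last centre
        else:
            j = int((f - centres[0]) // BAND_WIDTH)
            t = (f - centres[j]) / BAND_WIDTH
            g.append(gain_b[j] + t * (gain_b[j + 1] - gain_b[j]))
    return g


def apply_smoothed(mdct_out, gain_b, gain_a):
    """mdct_OUT[f]' = mdct_OUT[f] * g[f] / gain_A."""
    g = smooth_gains(gain_b)
    return [x * g[f] / gain_a for f, x in enumerate(mdct_out)]
```

The resulting curve has no jump at band boundaries: neighbouring sub-bands differ by at most 1/128 of the step between adjacent band gains.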

The normalization unit 11 may also be removed from the volume control device 10, although the effect of providing it is then lost. In this case, mdct_OUT[f] is generated from the input acoustic signal pcm[i] itself, that is, from mdct_IN[i] with the amplification factor gain_A set to 1, and the volume control unit 15 generates mdct_OUT[f]' by amplifying the signal mdct_OUT[f] belonging to the j-th divided band by the amplification factor gain_B[j] (for example, mdct_OUT[0]' = mdct_OUT[0] × gain_B[0] and mdct_OUT[128]' = mdct_OUT[128] × gain_B[1]).

<< Second Embodiment >>
A second embodiment of the present invention will be described. In the first embodiment, band-specific volume control is executed in the process of encoding an acoustic signal. In the second embodiment, band-specific volume control is executed in the process of decoding and reproducing an encoded acoustic signal. The second embodiment assumes that the band-specific volume control according to the present invention is applied to an AAC decoder. An AAC decoder, which is a kind of decoder, decodes the encoded acoustic signal generated by an AAC encoder using a predetermined decoding method and reconstructs the acoustic signal as it was before encoding.

In the present embodiment, the encoded acoustic signal is assumed to be a monaural acoustic signal. The AAC decoder uses the inverse modified discrete cosine transform (hereinafter also referred to as IMDCT) as the processing in its filter bank. In the AAC decoder, the processing unit of the IMDCT is 2048 samples or 256 samples of the time-domain signal; in the present embodiment, to simplify the description, the processing unit is assumed to be fixed at 2048 samples.

Accordingly, the time-domain acoustic signal pcm[i] output from the IMDCT unit 33 in FIG. 6 can be divided into frames, each containing signal values for 2048 samples. One frame contains one or more blocks; here, one frame is assumed to consist of one block. As in FIG. 2, time advances in the order of the first frame, the second frame, the third frame, and so on. Each block overlaps the immediately preceding block by half a block length. In this example, since one frame consists of one block, each frame likewise overlaps the immediately preceding frame by half a frame length. As is clear from the above, pcm[i] in the present embodiment is the output acoustic signal of the IMDCT unit 33 and differs from pcm[i] in the first embodiment.

FIG. 6 is a block diagram showing the configuration of a volume control device 30 according to the second embodiment. An encoded acoustic signal, which is an acoustic signal encoded by an AAC encoder, is read from a recording medium (not shown) and supplied to the pre-decoding processing unit 31. The pre-decoding processing unit 31 decodes the encoded acoustic signal to generate and output the input signal for the IMDCT unit 33, which corresponds to the filter bank of the AAC decoder. The output signal of the pre-decoding processing unit 31 is, however, supplied to the IMDCT unit 33 only after its volume has been controlled by the volume control unit 32.

The acoustic signal for one processing unit output from the pre-decoding processing unit 31 consists of 1024 values imdct_IN[f]. As in the first embodiment, f is an integer satisfying the inequality 0 ≤ f ≤ 1023. The output acoustic signal of the pre-decoding processing unit 31 is a frequency-domain acoustic signal, and imdct_IN[f] represents the signal intensity of that output signal in the f-th sub-band.

In the volume control unit 32 and the volume analysis unit 35, the 1024 sub-bands are divided into groups of 128 sub-bands each, whereby the 0th to 7th divided bands are set. The definitions of the f-th sub-band and of the j-th divided band are the same as in the first embodiment (j is an integer from 0 to 7).

The volume control unit 32 controls the volume of the output acoustic signal imdct_IN[f] of the pre-decoding processing unit 31 for each divided band, based on the amplification factors gain_B[0] to gain_B[7] calculated by the volume analysis unit 35. The signal imdct_IN[f] after volume control by the volume control unit 32 is denoted by imdct_IN[f]'.

Of the output acoustic signals imdct_IN[0] to imdct_IN[1023] of the pre-decoding processing unit 31, the signals imdct_IN[j × 128] to imdct_IN[j × 128 + 127] belonging to the j-th divided band are volume-controlled using the amplification factor gain_B[j]. Accordingly, the volume control unit 32 generates imdct_IN[0]' to imdct_IN[127]' by amplifying the signals imdct_IN[0] to imdct_IN[127] belonging to the 0th divided band by the amplification factor gain_B[0] (for example, imdct_IN[0]' = imdct_IN[0] × gain_B[0]). Similarly, it generates imdct_IN[128]' to imdct_IN[255]' by amplifying the signals imdct_IN[128] to imdct_IN[255] belonging to the 1st divided band by the amplification factor gain_B[1] (for example, imdct_IN[128]' = imdct_IN[128] × gain_B[1]). The same applies to the signals imdct_IN[256] to imdct_IN[1023].
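The band-wise amplification described above amounts to indexing the gain by j = f // 128. A minimal sketch follows; the function name `apply_band_gains` and the list-based representation of the coefficients are assumptions made for illustration.

```python
NUM_SUBBANDS = 1024
BAND_WIDTH = 128  # 1024 sub-bands / 8 divided bands


def apply_band_gains(imdct_in, gain_b):
    """Return imdct_IN[f]' = imdct_IN[f] * gain_B[f // 128] for f = 0..1023."""
    if len(imdct_in) != NUM_SUBBANDS:
        raise ValueError("expected 1024 frequency-domain values")
    if len(gain_b) != NUM_SUBBANDS // BAND_WIDTH:
        raise ValueError("expected 8 band gains")
    return [x * gain_b[f // BAND_WIDTH] for f, x in enumerate(imdct_in)]


coeffs = [1.0] * NUM_SUBBANDS
gains = [float(j + 1) for j in range(8)]   # gain_B[0]=1, ..., gain_B[7]=8
scaled = apply_band_gains(coeffs, gains)
```

Here `scaled[0]` to `scaled[127]` use gain_B[0], `scaled[128]` to `scaled[255]` use gain_B[1], and so on up to `scaled[1023]`, which uses gain_B[7].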

The IMDCT unit 33 functions as the band synthesis filter bank of the AAC decoder and synthesizes the acoustic signals imdct_IN[f]' of the individual sub-bands to generate a time-domain acoustic signal. That is, for each processing unit, the IMDCT unit 33 applies the inverse modified discrete cosine transform to the output acoustic signal imdct_IN[f]' of the volume control unit 32, thereby converting the frequency-domain acoustic signal represented by imdct_IN[f]' into the time-domain acoustic signal imdct_OUT[i]. Here, i is an integer from 0 to 2047. In the IMDCT unit 33, 2048 samples of imdct_OUT[i] (that is, imdct_OUT[0] to imdct_OUT[2047]) are obtained from imdct_IN[f]' for one processing unit (that is, imdct_IN[0]' to imdct_IN[1023]').

Further, as shown in FIG. 7, the IMDCT unit 33 multiplies the 2048 samples of imdct_OUT[i] obtained from imdct_IN[0]' to imdct_IN[1023]' of the (k−1)-th processing unit by a predetermined window function to obtain a 2048-sample acoustic signal W_k[i], and multiplies the 2048 samples of imdct_OUT[i] obtained from imdct_IN[0]' to imdct_IN[1023]' of the k-th processing unit by the same window function to obtain a 2048-sample acoustic signal W_{k+1}[i]. It then overlaps W_k[i] and W_{k+1}[i] by 50% in the time direction to generate 1024 samples of the acoustic signal pcm[i] (that is, pcm[0] to pcm[1023]).

That is, as shown in FIG. 7, the latter 1024 samples W_k[1024] to W_k[2047] are extracted from the 2048-sample acoustic signal W_k[i], the former 1024 samples W_{k+1}[0] to W_{k+1}[1023] are extracted from the 2048-sample acoustic signal W_{k+1}[i], and pcm[i] is obtained for each i satisfying 0 ≤ i ≤ 1023 according to the equation "pcm[i] = W_k[1024 + i] + W_{k+1}[0 + i]".
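The overlap step can be written directly from the equation above. The sketch below assumes the two windowed 2048-sample signals are already available as lists; the name `overlap_add` is hypothetical, and the window function itself is not modelled.

```python
HALF = 1024  # half of one 2048-sample processing unit


def overlap_add(w_k, w_k1):
    """pcm[i] = W_k[1024 + i] + W_{k+1}[i] for 0 <= i <= 1023."""
    if len(w_k) != 2 * HALF or len(w_k1) != 2 * HALF:
        raise ValueError("both windowed signals must hold 2048 samples")
    return [w_k[HALF + i] + w_k1[i] for i in range(HALF)]


# Toy data standing in for windowed IMDCT output (not real audio)
w_k = list(range(2048))
w_k1 = [10] * 2048
pcm = overlap_add(w_k, w_k1)
```

Only the second half of the earlier signal and the first half of the later signal contribute, so each call yields 1024 new time-domain samples.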

The 1024 samples of pcm[i] obtained by the IMDCT unit 33 and the 1024 samples of pcm[i] that follow them in time together form one frame of the time-domain acoustic signal (2048 samples in the time domain).

The FFT unit 34 contained in the volume analysis unit 35 calculates a frequency-domain acoustic signal by applying a Fourier transform to the 2048 samples of the acoustic signal pcm[i] of the frame of interest. A fast Fourier transform can be used as the Fourier transform in the FFT unit 34. The acoustic signal calculated by the FFT unit 34 consists of a real part fft_r[f] and an imaginary part fft_i[f]. In the present embodiment, fft_r[f] and fft_i[f] are, respectively, the real part and the imaginary part of the Fourier transform of the output acoustic signal pcm[i] of the IMDCT unit 33. The fast Fourier transform, which is a kind of discrete Fourier transform, subdivides the entire frequency band of the acoustic signal pcm[i] into 1024 sub-bands. Accordingly, f in fft_r[f] and fft_i[f] is also an integer from 0 to 1023.

When the power of the acoustic signal pcm[i] in the f-th sub-band is denoted by fft_pw[f], the power fft_pw[f] is calculated by formula (A2) above. In the present embodiment, however, fft_pw[f] is a power based on the output acoustic signal pcm[i] of the IMDCT unit 33. For each divided band, the volume analysis unit 35 detects the maximum value of the power fft_pw[f] within that divided band and calculates the amplification factor gain_B[j] so that this maximum value becomes the target level T_lev in the output signal of the volume control unit 32.

The method of calculating the amplification factor gain_B[j] for each divided band is the same as that described in the first embodiment. The amplification factor gain_B[j] is calculated for each frame. The target level T_lev is a value determined so that the acoustic signal of each divided band output from the volume control unit 32 has a constant volume, and it can be set in advance to a desired value. For example, −20 dB relative to 16-bit full scale can be set as the target level T_lev. Since the volume of an acoustic signal depends on its power, the volume analysis unit 35 can be said to analyze the volume of the output acoustic signal of the IMDCT unit 33 for each divided band.
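The per-band analysis can be sketched roughly as below. Formula (A2) and the exact gain formula of the first embodiment are not reproduced in this section, so two assumptions are made and should be read as such: the power is taken to be the squared magnitude fft_r[f]² + fft_i[f]², and the gain is taken to be the square root of the ratio between a target power and the detected peak power. The names `band_peak_power`, `band_gains`, `t_lev_power` and `floor` are hypothetical.

```python
import math

BAND_WIDTH = 128  # sub-bands per divided band


def band_peak_power(fft_r, fft_i):
    """Maximum power in each divided band.
    Assumes power = real^2 + imag^2 (formula (A2) is not shown here)."""
    power = [r * r + i * i for r, i in zip(fft_r, fft_i)]
    return [max(power[s:s + BAND_WIDTH])
            for s in range(0, len(power), BAND_WIDTH)]


def band_gains(fft_r, fft_i, t_lev_power, floor=1e-12):
    """Amplitude gain per band that would bring the peak power to t_lev_power.
    The square-root relation is an assumption; `floor` avoids division by zero
    for silent bands (a practical design would also cap the gain)."""
    return [math.sqrt(t_lev_power / max(p, floor))
            for p in band_peak_power(fft_r, fft_i)]


fft_r = [0.0] * 1024
fft_i = [0.0] * 1024
fft_r[5], fft_i[5] = 3.0, 4.0     # power 25 in divided band 0
fft_r[130] = 10.0                 # power 100 in divided band 1
peaks = band_peak_power(fft_r, fft_i)
gains = band_gains(fft_r, fft_i, t_lev_power=100.0)
```

In this toy input, band 0 needs an amplitude gain of 2 and band 1 a gain of 1 to reach the same target power.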

As imdct_OUT[i] is generated from imdct_IN[f] for one processing unit of the IMDCT unit 33 after another, pcm[i] is obtained successively for as many frames as are contained in the encoded acoustic signal supplied to the pre-decoding processing unit 31. The acoustic signal pcm[i] can be reproduced as sound by supplying it to a speaker via an audio output circuit or the like.

In order to generate the acoustic signal pcm[i] for reproduction in real time from the encoded acoustic signal supplied successively to the pre-decoding processing unit 31, gain_B[j] for the acoustic signal imdct_IN[f] of the current frame is generated from the acoustic signal pcm[i] of a past frame (for example, the previous frame or the frame before it). That is, the volume of the acoustic signal imdct_IN[f] of the current frame is controlled using the amplification factor gain_B[j] obtained by analyzing the volume of the acoustic signal pcm[i] of a past frame. Since one frame lasts only several tens of milliseconds, controlling the volume of the current frame on the basis of the previous frame or the frame before it has almost no audible effect.
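The one-frame lag between analysis and control can be illustrated with a small generator. This is a simplified sketch: `analyze` is a hypothetical stand-in for the whole IMDCT, FFT and per-band analysis chain, and for brevity it is handed the frequency-domain values of the frame rather than the decoded pcm[i] that the source actually analyzes. Using unity gain for the very first frame is also an assumption.

```python
def process_stream(frames, analyze, band_width=128):
    """Yield volume-controlled frames, using gains derived from the previous frame.

    frames  : iterable of 1024-value lists (imdct_IN[f] per processing unit)
    analyze : hypothetical callable returning 8 band gains for a frame
    """
    gains = [1.0] * 8                      # no past frame yet: unity gain (assumed)
    for coeffs in frames:
        # Current frame is scaled with gains computed from the past frame
        yield [x * gains[f // band_width] for f, x in enumerate(coeffs)]
        # Gains for the next frame come from the frame just processed
        gains = analyze(coeffs)


def peak_analyze(coeffs):
    """Toy analysis: one common gain that would bring the peak to 1.0."""
    peak = max(abs(x) for x in coeffs)
    return [1.0 / peak] * 8


frames = [[1.0] * 1024, [2.0] * 1024, [4.0] * 1024]
out = list(process_stream(frames, peak_analyze))
```

In the toy run, the third frame (amplitude 4) is scaled by the gain derived from the second frame (amplitude 2), which shows the lag the text describes.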

However, gain_B[j] for the acoustic signal imdct_IN[f] of the current frame can also be generated from the acoustic signal pcm[i] of the current frame. In this case, gain_B[0] to gain_B[7] are first all set to 1 and pcm[i] of the current frame is obtained; gain_B[0] to gain_B[7] for the current frame are then set again on the basis of that pcm[i], and the volume of imdct_IN[f] of the current frame is controlled using the newly set gain_B[0] to gain_B[7].

The components of the volume control device 30 include the volume control unit 32 and the volume analysis unit 35. The pre-decoding processing unit 31 and/or the IMDCT unit 33 may or may not be regarded as components of the volume control device 30.

FIG. 8 shows a schematic configuration of an AAC decoder 40 to which the volume control device 30 can be applied. The AAC decoder 40 includes a pre-decoding processing unit 31a having the same function as the pre-decoding processing unit 31 and an IMDCT unit 33a having the same function as the IMDCT unit 33. When the AAC decoder 40 is used in combination with the volume control device 30, the encoded acoustic signal is supplied to the pre-decoding processing unit 31a, the acoustic signal imdct_IN[f] output from the pre-decoding processing unit 31a is input to the volume control unit 32, and the acoustic signal imdct_IN[f]' output from the volume control unit 32 is supplied to the IMDCT unit 33a. The output acoustic signal pcm[i] of the IMDCT unit 33a is then supplied to the volume analysis unit 35.

In the present embodiment, the frequency-domain acoustic signal input to the filter bank of the decoder (the IMDCT unit 33) is the target of band-specific volume control. The plurality of band-pass filters used in the conventional configuration shown in FIG. 15 are therefore unnecessary, and band-specific volume control can be realized by adding only small-scale software processing or circuitry.

<< Third Embodiment >>
A third embodiment of the present invention will be described. As in the second embodiment, the third embodiment assumes that the present invention is applied to an AAC decoder. An AAC decoder, which is a kind of decoder, decodes the encoded acoustic signal generated by an AAC encoder using a predetermined decoding method and reconstructs the acoustic signal as it was before encoding.

In the present embodiment, the encoded acoustic signal is assumed to be a monaural acoustic signal. The AAC decoder uses the IMDCT (inverse modified discrete cosine transform) as the processing in its filter bank. In the AAC decoder, the processing unit of the IMDCT is 2048 samples or 256 samples of the time-domain signal; in the present embodiment, to simplify the description, the processing unit is assumed to be fixed at 2048 samples.

FIG. 9 is a block diagram showing the configuration of a volume control device 50 according to the third embodiment. In the present embodiment, imdct_IN[f], imdct_IN[f]' and pcm[i] are the acoustic signals output from the pre-decoding processing unit 51, the normalization unit 52 and the IMDCT unit 53, respectively, and gain_A and gain_B are the amplification factors calculated by the normalization unit 52 and the volume analysis unit 54, respectively.

An encoded acoustic signal, which is an acoustic signal encoded by an AAC encoder, is read from a recording medium (not shown) and supplied to the pre-decoding processing unit 51. The pre-decoding processing unit 51 decodes the encoded acoustic signal to generate and output the input signal for the IMDCT unit 53, which corresponds to the filter bank of the AAC decoder. The output signal of the pre-decoding processing unit 51 is, however, supplied to the IMDCT unit 53 only after its signal level has been normalized by the normalization unit 52.

The acoustic signal for one processing unit output from the pre-decoding processing unit 51 consists of 1024 values imdct_IN[f]. As in the first embodiment, f is an integer satisfying the inequality 0 ≤ f ≤ 1023. The output acoustic signal of the pre-decoding processing unit 51 is a frequency-domain acoustic signal, and imdct_IN[f] represents the signal intensity of that output signal in the f-th sub-band. The definition of the f-th sub-band is the same as in the first embodiment.

The normalization unit 52 detects the maximum value among the acoustic signals imdct_IN[0] to imdct_IN[1023] for one processing unit output from the pre-decoding processing unit 51, and calculates the amplification factor gain_A required to make the detected maximum value equal to the maximum digital value that can be expressed in 16 bits (that is, 65535). It then amplifies all of the acoustic signals imdct_IN[0] to imdct_IN[1023] of the processing unit of interest by the amplification factor gain_A. The result of amplifying imdct_IN[f] by the amplification factor gain_A is denoted by imdct_IN[f]'. The normalization unit 52 generates imdct_IN[f]' from imdct_IN[f] according to the following equation (C1).
imdct_IN[f]' = imdct_IN[f] × gain_A ... (C1)

For example, when the maximum value among imdct_IN[0] to imdct_IN[1023] is imdct_IN[200], the amplification factor gain_A is calculated according to the equation "gain_A = 65535 / imdct_IN[200]", and imdct_IN[f]' is derived by multiplying imdct_IN[f] by gain_A. In this way, the normalization unit 52 normalizes the signal level of the input acoustic signal supplied to the AAC decoder (more precisely, the signal level of the input acoustic signal supplied to the IMDCT unit 53, which corresponds to the filter bank of the AAC decoder). As is clear from the above, this normalization makes the maximum signal value within the processing unit of interest equal to a predetermined target value. The maximum digital value that can be expressed in 16 bits (that is, 65535) is given here as an example of the target value, but the method of setting the target value is not limited to this. The output signal of the normalization unit 52, that is, the input acoustic signal imdct_IN[f]' after normalization by the normalization unit 52, is supplied to the IMDCT unit 53.
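Equation (C1) and the example above can be sketched as follows. The source speaks only of "the maximum value"; taking the maximum of the absolute values (so that negative values are handled) and returning a gain of 1 for an all-zero unit are assumptions made for this illustration, and the name `normalize` is hypothetical.

```python
TARGET = 65535  # example target value given in the text


def normalize(imdct_in, target=TARGET):
    """Return (gain_A, imdct_IN') with imdct_IN[f]' = imdct_IN[f] * gain_A."""
    peak = max(abs(x) for x in imdct_in)   # absolute value: an assumption
    if peak == 0:
        return 1.0, list(imdct_in)         # silent unit: leave unchanged (assumed)
    gain_a = target / peak
    return gain_a, [x * gain_a for x in imdct_in]


coeffs = [0.0] * 1024
coeffs[200] = 13107.0      # largest value in this processing unit
coeffs[10] = -100.0
gain_a, normalized = normalize(coeffs)
```

In this toy input the peak sits at index 200, so gain_A = 65535 / 13107 = 5, and every value in the processing unit is scaled by the same factor.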

The IMDCT unit 53 functions as the band synthesis filter bank of the AAC decoder and synthesizes the acoustic signals imdct_IN[f]' of the individual sub-bands to generate a time-domain acoustic signal. That is, for each processing unit, the IMDCT unit 53 applies the inverse modified discrete cosine transform to the acoustic signal imdct_IN[f]' supplied from the normalization unit 52, thereby converting the frequency-domain acoustic signal represented by imdct_IN[f]' into the time-domain acoustic signal imdct_OUT[i]. Here, i is an integer from 0 to 2047. In the IMDCT unit 53, 2048 samples of imdct_OUT[i] (that is, imdct_OUT[0] to imdct_OUT[2047]) are obtained from imdct_IN[f]' for one processing unit (that is, imdct_IN[0]' to imdct_IN[1023]'). Further, through window function processing and overlap processing, the IMDCT unit 53 generates 1024 samples of the acoustic signal pcm[i] (that is, pcm[0] to pcm[1023]) from imdct_OUT[i] of two temporally adjacent processing units. The generation method is the same as that described in the second embodiment.

The volume analysis unit 54 detects the maximum value pcm_MAX among the 1024 samples of the acoustic signal pcm[i], and calculates the amplification factor gain_B so that (pcm_MAX / gain_A) becomes the target level T_lev as a result of the volume control by the volume control unit 55. That is, the amplification factor gain_B is calculated according to the following equation (C2). The amplification factor gain_B is calculated for every 1024 samples of the acoustic signal pcm[i].
gain_B = (T_lev × gain_A) / pcm_MAX ... (C2)

The target level T_lev is a value determined so that the acoustic signal output from the volume control unit 55 has a constant volume, and it can be set in advance to a desired value. For example, −6 dB relative to 16-bit full scale can be set as the target level T_lev. Since the volume of an acoustic signal depends on its signal level (the value of pcm[i]), the volume analysis unit 54 can be said to analyze the volume of the output acoustic signal of the IMDCT unit 53.

The volume control unit 55 controls the volume of the output acoustic signal of the IMDCT unit 53 by amplifying pcm[i] by the amplification factor (gain_B / gain_A), which is based on the amplification factor gain_A serving as normalization information and on the amplification factor gain_B. When the signal pcm[i] after volume control by the volume control unit 55 is denoted by pcm[i]', the relation
pcm[i]' = pcm[i] × gain_B / gain_A
holds. The amplification factor (gain_B / gain_A) based on the acoustic signal imdct_IN[f] of a given processing unit of interest is used to amplify pcm[0] to pcm[1023] of that processing unit.
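Equation (C2) and the amplification by (gain_B / gain_A) can be checked with a short sketch. Several points are assumptions: the peak is taken over absolute values, a silent block is passed through unchanged, and the decibel conversion uses 65535 as "16-bit full scale" to match the example target value given for the normalization unit. All function names are hypothetical.

```python
def t_lev_from_db(db, full_scale=65535):
    """Target level for a dB value relative to full scale (amplitude ratio)."""
    return full_scale * 10 ** (db / 20)


def gain_b_for(pcm, gain_a, t_lev):
    """gain_B = (T_lev * gain_A) / pcm_MAX, equation (C2)."""
    pcm_max = max(abs(s) for s in pcm)     # absolute value: an assumption
    if pcm_max == 0:
        return gain_a                      # silent block: overall ratio 1 (assumed)
    return (t_lev * gain_a) / pcm_max


def control_volume(pcm, gain_a, gain_b):
    """pcm[i]' = pcm[i] * gain_B / gain_A."""
    ratio = gain_b / gain_a
    return [s * ratio for s in pcm]


pcm = [1000.0, -4000.0, 2000.0]            # toy block, already scaled by gain_A
gain_a = 4.0
t_lev = 8000.0
gain_b = gain_b_for(pcm, gain_a, t_lev)
pcm_out = control_volume(pcm, gain_a, gain_b)
```

Because gain_B already contains gain_A, dividing by gain_A in the final step removes the normalization, and the peak of the output lands on T_lev regardless of how strongly the block was amplified before the IMDCT.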

As imdct_OUT[i] is generated from imdct_IN[f] for one processing unit of the IMDCT unit 53 after another, pcm[i]' is obtained successively for as many frames as are contained in the encoded acoustic signal supplied to the pre-decoding processing unit 51. The acoustic signal pcm[i]' can be reproduced as sound by supplying it to a speaker via an audio output circuit or the like.

In the configuration of FIG. 9, the amplification factor (gain_B / gain_A) is calculated in the volume analysis unit 54 and supplied from the volume analysis unit 54 to the volume control unit 55. Alternatively, the volume control unit 55 may calculate the amplification factor (gain_B / gain_A) from gain_A output by the normalization unit 52 and gain_B output by the volume analysis unit 54.

The components of the volume control device 50 include the normalization unit 52, the volume analysis unit 54 and the volume control unit 55. The pre-decoding processing unit 51 and/or the IMDCT unit 53 may or may not be regarded as components of the volume control device 50.

FIG. 10 shows a schematic configuration diagram of an AAC decoder 60 to which the volume control device 50 can be applied. The AAC decoder 60 includes a decoding preprocessing unit 51a having the same function as the decoding preprocessing unit 51, and an IMDCT unit 53a having the same function as the IMDCT unit 53. When the AAC decoder 60 is used in combination with the volume control device 50, the encoded acoustic signal is supplied to the decoding preprocessing unit 51a, and the acoustic signal imdct_IN[f] that it outputs is input to the normalization unit 52. The acoustic signal imdct_IN[f]′ output from the normalization unit 52 is then given to the IMDCT unit 53a. Finally, the output acoustic signal pcm[i] of the IMDCT unit 53a is given to the volume analysis unit 54 and the volume control unit 55, so that the volume control unit 55 generates the acoustic signal pcm[i]′.

When band-specific volume control is performed downstream of the decoder's filter bank (the IMDCT unit 53), as shown in FIG. 9, the filter bank processing (that is, the IMDCT) may be performed on a low-level signal. When filter bank processing is performed on a low-level signal, calculation errors in that processing (rounding errors, loss of significant digits, and the like) have a larger relative effect. Amplifying a signal that contains a large amount of calculation error downstream of the filter bank increases the degradation of sound quality. In view of this, the volume control device 50 normalizes, that is, amplifies, the signal level of the input acoustic signal to the decoder's filter bank (the IMDCT unit 53) in the normalization unit 52, and inputs the amplified acoustic signal to the filter bank. This reduces the influence of the calculation errors and suppresses degradation of sound quality.
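The effect described above can be illustrated with a toy numerical model. The sketch below is not an IMDCT: it replaces the filter bank with a single rounding step that stands in for limited-precision arithmetic, and the sample values and gains are arbitrary assumptions. It compares amplifying after the rounding step (no normalization) with amplifying by gain_A before it and applying gain_B / gain_A afterwards.

```python
import math
from typing import List, Sequence


def quantize(x: Sequence[float]) -> List[float]:
    """Round each sample to the nearest integer (toy stand-in for fixed-point error)."""
    return [float(math.floor(v + 0.5)) for v in x]


def max_error(y: Sequence[float], ref: Sequence[float]) -> float:
    """Largest absolute deviation from the ideal (error-free) result."""
    return max(abs(a - b) for a, b in zip(y, ref))


quiet = [0.3, 1.4, -0.6, 2.2]   # low-level signal (arbitrary example values)
total_gain = 8.0                # desired overall amplification
reference = [v * total_gain for v in quiet]

# Without normalization: the rounding happens at low level, then the error is amplified.
late = [v * total_gain for v in quantize(quiet)]

# With normalization: amplify by gain_a first, round, then apply gain_b / gain_a.
gain_a = 8.0
gain_b = 8.0  # assumption: chosen so that the overall gain equals total_gain
early = [v * gain_b / gain_a for v in quantize([s * gain_a for s in quiet])]

err_late = max_error(late, reference)
err_early = max_error(early, reference)
print(f"error without normalization: {err_late:.2f}")
print(f"error with normalization:    {err_early:.2f}")
```

In this model the rounding error stays at most half a step in both cases, but without normalization it is multiplied by the later amplification, which is the degradation the text describes.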

In the above description, the control amount (gain_B / gain_A) of the volume control unit 55 is determined independently for each processing unit (that is, for every 1024 samples in the time domain). However, the control amount may be given a transient characteristic so that the volume changes gradually between temporally adjacent processing units, and the control amount may also be constrained so that (gain_B / gain_A) stays within a predetermined range.
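One possible way to realize both options is sketched below. The text does not specify a smoothing method or a range, so the first-order smoothing, the coefficient `alpha`, and the limits `lo` and `hi` are all assumptions made for illustration.

```python
from typing import List, Sequence


def shape_control_amounts(raw: Sequence[float],
                          alpha: float = 0.5,
                          lo: float = 0.25,
                          hi: float = 4.0) -> List[float]:
    """Clamp each per-unit control amount to [lo, hi], then smooth it over time.

    raw[k] is the control amount (gain_B / gain_A) computed independently for
    processing unit k. alpha in [0, 1) weights the previous shaped value;
    alpha = 0 disables smoothing.
    """
    shaped: List[float] = []
    prev = None
    for g in raw:
        g = min(max(g, lo), hi)                # keep within the predetermined range
        if prev is not None:
            g = alpha * prev + (1.0 - alpha) * g   # gradual change between units
        shaped.append(g)
        prev = g
    return shaped


print(shape_control_amounts([1.0, 8.0, 1.0], alpha=0.5, lo=0.5, hi=4.0))
```

Because the clamp is applied before the smoothing and the smoothing is a weighted average of in-range values, the shaped control amount also stays within [lo, hi].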

<< Fourth Embodiment >>
A fourth embodiment of the present invention will be described. The fourth embodiment illustrates equipment to which the volume control devices described in the first to third embodiments are applied. The volume control device according to any one of the first to third embodiments can be applied to any electronic device that includes an acoustic signal processing device. Such electronic devices include recording devices (such as IC recorders), acoustic signal reproducing devices, and imaging devices. An imaging device can also provide the function of a recording device, the function of an acoustic signal reproducing device, or both. In addition, the recording device, the acoustic signal reproducing device, or the imaging device can be incorporated in a mobile terminal (such as a mobile phone).

As an example, FIG. 11 shows a schematic configuration diagram of a recording device 100. The recording device 100 includes a microphone unit 101 that converts the ambient sound of the recording device 100 into an acoustic signal and outputs it, an acoustic signal processing device 102, and a recording medium 103 consisting of a magnetic disk, a semiconductor memory, or the like. The acoustic signal processing device 102 can include the volume control device 10 according to the first embodiment, or can include the volume control device 10 and the AAC encoder 20 (see FIGS. 1 and 4). The acoustic signal processing device 102 handles the acoustic signal output from the microphone unit 101 as the input acoustic signal pcm[i], generates an encoded acoustic signal, and can record it on the recording medium 103, which serves as the recording medium 17 of FIG. 1.

FIG. 12 shows a schematic configuration diagram of an acoustic signal reproducing device 120. The acoustic signal reproducing device 120 includes an acoustic signal processing device 121, a recording medium 122 consisting of a magnetic disk, a semiconductor memory, or the like, and a speaker unit 123. An encoded acoustic signal is assumed to be recorded on the recording medium 122. The acoustic signal processing device 121 can include the volume control device 30 or 50 according to the second or third embodiment, or can include the volume control device 30 and the AAC decoder 40, or can include the volume control device 50 and the AAC decoder 60 (see FIGS. 6 and 8 to 10).

Based on the encoded acoustic signal read from the recording medium 122, the acoustic signal processing device 121 outputs the acoustic signal pcm[i] from the IMDCT unit 33 (see FIG. 6) within the acoustic signal processing device 121, or outputs the acoustic signal pcm[i]′ from the volume control unit 55 (see FIG. 9) within the acoustic signal processing device 121. The acoustic signal pcm[i] or pcm[i]′ can then be reproduced by the speaker unit 123.

Further, FIG. 13 shows a schematic configuration diagram of an imaging device 140. The imaging device 140 is formed by adding, to the components of the recording device 100 of FIG. 11, an image sensor 144 consisting of a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) image sensor or the like, an image processing unit 145 that applies predetermined image processing to images captured with the image sensor 144, a display unit 146 that displays captured images, a speaker unit 147 that outputs sound, and so on. The microphone unit 101, the acoustic signal processing device 102, and the recording medium 103 provided in the imaging device 140 are the same as those of the recording device 100.

The imaging device 140 captures a moving image or a still image of the subject using the image sensor 144. An image signal representing the moving image or still image (for example, a video signal in YUV format) is recorded on the recording medium 103 via the image processing unit 145. In particular, when a moving image is captured, an encoded acoustic signal based on the output acoustic signal of the microphone unit 101 (the encoded acoustic signal generated by the volume control device 10 in the acoustic signal processing device 102) and the image signal of the moving image are associated with each other in time and then recorded on the recording medium 103. Note that the imaging device 140 may perform the volume control described in the second or third embodiment at the reproduction stage of the acoustic signal rather than at the recording stage.

<< Modifications, etc. >>
Notes 1 to 4 below describe modifications of, or remarks on, the above-described embodiments. The contents of the notes can be combined arbitrarily as long as no contradiction arises.

[Note 1]
The specific numerical values given in the above description are merely examples and can, of course, be changed to various other values. For example, each of the above-described embodiments assumes that the processing unit of the time-domain signal in the MDCT is fixed at 2048 samples, but the number of samples in the processing unit may be other than 2048.

[Note 2]
For simplicity of explanation, each of the above-described embodiments assumes that the acoustic signal subject to volume control is a monaural acoustic signal. However, the acoustic signal subject to volume control may be a stereo or multi-channel acoustic signal consisting of acoustic signals for a plurality of channels. The volume control described above may be performed independently for each channel, or the volumes of the acoustic signals of the plurality of channels may be controlled jointly.
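The two multi-channel options can be contrasted with a small sketch. The text does not define how a per-channel gain is derived or what joint control means in detail, so the RMS-based gain rule, the target level, and the choice to link all channels to the loudest one are assumptions made only for illustration.

```python
import math
from typing import List, Sequence


def rms(x: Sequence[float]) -> float:
    """Root-mean-square level of one channel."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def gain_for(level: float, target: float) -> float:
    """Hypothetical gain rule: bring the measured level to the target level."""
    return target / level if level > 0 else 1.0


def independent_gains(channels: Sequence[Sequence[float]], target: float) -> List[float]:
    """Volume control performed separately for each channel."""
    return [gain_for(rms(ch), target) for ch in channels]


def linked_gains(channels: Sequence[Sequence[float]], target: float) -> List[float]:
    """Joint control: one gain, derived from the loudest channel, for all channels."""
    g = gain_for(max(rms(ch) for ch in channels), target)
    return [g] * len(channels)


stereo = [[1.0, -1.0], [0.5, -0.5]]      # left and right channels (example values)
print(independent_gains(stereo, 1.0))    # each channel reaches the target on its own
print(linked_gains(stereo, 1.0))         # level difference between channels is kept
```

Independent control equalizes the channel levels, while a shared gain preserves the balance between channels; which behavior is preferable depends on the application.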

[Note 3]
The encoding and decoding methods applicable to the volume control device of the present invention need not be those conforming to AAC (Advanced Audio Coding).

[Note 4]
All or part of the functions realized by the volume control device (10, 30, or 50) according to the present invention can be realized by hardware, software, or a combination of hardware and software. When the volume control device (10, 30, or 50) is configured using software, the block diagram of a part realized by software represents a functional block diagram of that part. All or part of the functions realized by the volume control device (10, 30, or 50) may be described as a program, and those functions may be realized by executing the program on a program execution device (for example, a computer).

DESCRIPTION OF SYMBOLS
10 Volume control device
11 Normalization unit
12 MDCT unit (filter bank)
13 FFT unit
14 Volume analysis unit
15 Volume control unit
16 Encoding post-processing unit
20 AAC encoder
30 Volume control device
31 Decoding preprocessing unit
32 Volume control unit
33 IMDCT unit
34 FFT unit
35 Volume analysis unit
40 AAC decoder
50 Volume control device
51 Decoding preprocessing unit
52 Normalization unit
53 IMDCT unit
54 Volume analysis unit
55 Volume control unit
60 AAC decoder

Claims

A volume control device comprising:
a volume analysis unit that analyzes, for each of a plurality of divided bands, the volume of an input acoustic signal on a time domain supplied to an encoder having a filter bank; and
a volume control unit that controls, for each of the divided bands and based on an analysis result of the volume analysis unit, the volume of an acoustic signal on a frequency domain output from the filter bank based on the input acoustic signal.

The volume control device according to claim 1, further comprising a normalization unit that normalizes a signal level of the input acoustic signal, wherein
the output acoustic signal of the filter bank is generated from the normalized input acoustic signal, and
the volume control unit controls the volume of the output acoustic signal of the filter bank for each of the divided bands based on the normalization content in the normalization unit and the analysis result of the volume analysis unit.

A volume control device comprising:
a volume control unit that controls the volume of an input acoustic signal on a frequency domain supplied to a decoder having a filter bank; and
a volume analysis unit that analyzes, for each of a plurality of divided bands, the volume of an acoustic signal on a time domain output from the filter bank based on the input acoustic signal after the volume control by the volume control unit, wherein
the volume control unit controls the volume of the input acoustic signal for each of the divided bands based on an analysis result of the volume analysis unit.

A volume control device comprising:
a normalization unit that normalizes a signal level of an input acoustic signal on a frequency domain supplied to a decoder having a filter bank;
a volume analysis unit that analyzes the volume of an acoustic signal on a time domain output from the filter bank based on the input acoustic signal after normalization by the normalization unit; and
a volume control unit that controls the volume of the output acoustic signal of the filter bank based on the normalization content in the normalization unit and an analysis result of the volume analysis unit.

An electronic apparatus comprising the volume control device according to claim 1.