JP2006293400A

JP2006293400A - Encoding device and decoding device

Info

Publication number: JP2006293400A
Application number: JP2006191278A
Authority: JP
Inventors: Mineo Tsushima; 峰生津島; Takeshi Norimatsu; 武志則松; Kosuke Nishio; 孝祐西尾; Naoya Tanaka; 直也田中
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2001-11-14
Filing date: 2006-07-12
Publication date: 2006-10-26
Anticipated expiration: 2022-11-05
Also published as: JP4308229B2

Abstract

PROBLEM TO BE SOLVED: To provide an encoding device capable of encoding at a high compression ratio and a decoding device capable of decoding a wideband frequency spectrum. SOLUTION: The encoding device (200) includes an MDCT unit (202) that transforms a time-domain input signal into a frequency spectrum, a BWE encoding unit (204) that, by referring to the lower frequency spectrum included in the transformed frequency spectrum, generates extension information specifying a higher frequency spectrum located at frequencies above the lower frequency spectrum, and an encoded data stream generating unit (205) that encodes and outputs the lower frequency spectrum and the extension information. As the extension information, the BWE encoding unit (204) generates a first parameter that specifies which of the lower subbands forming the lower frequency spectrum obtained by the MDCT unit (202) is to be copied as the higher frequency spectrum, and a second parameter that specifies the gain of the lower subband after copying. COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to an encoding device that compresses information by encoding, with a shorter encoded sequence, an audio signal such as speech or music that has been converted from the time domain to the frequency domain using a technique such as an orthogonal transform, and to a decoding device that takes the encoded sequence as input and decompresses the information.

A great many methods for encoding and decoding audio signals have been developed to date. In recent years in particular, IS13818-7, internationally standardized by ISO/IEC, has become widely recognized and is regarded as a high-quality, high-efficiency encoding method. This encoding method is called AAC. More recently, AAC has also been adopted in the standard called MPEG4, and a method called MPEG4-AAC, which adds several extended functions to IS13818-7, has been defined. An example of the encoding process is described in the INFORMATIVE PART of the standard.

A conventional audio encoding device is described here with reference to FIG. 13. FIG. 13 is a block diagram showing the configuration of a conventional encoding device 100. The encoding device 100 includes a spectrum amplification unit 101, a spectrum quantization unit 102, a Huffman encoding unit 103, and an encoded sequence transfer unit 104. A discrete audio signal sequence on the time axis, obtained by sampling an analog audio signal at a predetermined frequency, is cut out a fixed number of samples at a time at fixed time intervals, converted into data on the frequency axis by a time-frequency conversion unit (not shown), and then supplied to the spectrum amplification unit 101 as the input signal of the encoding device 100. The spectrum amplification unit 101 amplifies the spectrum contained in each predetermined band with a single gain for that band. The spectrum quantization unit 102 quantizes the amplified spectrum using a predetermined conversion formula. In the AAC method, quantization is performed by rounding frequency spectrum information expressed as floating-point numbers to integer values. The Huffman encoding unit 103 Huffman-encodes the quantized spectrum information in groups of several values, also Huffman-encodes information such as the per-band gains used in the spectrum amplification unit 101 and the information specifying the quantization conversion formula, and sends the resulting codes to the encoded sequence transfer unit 104. The Huffman-encoded sequence is transferred from the encoded sequence transfer unit 104 to a decoding device via a transmission path or a recording medium, and the decoding device reproduces it as an audio signal on the time axis. The conventional encoding device operates in this way.

However, in the conventional encoding device 100, the ability to compress the amount of information depends on the performance of the Huffman encoding unit 103 and similar components. To encode at a high compression ratio, that is, with a small amount of information, the gain in the spectrum amplification unit 101 must be made sufficiently small so that the quantized spectrum sequence obtained by the spectrum quantization unit 102 is encoded by the Huffman encoding unit 103 with a small amount of information. When encoding is performed in this way, the frequency band of the reproduced speech and music becomes narrow. As a result, the sound is perceived as muffled, and sufficient sound quality cannot be ensured.
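The trade-off described above can be illustrated with a small Python sketch: each band is scaled by one gain and then rounded to integers, and a smaller gain leaves fewer non-zero values to encode. The band layout, gain values, and function name are hypothetical, and actual AAC quantization uses a non-linear conversion formula that is not reproduced here.

```python
def amplify_and_quantize(spectrum, band_edges, gains):
    """Scale each band by its single gain, then round to the nearest integer.

    band_edges: hypothetical list of (start, end) index pairs, end exclusive.
    gains: one gain per band (assumed values, not taken from the patent).
    """
    quantized = [0] * len(spectrum)
    for (start, end), gain in zip(band_edges, gains):
        for i in range(start, end):
            quantized[i] = int(round(spectrum[i] * gain))
    return quantized


spectrum = [10.4, -3.6, 0.4, 0.3, 7.9, -0.2]   # toy frequency-domain values
bands = [(0, 3), (3, 6)]
print(amplify_and_quantize(spectrum, bands, [1.0, 1.0]))  # larger gain, more detail kept
print(amplify_and_quantize(spectrum, bands, [0.1, 0.1]))  # smaller gain, most values become 0
```

With the smaller gain, most coefficients round to zero, which shortens the Huffman-coded sequence but discards spectral detail.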

In view of the above problem, an object of the present invention is to provide an encoding device that can encode an audio signal at a high compression ratio and a decoding device that can decode wideband frequency spectrum information.

To solve the above problem, the encoding device of the present invention is a device that encodes an input signal, comprising: time-frequency conversion means for converting an input signal on the time axis into a frequency spectrum; band extension means for generating, by referring to a first frequency spectrum included in the converted frequency spectrum, extension information that specifies a second frequency spectrum at frequencies higher than the first frequency spectrum; and encoding means for encoding and outputting the first frequency spectrum obtained by the time-frequency conversion means and the extension information obtained by the band extension means. The band extension means generates, as the extension information, a first parameter that specifies, from among a plurality of partial spectra constituting the first frequency spectrum obtained by the time-frequency conversion means, the partial spectrum to be copied as the second frequency spectrum, and a second parameter that specifies the gain of the partial spectrum after copying.

The decoding device of the present invention is a device that decodes an encoded signal. The encoded signal includes a first frequency spectrum and extension information containing first and second parameters that specify a second frequency spectrum at frequencies higher than the first frequency spectrum. The decoding device comprises: decoding means for generating the first frequency spectrum and the extension information by decoding the encoded signal; band extension means for generating the second frequency spectrum from the first frequency spectrum and the first and second parameters; and frequency-time conversion means for converting the frequency spectrum obtained by combining the generated second frequency spectrum with the first frequency spectrum into a signal on the time axis. The band extension means copies, from among a plurality of partial spectra constituting the first frequency spectrum, the partial spectrum specified by the first parameter, determines the gain of the copied partial spectrum according to the second parameter, and generates the resulting partial spectrum as the second frequency spectrum.
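The decoder-side band extension described above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the function name, the list-based spectrum, and the way subband positions are passed in are all hypothetical and are not taken from the patent.

```python
def extend_band(low_spectrum, subband_starts, sbw, choices, gains):
    """Build the second (higher) spectrum from copies of lower partial spectra.

    subband_starts: start index of each candidate lower subband (assumed layout).
    sbw: number of coefficients per subband.
    choices: first parameter, one candidate index per higher subband.
    gains: second parameter, one gain per higher subband.
    """
    high = []
    for choice, gain in zip(choices, gains):
        start = subband_starts[choice]
        source = low_spectrum[start:start + sbw]   # partial spectrum to copy
        high.extend(gain * c for c in source)      # gain applied after copying
    return high


low = [4.0, 2.0, -1.0, 3.0]   # toy lower spectrum: two subbands of width 2
high = extend_band(low, [0, 2], 2, choices=[1, 0], gains=[0.5, 0.25])
full = low + high             # combined spectrum handed to the inverse transform
print(full)
```

Only the two small parameter lists travel in the bitstream for the higher band; the coefficient values themselves are rebuilt from the decoded lower spectrum.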

As described above, the encoding device of the present invention can provide a wideband audio encoded sequence at a low bit rate. For low-frequency components, the encoding device encodes the fine structure of the spectrum using a compression technique such as Huffman coding. For high-frequency components, it does not encode the fine structure and mainly encodes only the information needed to copy the low-frequency spectrum as a substitute for the high-frequency spectrum. The amount of information consumed by the encoded sequence representing the high-frequency components can therefore be minimized.

Accordingly, because the decoding device of the present invention generates the high-frequency components during decoding by applying processing such as gain adjustment to copies of the low-frequency components, wideband reproduced sound can be obtained from an encoded sequence with a small amount of data.

The band extension means may also add a noise spectrum to the generated second frequency spectrum, and the frequency-time conversion means may convert the frequency spectrum obtained by combining the second frequency spectrum, to which the noise spectrum has been added, with the first frequency spectrum into a signal on the time axis.

Accordingly, because the decoding device of the present invention adds a noise spectrum to the second frequency spectrum and applies gain adjustment to the copied low-frequency components, the bandwidth can be widened without excessively raising the tonality of the second frequency spectrum.
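A minimal Python sketch of adding a noise spectrum to the copied higher band follows. The text does not specify the noise distribution, its level, or how it interacts with the gain step, so the uniform noise, the `noise_level` parameter, and the fixed seed used here are assumptions made purely for illustration.

```python
import random


def add_noise(high_spectrum, noise_level, seed=0):
    """Add uniform noise in [-noise_level, +noise_level] to each coefficient.

    The distribution, level, and seeding are assumptions, not the patent's method.
    """
    rng = random.Random(seed)   # seeded so the sketch is repeatable
    return [c + noise_level * rng.uniform(-1.0, 1.0) for c in high_spectrum]


copied = [0.5, -0.5, 0.25, 0.0]   # toy higher-band coefficients after copying
noisy = add_noise(copied, noise_level=0.05)
print(noisy)
```

Because the noise is spread across every coefficient, strong peaks copied from the lower band stand out less, which is one plausible reading of how the tonality of the higher band is kept from rising too far.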

An encoding device and a decoding device according to embodiments of the present invention are described below with reference to the drawings (FIGS. 1 to 12).

(Embodiment 1)
First, the encoding device will be described. FIG. 1 is a block diagram showing the configuration of an encoding device 200 according to Embodiment 1 of the present invention. The encoding device 200 divides the low-frequency part of the spectrum into subbands of a fixed frequency width and outputs an audio encoded bitstream that includes information specifying the subbands to be copied to the high-frequency part. It includes a preprocessing unit 201, an MDCT unit 202, a quantization unit 203, a BWE encoding unit 204, and an encoded sequence generation unit 205.

The preprocessing unit 201 takes into account that the sound quality of the input audio signal sequence changes because of quantization distortion introduced during encoding and decoding, and determines whether it is better to prioritize time resolution and quantize in frame units finer than 2048 samples (SHORT window) or to quantize at the full 2048-sample size (LONG window).

The MDCT unit 202 applies a modified discrete cosine transform (MDCT) to the discrete audio signal sequence on the time axis output by the preprocessing unit 201, and outputs a frequency spectrum on the frequency axis. The quantization unit 203 quantizes the low-frequency part of the frequency spectrum output from the MDCT unit 202, Huffman-encodes it, and outputs the result.

The BWE encoding unit 204 takes the MDCT coefficients obtained by the MDCT unit 202 as input, divides the low-frequency part of the input spectrum into subbands of a fixed frequency width, and, based on the high-frequency part of the frequency spectrum output from the MDCT unit 202, identifies the low-frequency subbands to be copied to the high-frequency part in place of the high-frequency spectrum.

The BWE encoding unit 204 generates, for each high-frequency subband, extended frequency spectrum information indicating the identified low-frequency subband, quantizes the generated extended frequency spectrum information if necessary, Huffman-encodes it, and outputs an extended audio encoded sequence. The encoded sequence generation unit 205 records the low-frequency audio encoded sequence output from the quantization unit 203 and the extended audio encoded sequence output from the BWE encoding unit 204 in the audio encoded sequence section and the extended audio encoded sequence section, respectively, of the audio encoded stream defined by the AAC standard, and outputs the stream to the outside.

The operation of the encoding device 200 configured as described above is explained below. First, a discrete audio signal sequence sampled at a sampling frequency of, for example, 44.1 kHz is input to the preprocessing unit 201 in frames of 2048 samples. A frame of the audio signal sequence is not limited to 2048 samples, but the 2048-sample case is used here to simplify the description that follows. Based on the input audio signal sequence, the preprocessing unit 201 determines whether to encode it with the LONG window or the SHORT window. The following describes the case in which the preprocessing unit 201 has determined that quantization is to be performed with the LONG window.

The discrete audio signal sequence output from the preprocessing unit 201 is converted, at each time interval, from a discrete signal on the time axis into frequency spectrum information by the time-frequency conversion of the MDCT unit 202, and is then output. The MDCT is commonly used as the time-frequency conversion. The time interval is generally one of 128, 256, 512, 1024, or 2048 samples. With the MDCT, the number of discrete samples on the time axis and the number of samples of frequency spectrum information after conversion can be treated as equal. The MDCT is a technique well known to those skilled in the art. Here, the 2048-sample audio signal output from the preprocessing unit 201 is input to the MDCT unit 202 and subjected to the MDCT. The MDCT unit 202 performs the MDCT using the past frame (2048 samples) and the newly input frame (2048 samples), and outputs 2048 MDCT coefficients. The MDCT is given by a general expression such as (Formula 1).
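As a rough illustration of the transform step, the Python sketch below evaluates the commonly used textbook MDCT definition directly. It is not necessarily identical to the patent's (Formula 1), which is not reproduced in this text; the function name is hypothetical, no window function is applied, and a tiny frame size stands in for the 2048-sample frames so that the direct O(N^2) sum stays fast.

```python
import math


def mdct(block):
    """Direct O(N^2) MDCT: 2N time samples in, N coefficients out (no window applied)."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]


previous_frame = [0.0] * 8                                         # past frame
current_frame = [math.sin(2 * math.pi * t / 8) for t in range(8)]  # new frame
coefficients = mdct(previous_frame + current_frame)
print(len(coefficients))   # one coefficient per newly input sample
```

The past frame and the new frame together form the 2N-sample input, and N coefficients come out, which matches the description of 2048 + 2048 samples producing 2048 coefficients.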

In general, in the encoding process, the frequency spectrum information obtained as described above is expressed either completely reversibly or with an irreversible code, such as a Huffman code, that corresponds to information compression, and an encoded sequence is generated. Here, of the 2048 MDCT coefficients arranged in order of frequency from low to high, the low-frequency half, namely the 0th to 1023rd MDCT coefficients, is input to the quantization unit 203. The quantization unit 203 quantizes the input MDCT coefficients using a quantization method such as that of AAC and generates a low-frequency audio encoded sequence. Quantization methods such as that of AAC generally do not prescribe the number of MDCT coefficients to be quantized. The quantization unit 203 may therefore quantize all of the input low-frequency MDCT coefficients (1024 coefficients) or only some of them.

Here, the quantization unit 203 quantizes and encodes a total of maxline MDCT coefficients, from the 0th to the (maxline-1)th. Note that maxline is the upper limit frequency of the MDCT coefficients quantized and encoded by a conventional encoding device. Meanwhile, all of the MDCT coefficients (2048 coefficients) output from the MDCT unit 202 are input to the BWE encoding unit 204.

The generation of the extended audio encoded sequence in the BWE encoding unit 204 shown in FIG. 1 is described in more detail below with reference to FIGS. 2(a) to 2(c). FIG. 2(a) shows the MDCT coefficient sequence output by the MDCT unit 202. FIG. 2(b) shows, among the MDCT coefficients of FIG. 2(a), the 0th to (maxline-1)th MDCT coefficients encoded by the quantization unit 203. FIG. 2(c) shows an example of how the BWE encoding unit 204 shown in FIG. 1 generates the extended audio encoded sequence.

In FIGS. 2(a) to 2(c), the horizontal axis represents frequency, and the MDCT coefficients are numbered from 0 to 2047 in order from low to high frequency. The vertical axis represents the value of the MDCT coefficients. The frequency spectrum is drawn in these figures as a waveform that is continuous in the frequency direction, but it is in fact not continuous. As shown in FIG. 2(a), the 2048 MDCT coefficients output from the MDCT unit 202 can represent the original sound, sampled over a fixed period, over a bandwidth of at most half the sampling frequency.
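To make the relationship between coefficient numbers and frequency concrete, the following Python sketch maps an MDCT coefficient index to an approximate center frequency, assuming the 44.1 kHz sampling rate and 2048 coefficients used in the example. The helper name and the half-bin offset (the usual convention for MDCT bins) are assumptions rather than something stated in the text.

```python
def bin_center_hz(k, sample_rate=44100.0, num_coeffs=2048):
    """Approximate center frequency of MDCT coefficient k.

    Assumes the coefficients evenly cover 0 Hz up to half the sampling rate.
    """
    return (k + 0.5) * sample_rate / (2 * num_coeffs)


print(bin_center_hz(0))      # lowest coefficient
print(bin_center_hz(1023))   # last coefficient of the low-frequency half
print(bin_center_hz(2047))   # highest coefficient, just below 22.05 kHz
```

Under these assumptions each coefficient spans roughly 10.8 Hz, so a value of maxline corresponds directly to an upper frequency limit for the conventionally coded part.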

In conventional encoding devices, it is often the case that only the perceptually important MDCT coefficients of FIG. 2(a), for example the low-frequency MDCT coefficients up to maxline, are quantized, encoded, and transmitted to the decoding device. For the high-frequency part at and above maxline, the BWE encoding unit 204 therefore generates, instead of the MDCT coefficients themselves shown in FIG. 2(a), extended frequency spectrum information that represents the high-frequency MDCT coefficients in their place. That is, because the 0th to (maxline-1)th MDCT coefficients are already encoded by the quantization unit 203, the BWE encoding unit 204 aims to encode the (maxline)th to (targetline-1)th MDCT coefficients, as shown in FIG. 2(c).

First, the BWE encoding unit 204 assumes the range of the high-frequency part that the decoding device should reproduce as an audio signal (specifically, the frequency range from maxline to targetline) and divides this range into subbands of a fixed frequency width. The BWE encoding unit 204 then divides part or all of the low-frequency part of the input MDCT coefficients, consisting of the 0th to (maxline-1)th coefficients, into equally spaced subbands of the same frequency width as the high-frequency subbands, and, for each subband in the high-frequency part consisting of the (maxline)th to 2047th MDCT coefficients, identifies a low-frequency subband that can substitute for it. As the substitute for a given high-frequency subband, for example, the low-frequency subband that minimizes the energy difference from the high-frequency subband is identified. Alternatively, the low-frequency subband may be identified as the one in which the position on the frequency axis of the MDCT coefficient with the largest absolute value, within the subband, is closest to that of the high-frequency subband.
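The first selection criterion above, minimum energy difference, can be sketched in Python as follows. The function names and toy coefficient values are hypothetical, and energy is taken here as the sum of squared coefficients, which is an assumption since the text does not define it.

```python
def energy(coeffs):
    """Sum of squared coefficients (assumed definition of subband energy)."""
    return sum(c * c for c in coeffs)


def choose_low_subband(high_subband, low_subbands):
    """Return the index of the low subband whose energy is closest to the high subband's."""
    target = energy(high_subband)
    return min(range(len(low_subbands)),
               key=lambda i: abs(energy(low_subbands[i]) - target))


low_subbands = [[8.0, 6.0], [3.0, 1.0], [1.0, 0.5], [0.2, 0.1]]   # A, B, C, D (toy values)
high = [0.9, 0.6]                                                  # one high-frequency subband
print("ABCD"[choose_low_subband(high, low_subbands)])              # prints C
```

The alternative criterion, matching the position of the largest-magnitude coefficient within the subband, could be implemented the same way by swapping the key function.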

For the BWE encoding unit 204 of FIG. 2(c), the MDCT coefficient numbers startline, targetline, endline, and sbw are assumed to satisfy the relationship of (Formula 2).

Here, shiftlen may be a preset value, or it may be calculated according to changes in the input MDCT coefficients, in which case the BWE encoding unit 204 may encode information indicating its value.

FIG. 2(c) shows an example in which the high-frequency part is divided into eight subband MDCT coefficient sequences h0 to h7, each with a frequency width of sbw MDCT coefficient samples. In the low-frequency part, four subbands of sbw samples each can then be formed between startline and endline; these are labeled A, B, C, and D. For convenience, the range from startline to endline is divided into four subbands and the range from maxline to targetline into eight subbands here, but the numbers of subbands and the number of samples per subband are not limited to these values. The BWE encoding unit 204 identifies the low-frequency subband A, B, C, or D of frequency width sbw that substitutes for the MDCT coefficient sequence of each high-frequency subband h0 to h7 of the same frequency width sbw, and generates and encodes extended frequency spectrum information indicating the identified low-frequency substitute subband. Here, substitution means copying part of the obtained MDCT coefficients, in this case the MDCT coefficients of the low-frequency subbands A to D, as the MDCT coefficients of the high-frequency subbands h0 to h7. Substitution is also taken to include controlling the gain of the substituted MDCT coefficients.
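The subband layout of this example can be sketched in Python as below. Because (Formula 2) is not reproduced in this text, the relation `endline = maxline - shiftlen` used here is a guess made only for illustration; the widths `targetline - maxline = 8 * sbw` and `endline - startline = 4 * sbw` follow from the eight and four subbands of the example. The function name and the numeric values are hypothetical.

```python
def subband_layout(maxline, sbw, shiftlen, num_high=8, num_low=4):
    """Return (low_ranges, high_ranges) as lists of (start, end) pairs, end exclusive.

    Assumes endline = maxline - shiftlen, which is a guess: the patent's
    (Formula 2) is not available in this text.
    """
    endline = maxline - shiftlen
    startline = endline - num_low * sbw
    if startline < 0:
        raise ValueError("low band too narrow for the requested subbands")
    low = [(startline + i * sbw, startline + (i + 1) * sbw) for i in range(num_low)]
    high = [(maxline + i * sbw, maxline + (i + 1) * sbw) for i in range(num_high)]
    return low, high


low, high = subband_layout(maxline=512, sbw=64, shiftlen=0)
print(low[0], low[-1])     # subbands A and D
print(high[0], high[-1])   # subbands h0 and h7
```

With these toy values, A to D cover coefficients 256 to 511 and h0 to h7 cover 512 to 1023, so targetline would be 1024.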

In the case of the BWE encoding unit 204 described above, the amount of information needed to express which low-frequency subband substitutes for a high-frequency subband is at most 2 bits for each of the high-frequency subbands h0 to h7. This is because, for each high-frequency subband, it is sufficient to identify one of the four low-frequency subbands A to D. In this way, the BWE encoding unit 204 encodes extended frequency spectrum information indicating which of the low-frequency subbands A to D substitutes for each of the high-frequency subbands h0 to h7, and generates the extended audio encoded sequence from the resulting code string.

The BWE encoding unit 204 also adjusts the amplitude of the generated extended audio encoded sequence. FIG. 3(a) is a waveform diagram showing the MDCT coefficient sequence of the original sound. FIG. 3(b) is a waveform diagram showing the MDCT coefficient sequence generated by substitution in the BWE encoding unit 204. FIG. 3(c) is a waveform diagram showing the MDCT coefficient sequence obtained by applying gain control to the MDCT coefficient sequence of FIG. 3(b). As shown in FIG. 3(a), the BWE encoding unit 204 divides the high-frequency MDCT coefficients from maxline to targetline into a plurality of bands and encodes gain information for each band. The division of the range from maxline to targetline used for encoding gain information may be the same as the division into the high-frequency subbands h0 to h7 shown in FIG. 2, or it may be a different division. The case of the same division as in FIG. 2 is described here with reference to FIG. 3.

As shown in FIG. 3(a), let the MDCT coefficients of the original sound contained in the high-frequency subband h0 be x(0), x(1), ..., x(sbw-1), and let the substituted MDCT coefficients in the high-frequency subband h0 of FIG. 3(b) be r(0), r(1), ..., r(sbw-1). Further, let the MDCT coefficients of subband h0 in FIG. 3(c) be y(0), y(1), ..., y(sbw-1). A gain g0 that satisfies (Formula 3) among the arrays x, r, and y is then obtained and encoded.
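One plausible way to compute such a gain is sketched below in Python. Since (Formula 3) is not reproduced in this text, the sketch assumes that y(i) = g0 * r(i) and that g0 is chosen so that the energy of y matches the energy of the original coefficients x. Both assumptions, and the function name, are illustrative and may differ from the patent's actual formula.

```python
import math


def subband_gain(original, substitute):
    """Gain that makes the scaled substitute match the original's energy.

    Assumed rule (not the patent's Formula 3): g = sqrt(sum(x^2) / sum(r^2)).
    """
    num = sum(x * x for x in original)
    den = sum(r * r for r in substitute)
    return math.sqrt(num / den) if den > 0.0 else 0.0


x = [0.3, -0.4]              # original high-frequency coefficients of h0 (toy values)
r = [3.0, 4.0]               # coefficients copied from the chosen low subband
g0 = subband_gain(x, r)
y = [g0 * ri for ri in r]    # gain-controlled substitute, as in FIG. 3(c)
print(g0, y)
```

Only g0 needs to be transmitted for the subband, because the decoder already has r from the decoded low-frequency spectrum.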

Gain information is calculated and encoded in the same way for the high-frequency subbands h1 to h7. The gain information g0 to g7 is also encoded into the extended audio encoded sequence with a predetermined number of bits.

The extended audio encoded sequence encoded in this way is written into the audio encoded bitstream output by the encoding device 200, as shown schematically in FIG. 4. FIG. 4(a) shows an example of an ordinary audio encoded bitstream. FIG. 4(b) shows an example of the audio encoded bitstream output by the encoding device 200 of this embodiment. FIG. 4(c) shows an example of the extended audio encoded sequence written in the extended audio encoded sequence section shown in FIG. 4(b). When the audio encoded bitstream is formed frame by frame, as in stream 1 of FIG. 4(a), the encoding device 200 uses part of each frame (for example, the hatched portion in the figure) as the extended audio encoded sequence section, as in stream 2 of FIG. 4(b).

This extended audio encoded sequence section is, for example, the data_stream_element area described in MPEG-2 AAC and MPEG-4 AAC. The data_stream_element is a reserved area for writing extension data when the functions of the conventional encoding method are extended; whatever data is recorded in this area, a conventional decoding device does not recognize it as an audio encoded sequence. Another example is an area filled with meaningless data such as "0" to equalize the data length of the audio encoded sequence, such as the Fill Element of MPEG-2 AAC and MPEG-4 AAC. If the extended audio encoded sequence is written in such an area of the audio encoded bitstream, then even when the audio encoded bitstream of the present invention is decoded by a conventional decoding device, an audio signal with the same bandwidth as before can be reproduced without the noise that would result from reproducing the extended audio encoded sequence as an audio signal.

As shown in FIG. 4(c), the extended audio encoded sequence contains an item indicating whether to use low-frequency subbands A to D divided in the same way as in the extended audio encoded sequence of the immediately preceding frame, and items representing the MDCT coefficients of the high-frequency subbands h0 to h7. Each item representing the MDCT coefficients of a high-frequency subband h0 to h7 contains data indicating the specified low-frequency subband among A to D, together with its gain information. The item indicating whether to use the same low-frequency subbands A to D as the preceding frame holds a 1-bit value: "1" when one of the low-frequency subbands A to D delimited in the same way as in the preceding frame is used to substitute for the MDCT coefficients of the high-frequency subbands h0 to h7, and "0" otherwise, that is, when one of the low-frequency subbands A to D newly divided by a method different from that of the preceding frame is used for the substitution.

The item indicating the specified low-frequency subband holds 2-bit data identifying one of the four low-frequency subbands A to D. The gain information is written in, for example, 4 bits. In this way, when the MDCT coefficients of the high-frequency subbands h0 to h7 are substituted using low-frequency subbands A to D delimited in the same way as in the preceding frame, the high-frequency MDCT coefficients of one frame can be represented by an extended audio encoded sequence of 1 + 8 * (2 + 4) = 49 bits. Furthermore, in a frame that uses the same low-frequency subbands A to D as the preceding frame, the extended audio encoded sequence can be represented, for example, by just the single bit of value "1" that indicates this.
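The 49-bit layout described above can be illustrated with a minimal Python sketch. The field order (flag first, then for each of h0 to h7 a 2-bit subband index followed by a 4-bit gain code), the use of a text bit string, and the names `pack_extension` and `unpack_extension` are assumptions made for illustration; the text does not fix them, nor does it say how a new division is signalled when the flag is "0".

```python
def pack_extension(reuse_division, entries):
    """Pack one frame's extension data into a string of '0'/'1' characters.

    reuse_division: True if the low subbands are divided as in the previous frame.
    entries: eight (subband_index, gain_code) pairs, one per high subband h0..h7;
             subband_index is 0..3 (A..D, 2 bits), gain_code is 0..15 (4 bits).
    The field order is an assumption, not taken from the source.
    """
    if len(entries) != 8:
        raise ValueError("expected one entry per high subband h0..h7")
    bits = "1" if reuse_division else "0"
    for subband_index, gain_code in entries:
        if not 0 <= subband_index < 4 or not 0 <= gain_code < 16:
            raise ValueError("field out of range")
        bits += format(subband_index, "02b") + format(gain_code, "04b")
    return bits


def unpack_extension(bits):
    """Inverse of pack_extension: recover the flag and the eight entries."""
    reuse_division = bits[0] == "1"
    entries = []
    for k in range(8):
        field = bits[1 + 6 * k : 7 + 6 * k]  # 2-bit index + 4-bit gain code
        entries.append((int(field[:2], 2), int(field[2:], 2)))
    return reuse_division, entries
```

Packing eight entries this way always yields 1 + 8 * 6 = 49 characters, matching the bit count given in the text.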

When the audio signal encoding method of the encoding device 200 of the present invention is applied to a conventional coding scheme in this way, the high-frequency part is represented by an extended audio encoded sequence with a small amount of data, and wideband audio playback that is rich in high frequencies can be obtained.

<Decoding device>
In the decoding process, on the other hand, the input audio encoded sequence is decoded to obtain frequency spectrum information, and the frequency spectrum is transformed from the frequency domain to the time domain to reproduce an audio signal on the time axis.

FIG. 5 is a block diagram showing the configuration of the decoding device 600, which decodes the acoustic encoded bitstream output from the encoding device 200 of FIG. 1. The decoding device 600 decodes an acoustic encoded bitstream containing an extended audio encoded sequence and outputs wideband frequency spectrum information. It includes an encoded sequence separation unit 601, an inverse quantization unit 602, an IMDCT (Inverse Modified Discrete Cosine Transform) unit 603, a noise generation unit 604, a BWE decoding unit 605, and an extended IMDCT unit 606.

The encoded sequence separation unit 601 separates the input acoustic encoded bitstream into the audio encoded sequence representing the low-frequency part and the extended audio encoded sequence representing the high-frequency part. It outputs the separated audio encoded sequence to the inverse quantization unit 602 and the separated extended audio encoded sequence to the BWE decoding unit 605. The inverse quantization unit 602 inverse-quantizes the audio encoded sequence separated from the acoustic encoded bitstream and outputs the low-frequency MDCT coefficients.

Note that the inverse quantization unit 602 may take both the audio encoded sequence and the extended audio encoded sequence as input. If the AAC scheme was used as the quantization method in the quantization unit 203, the inverse quantization unit 602 restores the MDCT coefficients using AAC inverse quantization. The inverse quantization unit 602 thereby restores and outputs the low-frequency MDCT coefficients from the 0th to the (maxline-1)th.

The IMDCT unit 603 applies a frequency-to-time transform, using the IMDCT, to the low-frequency MDCT coefficients output from the inverse quantization unit 602 and outputs a low-frequency audio signal on the time axis. That is, when the output of the inverse quantization unit 602 is fed to the IMDCT unit 603, an audio output of 1024 samples per frame is obtained. Here the IMDCT unit 603 performs a 1024-sample IMDCT. A general expression for the IMDCT is given by, for example, (Equation 4).

Meanwhile, the extended audio encoded sequence separated from the acoustic encoded bitstream by the encoded sequence separation unit 601 is output to the BWE decoding unit 605. The 0th to (maxline-1)th low-frequency MDCT coefficients output by the inverse quantization unit 602 and the output of the noise generation unit 604 are also input to the BWE decoding unit 605. The operation of the BWE decoding unit 605 is described in detail later. In outline, based on the extended frequency spectrum information obtained by decoding the separated extended audio encoded sequence, it decodes and inverse-quantizes the high-frequency MDCT coefficients corresponding to the (maxline)th to the 2047th, adds them to the 0th to (maxline-1)th low-frequency MDCT coefficients obtained by the inverse quantization unit 602, and outputs wideband MDCT coefficients corresponding to the 0th to the 2047th. The extended IMDCT unit 606 performs an IMDCT with twice as many samples as the IMDCT unit 603, thereby obtaining a wideband output audio signal of 2048 samples per frame.

The operation of the BWE decoding unit 605 is described in more detail below. The BWE decoding unit 605 restores the MDCT coefficients from the maxline-th to the targetline-th using the 0th to (maxline-1)th MDCT coefficients obtained by the inverse quantization unit 602 and the extended audio encoded sequence. The values of startline, endline, maxline, targetline, sbw, and shiftlen are all the same as those used in the BWE encoding unit 204 of the encoding device 200. As shown in FIG. 4(c), the extended audio encoded sequence encodes information indicating which of the low-frequency subbands A to D substitutes for the MDCT coefficients of each high-frequency subband h0 to h7. Based on that information, the MDCT coefficients of each high-frequency subband h0 to h7 are substituted with the MDCT coefficients of the designated low-frequency subband among A to D.

As a result, the BWE decoding unit 605 obtains the 0th to (targetline)th MDCT coefficients. The BWE decoding unit 605 then performs gain control based on the gain information in the extended audio encoded sequence. As shown in FIG. 3(b), it generates, in each high-frequency subband h0 to h7 between maxline and targetline, the MDCT coefficient sequence substituted from the low-frequency subbands A to D. When the substituted MDCT coefficients in high-frequency subband h0 are r(0), r(1), ..., r(sbw-1) and the gain information obtained from the extended audio encoded sequence for h0 is g0, the gain-controlled MDCT coefficient sequence shown in FIG. 3(c) is obtained from the relation given by (Equation 5). That is, writing the MDCT coefficients of high-frequency subband h0 as y(0), y(1), ..., y(sbw-1), the value of the i-th gain-controlled MDCT coefficient y(i) is given by (Equation 5).
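The formula for (Equation 5) does not appear in this text. Because the description states that gain control multiplies the substituted coefficient sequence by the gain information, the relation presumably has the following form. This is a reconstruction from context, not the original equation.

```latex
\[
y(i) = g_0 \, r(i), \qquad i = 0, 1, \ldots, \mathrm{sbw} - 1
\]
```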

Similarly, for the high-frequency subbands h1 to h7, gain-controlled MDCT coefficient sequences are obtained by multiplying the substituted MDCT coefficient sequence by the gain information g1 to g7 for the respective high-frequency subband. In addition, the noise generation unit 604 generates white noise, pink noise, or noise formed by randomly combining all or part of the low-frequency MDCT coefficient sequence, and adds the generated noise to the gain-controlled MDCT coefficient sequence. In doing so, the energy of the spectrum synthesized from the added noise and the spectrum copied from the low band may be corrected to match the energy of the spectrum expressed by (Equation 5).
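The decoding steps described above (copy the designated low-frequency subband, multiply by the gain, optionally add noise and correct the energy) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the gain is taken to be a linear factor, the noise is uniform white noise, and `decode_high_subband` and its `noise_level` parameter are hypothetical names.

```python
import math
import random


def decode_high_subband(low_subbands, index, gain, noise_level=0.0, rng=None):
    """Rebuild one high subband from a copied low subband.

    low_subbands: list of coefficient lists (A, B, C, D), each of length sbw.
    index: which low subband the extended sequence designates.
    gain: linear gain g_k for this high subband (assumed to be already decoded).
    noise_level: amplitude of the added white noise (hypothetical parameter).
    """
    rng = rng or random.Random(0)
    scaled = [gain * r for r in low_subbands[index]]  # gain control
    if noise_level == 0.0:
        return scaled
    mixed = [y + noise_level * rng.uniform(-1.0, 1.0) for y in scaled]
    # Optional correction: rescale so the energy equals that of the
    # gain-controlled copy before noise was added.
    target = sum(y * y for y in scaled)
    actual = sum(y * y for y in mixed)
    if actual > 0.0:
        scale = math.sqrt(target / actual)
        mixed = [scale * y for y in mixed]
    return mixed
```

Calling the function once per high-frequency subband h0 to h7, with that subband's index and gain, yields the coefficients between maxline and targetline.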

Embodiment 1 describes encoding gain information that multiplies the substituted MDCT coefficients, as in (Equation 5). However, instead of a relative gain, the gain information may be encoded and decoded as an absolute value such as the energy or average amplitude of the MDCT coefficients.

With the BWE decoding unit 605 configured in this way, wideband audio playback that is rich in high frequencies can be obtained even from an extended audio encoded sequence represented with a small amount of data, as shown in FIG. 4(c).

Although the encoding device 200 and the decoding device 600 have been described above as conforming to the AAC scheme, the encoding device and decoding device of the present invention are not limited to this and may use other coding schemes.

In the encoding device 200, the MDCT unit 202 is described as outputting the 0th to 2047th MDCT coefficient sequence to the BWE encoding unit 204. In addition, the BWE encoding unit 204 may also take as input MDCT coefficients containing quantization distortion, obtained by inverse-quantizing the MDCT coefficients quantized by the quantization unit 203. Alternatively, the BWE encoding unit 204 may take as input, for the low-frequency part from the 0th to the (maxline-1)th, the MDCT coefficient sequence obtained by inverse-quantizing the output of the quantization unit 203, and, for the high-frequency part from the (maxline)th to the (targetline-1)th, the output of the MDCT unit 202.

Embodiment 1 above describes quantizing and encoding the extended frequency spectrum information as the case requires. Needless to say, the information to be encoded (the extended frequency spectrum information) may instead be represented using variable-length coding such as Huffman coding and used as the extended audio encoded sequence. In that case, the decoding device may decode the variable-length code, such as the Huffman code, instead of inverse-quantizing the extended audio encoded sequence.

Embodiment 1 describes applying the encoding method and decoding method of the present invention to MPEG-2 AAC and MPEG-4 AAC, but they are not limited to these and may be applied to other coding schemes such as MPEG-1 Audio and MPEG-2 Audio. When used with MPEG-1 Audio or MPEG-2 Audio, the extended audio encoded sequence is written into the ancillary_data described in the standard.

Embodiment 1 above describes substituting the frequency spectrum of low-frequency subbands for high-frequency subbands within the range of the frequency spectrum (MDCT coefficients) obtained by the time-to-frequency transform of the input audio signal. The present invention is not limited to this, and the substitution may extend into the region above the upper frequency limit of the spectrum output by the time-to-frequency transform. In that case, however, the low-frequency subband used for substitution cannot be chosen on the basis of a high-frequency spectrum (MDCT coefficients) representing the original sound.

(Embodiment 2)
Embodiment 2 of the present invention differs from Embodiment 1 as follows. The BWE encoding unit 204 of Embodiment 1 divides the low-frequency MDCT coefficient sequence from startline to endline into four subbands A to D, whereas the BWE encoding unit of Embodiment 2 divides the same band from startline to endline into seven subbands A to G, allowing them to overlap.

The encoding device and decoding device of Embodiment 2 have basically the same configuration as the encoding device 200 and decoding device 600 of Embodiment 1. They differ only in the processing of the BWE encoding unit 701 in the encoding device and of the BWE decoding unit 702 in the decoding device. Accordingly, in Embodiment 2 only the BWE encoding unit 701 and the BWE decoding unit 702 are given new reference numerals; the components of the encoding device 200 and decoding device 600 already described in Embodiment 1 keep the same reference numerals and their description is omitted. Likewise, the embodiments that follow describe only the points that differ from what has already been described.

The BWE encoding unit 701 of Embodiment 2 is described below with reference to FIG. 6. FIG. 6 shows how the BWE encoding unit 701 of Embodiment 2 generates the extended frequency spectrum information. In the figure, the low-frequency subbands E, F, and G are obtained by taking the low-frequency subbands A, B, and C, out of the subbands A, B, C, and D divided as in Embodiment 1, and shifting them toward the high-frequency side by sbw/2.

Here the low-frequency subbands A, B, and C are each shifted toward the high-frequency side by sbw/2, but the method of dividing the band with overlap, the frequency width of the shift, and the number of divisions are not limited to this. For each of the high-frequency subbands h0 to h7, the BWE encoding unit 701 generates and encodes information identifying which one of the seven low-frequency subbands A to G substitutes for that subband's MDCT coefficient sequence, and outputs the result as the extended audio encoded sequence.
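The overlapping division can be sketched as follows. The sketch assumes that the four base subbands A to D are contiguous, each sbw coefficients wide, starting at startline (so that endline = startline + 4 * sbw), and that sbw is even; the function name `overlapping_subbands` is hypothetical.

```python
def overlapping_subbands(coeffs, startline, sbw, count=4):
    """Return subbands A..D followed by the half-shifted subbands E..G.

    coeffs: full MDCT coefficient list for the frame.
    startline: index of the first coefficient of subband A.
    sbw: subband width in coefficients (assumed even).
    """
    base = [
        coeffs[startline + k * sbw : startline + (k + 1) * sbw]
        for k in range(count)
    ]
    half = sbw // 2
    # E, F, G are A, B, C shifted toward higher frequencies by sbw/2.
    shifted = [
        coeffs[startline + k * sbw + half : startline + (k + 1) * sbw + half]
        for k in range(count - 1)
    ]
    return base + shifted
```

The returned list has seven entries in the order A, B, C, D, E, F, G, so an index from 0 to 6 selects a candidate directly.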

The decoding device of Embodiment 2, for its part, takes as input the extended audio encoded sequence encoded by the encoding device of Embodiment 2 (the encoding device 200 with the BWE encoding unit 701 in place of the BWE encoding unit 204). It decodes the information identifying which of the low-frequency subbands A to G supplies the MDCT coefficients substituted for each high-frequency subband h0 to h7, and substitutes the MDCT coefficients of the high-frequency subbands h0 to h7 with the MDCT coefficients of the identified low-frequency subbands among A to G.

When the information identifying one of the low-frequency subbands A to G is represented by, for example, 3-bit code information, and the integer values "0" to "6" represent the low-frequency subbands A to G respectively, code information with the value "7" may be used to instruct the decoding device not to substitute using any of A to G. Although 3-bit code information and the value "7" are used here as an example, the number of bits and the value of the code information may be different.
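A minimal sketch of this 3-bit mapping, assuming the candidates are held in a list ordered A to G, is shown below. The name `select_low_subband` is hypothetical, and the text does not say what the decoder writes into the high-frequency subband when no substitution is made, so the sketch only returns `None` in that case.

```python
NO_SUBSTITUTION = 7  # code value that selects none of A..G (example value from the text)


def select_low_subband(code, low_subbands):
    """Map a 3-bit code to a low subband, or None when no substitution is made.

    code: integer 0..7 read from the extended audio encoded sequence.
    low_subbands: list of seven coefficient lists ordered A..G.
    """
    if not 0 <= code <= 7:
        raise ValueError("code must fit in 3 bits")
    if code == NO_SUBSTITUTION:
        return None
    return low_subbands[code]
```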

The gain control and/or noise superposition used in Embodiment 1 are used in the same way in Embodiment 2 of the present invention. With an encoding device and decoding device built in this way, wideband playback can be obtained from an extended audio encoded sequence that carries only a small amount of information.

(Embodiment 3)
Embodiment 3 differs from Embodiment 2 as follows. The BWE encoding unit 701 of Embodiment 2 divides the low-frequency MDCT coefficients from startline to endline into seven low-frequency subbands A to G, allowing overlap along the frequency axis. The BWE encoding unit of Embodiment 3 divides the band from startline to endline into the seven subbands A to G and additionally defines, for each low-frequency subband, a version with the order of its MDCT coefficients reversed and a version with the signs of its MDCT coefficients inverted.

In Embodiment 3, as in Embodiment 2, the only structural differences from the encoding device 200 and decoding device 600 of Embodiment 1 are the BWE encoding unit 801 in the encoding device and the BWE decoding unit 802 in the decoding device. The BWE encoding unit of Embodiment 3 is described below with reference to FIG. 7.

FIG. 7 shows how the BWE encoding unit 801 of Embodiment 3 generates the extended frequency spectrum information. FIG. 7(a) shows the low-frequency and high-frequency subbands, divided in the same way as in Embodiment 2. FIG. 7(b) shows an example of the MDCT coefficient sequence of low-frequency subband A. FIG. 7(c) shows an example of the MDCT coefficient sequence of subband As, obtained by reversing the order of the MDCT coefficient sequence of low-frequency subband A.

FIG. 7(d) shows subband Ar, obtained by inverting the signs of the MDCT coefficient sequence of low-frequency subband A. For example, let the MDCT coefficient sequence of low-frequency subband A be (p0, p1, ..., pN), where p0 is the value of the 0th MDCT coefficient of subband A. The MDCT coefficient sequence of subband As, obtained by reversing the order of subband A's MDCT coefficients along the frequency axis, is (pN, p(N-1), ..., p0). The MDCT coefficient sequence of subband Ar, obtained by inverting the signs of subband A's MDCT coefficients, is (-p0, -p1, ..., -pN). The same is done for subbands B to G, defining the order-reversed subbands Bs to Gs and the sign-inverted subbands Br to Gr.

In this way, for each of the high-frequency subbands h0 to h7, the BWE encoding unit 801 of Embodiment 3 identifies the one subband that substitutes for that subband's MDCT coefficients. The candidates are the seven low-frequency subbands A to G, the seven subbands As to Gs obtained by reversing the order of their MDCT coefficient sequences, and the seven subbands Ar to Gr obtained by inverting their signs.

The BWE encoding unit 801 encodes the information needed to represent the high-frequency MDCT coefficient sequence using the identified low-frequency subbands, and generates an extended audio encoded sequence like the one shown in FIG. 4(c). In this case, the extended frequency spectrum information encoded for each high-frequency subband consists of information identifying the low-frequency subband that substitutes for the MDCT coefficients of that high-frequency subband, information indicating whether the order of the identified low-frequency subband's MDCT coefficients is reversed, and information indicating whether the signs of the identified low-frequency subband's MDCT coefficients are inverted.

The decoding device of Embodiment 3, for its part, takes as input the extended audio encoded sequence encoded by the encoding device of Embodiment 3 as described above. It decodes the extended frequency spectrum information indicating which of the low-frequency subbands A to G supplies the MDCT coefficients substituted for each high-frequency subband h0 to h7, whether the order of the MDCT coefficients is reversed, and whether their signs are inverted. Then, according to the decoded extended frequency spectrum information, it generates the MDCT coefficients of the high-frequency subbands h0 to h7 from the MDCT coefficients of the identified low-frequency subbands among A to G, reversing their order or inverting their signs as indicated.
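The decoder-side transformation can be sketched as below, with the two decoded indications modelled as independent boolean flags. The name `apply_variant` and the flag names are hypothetical. Note that two independent flags also permit the combination of both operations, whereas the candidates enumerated for the encoder are only the plain, order-reversed, and sign-inverted subbands.

```python
def apply_variant(subband, reverse_order=False, invert_sign=False):
    """Return the coefficients used to substitute a high subband.

    subband: MDCT coefficients (p0, p1, ..., pN) of the identified low subband.
    reverse_order: True selects the order-reversed variant (e.g. As).
    invert_sign: True selects the sign-inverted variant (e.g. Ar).
    """
    out = list(subband)  # copy so the low-band coefficients are left untouched
    if reverse_order:
        out.reverse()
    if invert_sign:
        out = [-p for p in out]
    return out
```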

Furthermore, Embodiment 3 is not limited to changing the order and signs of the low-frequency subband's MDCT coefficients; it also covers substitution by a filtered version of the low-frequency subband's MDCT coefficients. The filtering is, for example, IIR or FIR filtering; these techniques are well known to those skilled in the art, so their description is omitted. When such filtering is used, the encoding device encodes the filter coefficients in the extended audio encoded sequence. The decoding device then applies the IIR or FIR filter given by the decoded filter coefficients to the MDCT coefficients of the identified low-frequency subband, and substitutes the filtered MDCT coefficients for the high-frequency subband.
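As one possible reading of the FIR case, the sketch below runs a causal FIR filter along the coefficient index of the identified low-frequency subband. The text does not specify the filter structure, the direction of filtering, or the boundary handling, so all three are assumptions here (coefficients before the start of the subband are treated as zero), and `fir_filter` is a hypothetical name.

```python
def fir_filter(coeffs, taps):
    """Causal FIR along the coefficient index: y[i] = sum_j taps[j] * coeffs[i - j].

    coeffs: MDCT coefficients of the identified low subband.
    taps: decoded filter coefficients (assumed to be plain FIR taps).
    Terms with i - j < 0 are skipped, i.e. treated as zero input.
    """
    out = []
    for i in range(len(coeffs)):
        acc = 0.0
        for j, tap in enumerate(taps):
            if i - j >= 0:
                acc += tap * coeffs[i - j]
        out.append(acc)
    return out
```

With a single tap of 1.0 the filter leaves the subband unchanged, which corresponds to the plain substitution of Embodiment 2.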

The gain control used in Embodiment 1 can be used in the same way in Embodiment 3. With an encoding device and decoding device configured as described above, wideband playback can be obtained from an extended audio encoded sequence that carries only a small amount of information.

(Embodiment 4)
Embodiment 4 differs from Embodiment 3 in that the decoding device substitutes the MDCT coefficients of the high-frequency subbands h0 to h7 not with the MDCT coefficients of the identified low-frequency subband among A to G alone, but with those coefficients combined with MDCT coefficients generated by the noise generation unit. Accordingly, the decoding device of Embodiment 4 differs from the decoding device 600 of Embodiment 1 only in the configuration of the noise generation unit 901 and the BWE decoding unit 902.

The decoding of the extended audio encoded sequence in the decoding device of Embodiment 4 is described below with reference to FIG. 8, taking as an example the case where the high-frequency subband h0 being BWE-decoded is substituted using low-frequency subband A. FIG. 8(a) shows an example of the MDCT coefficients of low-frequency subband A identified for high-frequency subband h0. FIG. 8(b) shows an example of MDCT coefficients generated by the noise generation unit 901, equal in number to those of low-frequency subband A. FIG. 8(c) shows an example of the MDCT coefficients that substitute for high-frequency subband h0, generated from the MDCT coefficients of low-frequency subband A shown in FIG. 8(a) and the MDCT coefficients from the noise generation unit 901 shown in FIG. 8(b). Here, let the MDCT coefficient sequence of low-frequency subband A be A = (p0, p1, ..., pN).

Suppose also that the noise generation unit 901 yields a noise-signal MDCT coefficient sequence M = (n0, n1, ..., nN) with the same number of coefficients as low-frequency subband A. The BWE decoding unit 902 weights the MDCT coefficient sequence A of low-frequency subband A and the noise-signal MDCT coefficient sequence M using weighting coefficients α and β, and generates the substitute MDCT coefficient sequence A' that takes the place of the MDCT coefficients of high-frequency subband h0. The substitute coefficient sequence A' is given by (Equation 6).
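The formula for (Equation 6) does not appear in this text. Given that A and M are adjusted with the weighting coefficients α and β, and that the text later allows values such that α + β = 1, the relation is presumably a weighted sum of the following form. This is a reconstruction from context, not the original equation.

```latex
\[
A' = \alpha A + \beta M
   = (\alpha p_0 + \beta n_0,\; \alpha p_1 + \beta n_1,\; \ldots,\; \alpha p_N + \beta n_N)
\]
```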

The weighting coefficients α and β may be values preset in the decoding device of Embodiment 4. Alternatively, the encoding device may encode control information indicating the values of α and β into the extended audio encoded sequence, and the decoding device may use the values restored from it.

Although the subband h0 output by the BWE decoding unit 902 has been described here as an example, the same processing is performed for the other high-frequency subbands h1 to h7. Likewise, the low-frequency subband A was used as the example of the substituting low-frequency subband, but any other low-frequency subband obtained from the inverse quantization unit may be used, and the processing is the same. The weighting coefficients α and β may take values such that when one is "0" the other is "1", or values such that α + β equals "1".

In this case, for example, when α = 0, the energy ratio between the MDCT coefficient sequence of the high-frequency subband and the MDCT coefficient sequence of the noise information is obtained, and this ratio is encoded into the extended audio encoded sequence as gain information for the noise MDCT coefficient sequence. A value representing the ratio between the weighting coefficients α and β may also be encoded. In addition, when all the MDCT coefficients of a low-frequency subband copied by the BWE decoding unit 902 are "0", control such as setting β to "1" regardless of the value of α may be applied.
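
The weighting and the two special cases described above can be sketched as follows. This is a minimal illustration, not the device's implementation: the function names `substitute_subband` and `energy_ratio` are hypothetical, coefficients are plain Python floats, and the weighted sum assumes α applies to the low-frequency coefficients and β to the noise coefficients.

```python
def substitute_subband(low, noise, alpha, beta):
    """Mix a copied low-band subband with noise: alpha*low + beta*noise."""
    if len(low) != len(noise):
        raise ValueError("low-band and noise sequences must have equal length")
    # If the copied low-band subband is all zeros, use the noise alone
    # (beta forced to 1 regardless of alpha), as the text permits.
    if all(p == 0 for p in low):
        beta = 1.0
    return [alpha * p + beta * n for p, n in zip(low, noise)]


def energy_ratio(high, noise):
    """Energy of the original high-band coefficients divided by noise energy."""
    e_noise = sum(n * n for n in noise)
    if e_noise == 0:
        raise ValueError("noise sequence has zero energy")
    return sum(h * h for h in high) / e_noise
```

On the encoder side, `energy_ratio` would be evaluated against the original high-frequency coefficients when α = 0; how the ratio is quantized is not specified here and is left out of the sketch.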

The noise generation unit 901 may hold a table prepared in advance and output the values in that table as the noise-signal MDCT coefficient sequence, or it may generate, for every frame, a noise-signal MDCT coefficient sequence obtained by applying the MDCT to a time-domain noise signal. Alternatively, it may apply gain control to a time-domain noise signal in the time domain, apply the MDCT to the gain-controlled signal, and output a noise-signal MDCT coefficient sequence using all or part of the resulting MDCT coefficients.
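
The third configuration (time-domain gain control followed by the MDCT) might look like the sketch below. Everything here is assumed for illustration: the helper names, the uniform noise source, the per-sample gain list, and the use of a direct, unwindowed MDCT instead of the windowed fast transform a real codec would use.

```python
import math
import random


def mdct(x):
    """Direct (unwindowed, O(N^2)) MDCT: 2N time samples -> N coefficients."""
    n = len(x) // 2
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]


def noise_mdct(num_coeffs, gains=None, seed=0):
    """Hypothetical helper: noise -> optional time-domain gain -> MDCT."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(2 * num_coeffs)]
    if gains is not None:
        if len(gains) != len(noise):
            raise ValueError("need one gain per time-domain sample")
        # Gain control is applied in the time domain, before the transform.
        noise = [g * s for g, s in zip(gains, noise)]
    return mdct(noise)
```

Because the gain is applied before the transform, the temporal envelope of the noise can be shaped within the frame, which is the property the text relies on.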

In particular, when the MDCT coefficient sequence is obtained by gain-controlling a time-domain noise signal in the time domain and then applying the MDCT, an effect of suppressing pre-echo in the reproduced sound can be expected. In that case, the gain control information for gain-controlling the noise signal in the time domain may be encoded in advance by the encoding device of Embodiment 4 and then decoded and used by the decoding device. With a decoding device configured in this way, even when the MDCT coefficients of the high-frequency subband to be BWE-decoded cannot be represented adequately by the MDCT coefficients of a low-frequency subband, using the MDCT coefficients of the noise signal can be expected to widen the band without excessively raising tonality.

(Embodiment 5)
Embodiment 5 differs from Embodiment 4 in that the function is extended so that a plurality of time frames can be controlled together as one unit. The operation of the BWE encoding unit 1001 and the BWE decoding unit 1002 in the encoding device and decoding device of Embodiment 5 is described with reference to FIGS. 9 and 10.

FIG. 9(a) shows the MDCT coefficients of one frame at time t0. FIG. 9(b) shows the MDCT coefficients of the next frame at time t1. FIG. 9(c) shows the MDCT coefficients of the frame after that at time t2. Times t0, t1, and t2 are consecutive and synchronized with the frames. In Embodiments 1 to 4, an extended audio encoded sequence is generated at each of the times t0, t1, and t2, whereas the encoding device of Embodiment 5 generates an extended audio encoded sequence common to a plurality of consecutive frames. The figure shows the case of three consecutive frames, but any number of consecutive frames may be used.

In FIG. 4(c) of Embodiment 1, the head of the extended audio encoded sequence contains an item indicating whether to use the low-frequency subbands A to D divided in the same way as in the extended audio encoded sequence of the immediately preceding frame. Similarly, the BWE encoding unit 1001 of Embodiment 5 places, at the head of the extended audio encoded sequence of each frame, an item indicating whether to use the same extended audio encoded sequence as the immediately preceding frame. The following describes, as an example, the case where the high-frequency subbands of each of the frames at times t0, t1, and t2 are decoded using the extended audio encoded sequence of the frame at time t0.

The decoding device of Embodiment 5 receives the extended audio encoded sequence generated in common for a plurality of consecutive frames and performs BWE decoding of each frame. For example, if the high-frequency subband h0 in the frame at time t0 is substituted by the low-frequency subband C of that same frame, the BWE decoding unit 1002 also decodes h0 in the frame at time t1 using the low-frequency subband C of the t1 frame, and likewise decodes h0 in the frame at time t2 using the low-frequency subband C of the t2 frame.

The BWE decoding unit 1002 performs the same processing for the other high-frequency subbands h1 to h7. With the encoding device and decoding device configured in this way, the area occupied by the extended audio encoded sequence in the audio encoded bitstream can be kept small overall for the plurality of frames that share the same extended audio encoded sequence, so that more efficient encoding and decoding can be achieved.
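
The frame-sharing scheme can be sketched as below. The data layout is an assumption made for illustration: each frame is a dict from low-frequency subband label to its MDCT coefficients, and the shared extension information is a list naming, for each high-frequency subband h0, h1, ..., the low-frequency subband that substitutes for it, together with one gain per high-frequency subband. The function name is hypothetical.

```python
def bwe_decode_frames(frames_low, shared_map, gains):
    """Apply one shared subband mapping to every frame in a group.

    frames_low: one dict per frame, e.g. {"A": [...], "C": [...]}
    shared_map: shared_map[i] is the low-band label substituting for h_i
    gains:      gains[i] is the gain applied to the copy used for h_i
    """
    decoded = []
    for low in frames_low:
        # The mapping is shared, but each frame copies from its OWN low band.
        high = [[gains[i] * c for c in low[label]]
                for i, label in enumerate(shared_map)]
        decoded.append(high)
    return decoded
```

Only one copy of `shared_map` and `gains` needs to be transmitted for the whole group of frames, which is where the bit saving described above comes from.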

Another example of the encoding device and decoding device of Embodiment 5 is described below with reference to FIG. 10. This example differs from the preceding one in that the BWE encoding unit 1101 encodes, into the extended audio encoded sequence, gain information for applying a different gain in each frame to the high-frequency MDCT coefficient sequences that are decoded using the same extended audio encoded sequence over a plurality of consecutive frames.

Like FIG. 9, FIG. 10 shows the MDCT coefficient sequences of a plurality of consecutive frames at times t0, t1, and t2. This other encoding device of Embodiment 5 encodes into the extended audio encoded sequence the relative values of the gains of the high-frequency MDCT coefficients that are BWE-decoded in the plurality of frames. For example, let the average amplitudes of the MDCT coefficients in the BWE-decoded band (the high-frequency part from maxline to targetline) be G0, G1, and G2 for the frames at times t0, t1, and t2, respectively.

First, a reference frame is chosen from among the frames at times t0, t1, and t2. The reference frame may be fixed in advance, for example as the first frame at time t0. Alternatively, the frame with the largest average amplitude may be chosen as the reference, and information indicating the position of that frame may be encoded separately in the extended audio encoded sequence.

Suppose here, for example, that the average amplitude G0 of the frame at time t0 is the largest average amplitude among the consecutive frames whose high-frequency MDCT coefficient sequences are decoded using the same extended audio encoded sequence. In this case, the high-frequency average amplitude of the frame at time t1 is expressed relative to the reference frame at time t0 as G1 / G0, and that of the frame at time t2 as G2 / G0. The BWE encoding unit 1101 quantizes these relative values of the high-frequency average amplitude, G1 / G0, G2 / G0, and so on, and encodes them into the extended audio encoded sequence.
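
The encoder-side computation of the reference frame and the relative amplitudes could be sketched as follows. The function name is hypothetical, the average amplitude is taken here as the mean absolute value of the high-frequency MDCT coefficients (the text does not define it more precisely), and quantization of the ratios is omitted.

```python
def relative_amplitudes(frames_high):
    """Return (reference index, reference amplitude, per-frame ratios).

    frames_high: one list of high-band MDCT coefficients per frame.
    The reference is the frame with the largest average amplitude.
    """
    avg = [sum(abs(c) for c in f) / len(f) for f in frames_high]
    ref = max(range(len(avg)), key=lambda i: avg[i])
    if avg[ref] == 0:
        # All frames are silent in the high band; ratios are undefined.
        raise ValueError("all frames have zero high-band amplitude")
    return ref, avg[ref], [a / avg[ref] for a in avg]
```

Choosing the largest frame as the reference keeps every ratio in the range 0 to 1, which would simplify quantization; that motivation is an inference, not something the text states.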

In the corresponding decoding device of Embodiment 5, the BWE decoding unit 1102 receives the extended audio encoded sequence, identifies the reference frame from it (or uses the predetermined frame), and decodes the average amplitude value of the reference frame. It then decodes the average amplitude values, relative to the reference frame, of the high-frequency MDCT coefficient sequences to be BWE-decoded, and applies gain control to the high-frequency MDCT coefficient sequence of each frame decoded according to the common extended audio encoded sequence.
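
One way the decoder-side gain control could work is sketched below. The text does not spell out the exact scaling rule, so this assumes each frame's BWE-decoded high band is rescaled until its mean absolute amplitude equals the reference amplitude times the decoded ratio; the function name is hypothetical.

```python
def apply_frame_gains(frames_high, ref_amplitude, ratios):
    """Rescale each frame's decoded high band to its target average amplitude.

    frames_high:   BWE-decoded high-band coefficients, one list per frame
    ref_amplitude: decoded average amplitude of the reference frame
    ratios:        decoded per-frame amplitudes relative to the reference
    """
    out = []
    for coeffs, ratio in zip(frames_high, ratios):
        current = sum(abs(c) for c in coeffs) / len(coeffs)
        target = ref_amplitude * ratio
        # A silent frame cannot be rescaled; leave it silent.
        scale = target / current if current > 0 else 0.0
        out.append([scale * c for c in coeffs])
    return out
```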

In this way, the BWE decoding unit 1102 shown in FIG. 10 can easily correct the average amplitude of the MDCT coefficients for a plurality of frames decoded using a common extended audio encoded sequence. This makes it possible to encode and decode, with a small amount of data, an audio encoded sequence from which a wideband audio signal more faithful to the original sound can be reproduced.

(Embodiment 6)
Embodiment 6 differs from Embodiment 5 in that the encoding device and decoding device of Embodiment 6 use a polyphase QMF (Quadrature Mirror Filter) to convert an audio signal on the time axis into time-frequency signals representing the time variation of the frequency spectrum, and to perform the inverse conversion.

For example, out of one frame of 1024 samples of an audio signal sampled at 44.1 kHz, every 32 consecutive samples are frequency-converted, about every 0.73 msec, into a frequency spectrum of 32 samples. One frame of 1024 samples thus yields 32 such frequency spectra in total, each offset in time by about 0.73 msec.
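
The two figures quoted above follow directly from the sampling frequency and the block size:

```latex
\frac{32\ \text{samples}}{44100\ \text{samples/s}} \approx 0.726\ \text{ms},
\qquad
\frac{1024\ \text{samples per frame}}{32\ \text{samples per spectrum}} = 32\ \text{spectra per frame}
```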

Each of these frequency spectra represents, with 32 samples, the reproduction band from 0 kHz up to 22.05 kHz. The waveform obtained by connecting, in the time direction, the spectral data values at the same frequency across these spectra is a time-frequency signal, which is the output of the QMF filter. The encoding device of this embodiment quantizes and variable-length encodes, for example, the 0th to 15th (low-frequency) time-frequency signals output by the QMF filter, in the same way as a conventional encoding device.

For each of the 16th to 31st (high-frequency) time-frequency signals, on the other hand, the encoding device specifies one of the 0th to 15th low-frequency time-frequency signals as its substitute, and generates an extended time-frequency signal consisting of information indicating the specified low-frequency time-frequency signal and gain information for adjusting the amplitude of that signal.

If, for example, filters whose processing or characteristics differ according to parameters are used, the parameters that specify the processing or characteristics of the filter are described in the extended time-frequency signal. The encoding device then writes into the audio encoded bitstream, and outputs, the low-frequency audio encoded sequence obtained by quantizing and variable-length encoding the low-frequency time-frequency signals and the high-frequency encoded sequence obtained by variable-length encoding the extended time-frequency signal.

FIG. 11 is a block diagram showing the configuration of a decoding device 1200 that decodes wideband time-frequency signals from an audio encoded bitstream encoded using the QMF filter. The decoding device 1200 decodes wideband time-frequency signals from an input audio encoded bitstream consisting of an encoded sequence obtained by variable-length encoding the extended time-frequency signal, which represents the high-frequency time-frequency signals, and an encoded sequence obtained by quantizing and encoding the low-frequency time-frequency signals. It comprises a core decoding unit 1201, an extended decoding unit 1202, and a spectrum addition unit 1203.

The core decoding unit 1201 decodes the input audio encoded bitstream and separates it into the quantized low-frequency time-frequency signals and the extended time-frequency signal representing the high-frequency time-frequency signals. The core decoding unit 1201 further inverse-quantizes the low-frequency time-frequency signals separated from the audio encoded bitstream and outputs them to the spectrum addition unit 1203. The spectrum addition unit 1203 adds the low-frequency time-frequency signals decoded and inverse-quantized by the core decoding unit 1201 and the high-frequency time-frequency signals generated by the extended decoding unit 1202, and outputs time-frequency signals covering the entire reproduction band, for example 0 kHz to 22.05 kHz. These output time-frequency signals are converted into an audio signal on the time axis by, for example, a subsequent QMF inverse conversion filter (not shown), and then converted into audible sound such as speech and music by a subsequent loudspeaker or the like.

The extended decoding unit 1202 is a processing unit that receives the low-frequency time-frequency signals decoded by the core decoding unit 1201 and the extended time-frequency signal, specifies, based on the separated extended time-frequency signal, the low-frequency time-frequency signal that substitutes for each high-frequency time-frequency signal, copies it to the high-frequency part, and adjusts its amplitude to generate the high-frequency time-frequency signal. It comprises a substitution control unit 1204 and a gain adjustment unit 1205.

In accordance with the decoded extended time-frequency signal, the substitution control unit 1204 specifies, for example, the one of the 0th to 15th low-frequency time-frequency signals that substitutes for the 16th high-frequency time-frequency signal, and copies the specified low-frequency time-frequency signal as the 16th high-frequency time-frequency signal. The gain adjustment unit 1205 amplifies the low-frequency time-frequency signal copied to the high-frequency part as the 16th high-frequency time-frequency signal according to the gain information described in the extended time-frequency signal, thereby adjusting its amplitude. The extended decoding unit 1202 further performs this processing by the substitution control unit 1204 and the gain adjustment unit 1205 for each of the 17th to 31st high-frequency time-frequency signals. If 4 bits are used to specify one of the 0th to 15th low-frequency time-frequency signals and 4 bits are used for the gain information that adjusts the amplitude of the copied low-frequency time-frequency signal, the 16th to 31st high-frequency time-frequency signals can be represented by at most (4 + 4) * 32 = 256 bits.

FIG. 12 shows an example of the time-frequency signals decoded by the decoding device 1200 of Embodiment 6. For example, if the spectrum sequence of the k-th low-frequency time-frequency signal (k is an integer with 0 ≤ k ≤ 15) is written Bk = (pk(t0), pk(t1), ..., pk(t31)), then, as shown in the figure, the audio encoded bitstream generated by the encoding device of Embodiment 6 (not shown) contains, for example, the 0th to 15th low-frequency time-frequency signals B0 to B15, quantized and encoded.

For the 16th to 31st high-frequency time-frequency signals B16 to B31, on the other hand, the bitstream describes information specifying which of the 0th to 15th low-frequency time-frequency signals B0 to B15 substitutes for each of them, and gain information for adjusting the amplitude of each low-frequency time-frequency signal copied to the high-frequency part. For example, to represent the 16th high-frequency time-frequency signal B16, the extended time-frequency signal describes information indicating the 10th low-frequency time-frequency signal B10, which substitutes for B16, and gain information G0 for adjusting the amplitude of the low-frequency time-frequency signal B10 copied to the high-frequency part as B16.

Accordingly, the 10th low-frequency time-frequency signal B10, decoded and inverse-quantized by the core decoding unit 1201, is copied to the high-frequency part as the 16th high-frequency time-frequency signal B16 and amplified by the gain indicated by the gain information G0, producing the 16th high-frequency time-frequency signal B16. The same applies to the 17th high-frequency time-frequency signal B17: the 11th low-frequency time-frequency signal B11 described in the extended time-frequency signal is copied by the substitution control unit 1204 as the 17th high-frequency time-frequency signal B17 and amplified by the gain indicated by the gain information G1, producing B17. Repeating the same processing for the 18th to 31st high-frequency time-frequency signals B18 to B31 yields all the high-frequency time-frequency signals.
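
The copy-and-amplify procedure for B16 to B31 can be sketched as follows. The names are hypothetical and the extension information is modelled simply as a list of (source index, linear gain) pairs, one per high-frequency signal; how the 4-bit gain code maps to a linear gain is not stated in the text, so that step is left out.

```python
def extend_time_frequency(low_signals, extension):
    """Build the full set of time-frequency signals from the low band.

    low_signals: list of low-band signals B0..B15, each a list of samples
    extension:   list of (source_index, gain) pairs, one per high-band
                 signal, in order B16, B17, ...
    """
    high = []
    for src, gain in extension:
        if not 0 <= src < len(low_signals):
            raise ValueError("source index outside the low band")
        # Substitution control: copy the named low-band signal.
        # Gain adjustment: scale the copy by the signalled gain.
        high.append([gain * v for v in low_signals[src]])
    return low_signals + high


# Usage mirroring the example above: B16 from B10, B17 from B11.
low = [[float(k)] * 2 for k in range(16)]
ext = [(10, 0.5), (11, 2.0)] + [(0, 1.0)] * 14
full = extend_time_frequency(low, ext)
```

The returned list corresponds to the input of the spectrum addition stage, with the low-frequency signals first and the generated high-frequency signals after them.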

As described above, according to Embodiment 6, the encoding device applies the substitution of high-frequency time-frequency signals by low-frequency time-frequency signals of the present invention to the time-frequency signals output by the QMF filter as well, so that wideband audio time-frequency signals can be encoded with only a relatively small increase in the amount of data, and the decoding device can decode an audio signal rich in high frequencies.

In Embodiment 6, each high-frequency time-frequency signal is described as being substituted by an individual low-frequency time-frequency signal, but the present invention is not limited to this. For example, the low-frequency part and the high-frequency part may be divided into a plurality of (for example, eight) groups each consisting of the same number (for example, four) of time-frequency signals, and each high-frequency group may be substituted by the time-frequency signals of one of the low-frequency groups.

Alternatively, the amplitude of the low-frequency time-frequency signal copied to the high-frequency part may be adjusted by generating noise consisting of 32 spectral values and superimposing it. Embodiment 6 has been described with a sampling frequency of 44.1 kHz, 1024 samples per frame, 32 samples per time-frequency signal, and 32 time-frequency signals per frame, but the present invention is not limited to these values; the sampling frequency and the number of samples per frame may take other values.

The encoding device according to the present invention is useful as an audio encoding device installed in satellite broadcasting stations, including BS and CS, as an audio encoding device in a content distribution server that distributes content over a communication network such as the Internet, and as a program for audio signal encoding executed by a general-purpose computer.

The decoding device according to the present invention is useful not only as an audio decoding device installed in a home STB, but also as a program for audio signal decoding executed by a general-purpose computer, as a dedicated circuit board or LSI for audio signal decoding installed in an STB or general-purpose computer, and as an IC card inserted into an STB or general-purpose computer.

FIG. 1 is a block diagram showing the configuration of the encoding device in Embodiment 1 of the present invention.
FIG. 2(a) shows the MDCT coefficient sequence output by the MDCT unit. FIG. 2(b) shows the 0th to (maxline − 1)-th MDCT coefficients, among the MDCT coefficients shown in FIG. 2(a), that are encoded by the quantization unit. FIG. 2(c) shows an example of how the BWE encoding unit shown in FIG. 1 generates the extended audio encoded sequence.
FIG. 3(a) is a waveform diagram of the MDCT coefficient sequence of the original sound. FIG. 3(b) is a waveform diagram of the MDCT coefficient sequence generated by substitution in the BWE encoding unit. FIG. 3(c) is a waveform diagram of the MDCT coefficient sequence obtained by applying gain control to the MDCT coefficient sequence shown in FIG. 3(b).
FIG. 4(a) shows an example of an ordinary audio encoded bitstream. FIG. 4(b) shows an example of the audio encoded bitstream output by the encoding device of this embodiment. FIG. 4(c) shows an example of the extended audio encoded sequence described in the extended audio encoded sequence section shown in FIG. 4(b).
FIG. 5 is a block diagram showing the configuration of a decoding device that decodes the audio encoded bitstream output from the encoding device of FIG. 1.
FIG. 6 shows the method by which the BWE encoding unit of Embodiment 2 generates extended frequency spectrum information.
FIG. 7(a) shows the low-frequency and high-frequency subbands divided in the same way as in Embodiment 2. FIG. 7(b) shows an example of the MDCT coefficient sequence of the low-frequency subband A. FIG. 7(c) shows an example of the MDCT coefficient sequence of the subband As obtained by reversing the order of the MDCT coefficient sequence of the low-frequency subband A. FIG. 7(d) shows the subband Ar obtained by inverting the sign of the MDCT coefficient sequence of the low-frequency subband A.
FIG. 8(a) shows an example of the MDCT coefficients of the low-frequency subband A specified for the high-frequency subband h0. FIG. 8(b) shows an example of MDCT coefficients, equal in number to those of the low-frequency subband A, generated by the noise generation unit. FIG. 8(c) shows an example of the MDCT coefficients that substitute for the high-frequency subband h0, generated from the MDCT coefficients of the low-frequency subband A shown in FIG. 8(a) and the MDCT coefficients from the noise generation unit shown in FIG. 8(b).
FIG. 9(a) shows the MDCT coefficients of one frame at time t0. FIG. 9(b) shows the MDCT coefficients of the next frame at time t1. FIG. 9(c) shows the MDCT coefficients of the frame after that at time t2.
FIG. 10(a) shows the MDCT coefficients of one frame at time t0. FIG. 10(b) shows the MDCT coefficients of the next frame at time t1. FIG. 10(c) shows the MDCT coefficients of the frame after that at time t2.
FIG. 11 is a block diagram showing the configuration of a decoding device that decodes wideband time-frequency signals from an audio encoded bitstream encoded using the QMF filter.
FIG. 12 shows an example of the time-frequency signals decoded by the decoding device of Embodiment 6.
FIG. 13 is a block diagram showing the configuration of a conventional encoding device.

Explanation of symbols

200 Encoding device
201 Preprocessing unit
202 MDCT unit
203 Quantization unit
204 BWE encoding unit
205 Encoded sequence generation unit
1200 Decoding device
1201 Core decoding unit
1202 Extension decoding unit
1203 Spectrum addition unit

Claims

An encoding device for encoding an input signal, comprising:
time-frequency conversion means for converting an input signal on the time axis into a frequency spectrum including a low frequency spectrum;
band extension means for generating extension information used to identify a high frequency spectrum at a frequency higher than the low frequency spectrum; and
encoding means for encoding the low frequency spectrum and the extension information, and outputting the encoded low frequency spectrum and extension information,
wherein the band extension means generates, as the extension information, a first parameter used to determine a partial spectrum to be replicated as the high frequency spectrum from among a plurality of partial spectra constituting the low frequency spectrum, and a second parameter used to determine the gain of the partial spectrum after the replication, and
the band extension means further generates, as the extension information, a noise parameter that specifies the energy of a noise spectrum to be added to the high frequency spectrum specified by the first parameter and the second parameter.

The encoding device according to claim 1, wherein the time-frequency conversion means converts the input signal on the time axis into the frequency spectrum including the low frequency spectrum by MDCT (modified discrete cosine transform).

The encoding device according to claim 1, wherein the parameter that specifies the energy of the noise spectrum is an energy ratio of the noise spectrum to the high frequency spectrum.

The encoding device according to any one of claims 1 to 3, wherein the first parameter includes information on whether to use the same extension information as the extension information of the preceding frame.

The encoding device according to claim 4, wherein the first parameter includes information indicating whether or not the same extension information as that of the immediately preceding frame is used.
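The three kinds of extension information named in claims 1 and 3 above (a first parameter selecting the partial spectrum to copy, a second parameter for its gain, and a noise parameter expressed as an energy ratio) can be sketched as follows. This is only an illustration under stated assumptions: the claims do not say how the encoder chooses the values, so the energy-matching gain, the squared-error selection rule, and the residual-based noise ratio are all hypothetical choices, as are the function and variable names.

```python
import math


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def bwe_parameters(low_subbands, high_subband):
    """Return (index, gain, noise_ratio) for one high frequency subband.

    Assumptions, not taken from the claims: the gain matches the copied
    subband's energy to the target's, the chosen subband minimises the
    squared error after scaling, and the noise ratio is the leftover
    error energy divided by the target energy, clamped to [0, 1].
    """
    if not low_subbands:
        raise ValueError("at least one low frequency subband is required")
    target_energy = energy(high_subband)
    best = None
    for index, low in enumerate(low_subbands):
        low_energy = energy(low)
        gain = math.sqrt(target_energy / low_energy) if low_energy > 0 else 0.0
        error = sum((h - gain * l) ** 2 for h, l in zip(high_subband, low))
        if best is None or error < best[2]:
            best = (index, gain, error)
    index, gain, error = best
    noise_ratio = min(1.0, error / target_energy) if target_energy > 0 else 0.0
    return index, gain, noise_ratio


# Made-up coefficients: the second low subband has the same shape as the target.
low_subbands = [[1.0, 0.0, -1.0, 0.0], [2.0, -2.0, 2.0, -2.0]]
high_subband = [0.5, -0.5, 0.5, -0.5]
index, gain, noise_ratio = bwe_parameters(low_subbands, high_subband)
```

In this example the second subband is selected with a gain of 0.25, and because the scaled copy reproduces the target exactly, the noise ratio is zero.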

An encoding method for encoding an input signal, comprising:
a time-frequency conversion step of converting an input signal on the time axis into a frequency spectrum including a low frequency spectrum;
a band extension step of generating extension information used to identify a high frequency spectrum at a frequency higher than the low frequency spectrum; and
an encoding step of encoding the low frequency spectrum and the extension information, and outputting the encoded low frequency spectrum and extension information,
wherein the band extension step generates, as the extension information, a first parameter used to determine a partial spectrum to be replicated as the high frequency spectrum from among a plurality of partial spectra constituting the low frequency spectrum, and a second parameter used to determine the gain of the partial spectrum after the replication, and
the band extension step further generates, as the extension information, a noise parameter that specifies the energy of a noise spectrum to be added to the high frequency spectrum specified by the first parameter and the second parameter.

The encoding method according to claim 6, wherein the time-frequency conversion step converts the input signal on the time axis into the frequency spectrum including the low frequency spectrum by MDCT (modified discrete cosine transform).

The encoding method according to claim 6 or 7, wherein the parameter specifying the energy of the noise spectrum is an energy ratio of the noise spectrum to the high frequency spectrum.

The encoding method according to any one of claims 6 to 8, wherein the first parameter includes information indicating whether or not the same extension information as that of the preceding frame is used.

The encoding method according to claim 9, wherein the first parameter includes information as to whether or not the same extension information as that of the immediately preceding frame is used.

A program for encoding an input signal, the program causing a computer to execute the encoding method according to any one of claims 6 to 10.

A computer-readable recording medium for recording the encoding program according to claim 11.

A decoding device for decoding an encoded signal,
wherein the encoded signal includes a low-frequency spectrum and extension information used to specify a high-frequency spectrum at a frequency higher than the low-frequency spectrum, the extension information including a first parameter, a second parameter, and a noise parameter,
the first parameter is used to determine a partial spectrum to be duplicated as the high-frequency spectrum from among a plurality of partial spectra constituting the low-frequency spectrum, and the second parameter is used to determine the gain of the partial spectrum after duplication,
the noise parameter specifies the energy of a noise spectrum to be added to the high-frequency spectrum specified by the first parameter and the second parameter, and
the decoding device comprises:
decoding means for generating the low-frequency spectrum and the extension information by decoding the encoded signal;
high-frequency spectrum generating means for generating the high-frequency spectrum based on the low-frequency spectrum and the extension information, and adding a noise spectrum having the energy specified by the noise parameter to the generated high-frequency spectrum; and
time-frequency conversion means for converting a frequency spectrum obtained by synthesizing the generated high-frequency spectrum and the low-frequency spectrum into a signal on the time axis.

The decoding device according to claim 13, wherein the time-frequency conversion means converts the frequency spectrum obtained by synthesizing the generated high-frequency spectrum and the low-frequency spectrum into a signal on the time axis by MDCT (modified discrete cosine transform).

The decoding apparatus according to claim 13 or 14, wherein the noise parameter for specifying the energy of the noise spectrum is an energy ratio of the noise spectrum to the high frequency spectrum.

The decoding device according to any one of claims 13 to 15, wherein the first parameter includes information on whether to use the same extension information as the extension information of the preceding frame, and the high-frequency spectrum generating means generates the high-frequency spectrum using the information.

The decoding apparatus according to claim 16, wherein the first parameter includes information indicating whether or not the same extension information as that of the immediately preceding frame is used.
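The high-frequency spectrum generation described in claims 13 and 15 above (copy the selected partial spectrum, apply the gain, then add a noise spectrum whose energy is given as a ratio) can be sketched as below. This is a minimal illustration, not the patented implementation: the function name, the uniform noise source, and the reading of the ratio as noise energy relative to the copied-and-scaled spectrum are assumptions.

```python
import math
import random


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def rebuild_high_subband(low_subbands, index, gain, noise_ratio, rng):
    """Copy a low subband, scale it, and add noise of the specified energy.

    Assumption: noise_ratio (non-negative) is the noise energy divided by
    the energy of the copied-and-scaled spectrum.
    """
    copied = [gain * c for c in low_subbands[index]]
    noise = [rng.uniform(-1.0, 1.0) for _ in copied]
    noise_energy = energy(noise)
    wanted = noise_ratio * energy(copied)
    # Rescale the raw noise so its energy equals the requested amount.
    scale = math.sqrt(wanted / noise_energy) if noise_energy > 0 else 0.0
    return [c + scale * n for c, n in zip(copied, noise)]


# Made-up decoded low subbands and extension parameters.
low_subbands = [[1.0, 0.0, -1.0, 0.0], [2.0, -2.0, 2.0, -2.0]]
clean = rebuild_high_subband(low_subbands, 1, 0.25, 0.0, random.Random(0))
noisy = rebuild_high_subband(low_subbands, 1, 0.25, 0.1, random.Random(0))
added_noise = [n - c for n, c in zip(noisy, clean)]
```

With a ratio of zero the result is simply the scaled copy; with a ratio of 0.1 the added component carries one tenth of the copied spectrum's energy. A seeded `random.Random` is used here only to make the sketch repeatable.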

A decoding method for decoding an encoded signal,
wherein the encoded signal includes a low-frequency spectrum and extension information used to specify a high-frequency spectrum at a frequency higher than the low-frequency spectrum, the extension information including a first parameter, a second parameter, and a noise parameter,
the first parameter is used to determine a partial spectrum to be duplicated as the high-frequency spectrum from among a plurality of partial spectra constituting the low-frequency spectrum, the second parameter is used to determine the gain of the partial spectrum after duplication, and the noise parameter specifies the energy of a noise spectrum to be added to the high-frequency spectrum specified by the first parameter and the second parameter, and
the decoding method comprises:
a decoding step of generating the low-frequency spectrum and the extension information by decoding the encoded signal;
a high-frequency spectrum generation step of generating the high-frequency spectrum based on the low-frequency spectrum and the extension information, and adding a noise spectrum having the energy specified by the noise parameter to the generated high-frequency spectrum; and
a time-frequency conversion step of converting a frequency spectrum obtained by combining the generated high-frequency spectrum and the low-frequency spectrum into a signal on the time axis.

The decoding method according to claim 18, wherein, in the time-frequency conversion step, the frequency spectrum obtained by synthesizing the generated high-frequency spectrum and the low-frequency spectrum is converted into a signal on the time axis by MDCT (modified discrete cosine transform).

The decoding method according to claim 18 or 19, wherein the noise parameter for specifying the energy of the noise spectrum is an energy ratio of the noise spectrum to the high frequency spectrum.

The decoding method according to any one of claims 18 to 20, wherein the first parameter includes information on whether to use the same extension information as the extension information of the preceding frame, and, in the high-frequency spectrum generation step, the high-frequency spectrum is generated using the information.

The decoding method according to claim 21, wherein the first parameter includes information on whether to use the same extension information as that of the immediately preceding frame.
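The frame-reuse information mentioned in claims 21 and 22 above can be pictured as a per-frame flag that tells the decoder to carry over the previous frame's extension information instead of reading new values. The sketch below is only an illustration: the dictionary layout, the key names `reuse`, `index`, `gain`, and `noise_ratio`, and the function name are hypothetical and are not taken from the patent's bitstream format.

```python
def resolve_extension_info(frames):
    """Expand per-frame records into explicit extension information.

    Each record is a hypothetical dict. {"reuse": True} means "use the same
    extension information as the immediately preceding frame"; any other
    record carries "index", "gain", and "noise_ratio" directly.
    """
    resolved = []
    previous = None
    for frame in frames:
        if frame.get("reuse"):
            if previous is None:
                raise ValueError("the first frame has no preceding frame to reuse")
            current = previous
        else:
            current = {key: frame[key] for key in ("index", "gain", "noise_ratio")}
        resolved.append(current)
        previous = current
    return resolved


# Made-up records for three consecutive frames.
frames = [
    {"index": 1, "gain": 0.25, "noise_ratio": 0.1},
    {"reuse": True},
    {"index": 0, "gain": 0.5, "noise_ratio": 0.0},
]
resolved = resolve_extension_info(frames)
```

Here the second frame inherits the first frame's parameters, so only a single flag would need to be transmitted for it, while the third frame supplies new values.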

A program for decoding an encoded signal, the program causing a computer to execute the decoding method according to any one of claims 18 to 22.

A computer-readable recording medium for recording the decoding program according to claim 23.