JP4317355B2

JP4317355B2 - Encoding apparatus, encoding method, decoding apparatus, decoding method, and acoustic data distribution system

Info

Publication number: JP4317355B2
Application number: JP2002313216A
Authority: JP
Inventors: 孝祐西尾; 武志則松; 峰生津島; 直也田中
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2001-11-30
Filing date: 2002-10-28
Publication date: 2009-08-19
Anticipated expiration: 2022-10-28
Also published as: JP2003228399A

Abstract

PROBLEM TO BE SOLVED: To provide an encoding device that can efficiently encode a sound signal over a wide band. SOLUTION: A sound data input part 310 of the encoding device 300 cuts 4096 successive pieces of sound data out of a sound data sequence, and a conversion part 320 converts the cut sound data into spectrum data on the frequency axis. A data separation part 330 separates the spectrum data into a low-frequency part and a high-frequency part with 11.025 kHz (f1) as the border. The low-frequency part spectrum data are quantized and encoded as usual by a 1st quantization part 340 and a 1st encoding part 350. A 2nd quantization part 345 generates auxiliary information showing a feature of the high-frequency part frequency spectrum, and a 2nd encoding part 355 encodes the auxiliary information. A stream output part 390 puts together and outputs the code obtained by the 1st encoding part 350 and the code obtained by the 2nd encoding part 355. Here, f1 is no more than half of the sampling frequency f2 at which the sound data sequence was generated. COPYRIGHT: (C)2003,JPO

Description

【０００１】
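【Technical Field of the Invention】
The present invention relates to techniques for high-quality compression encoding and decompression decoding of acoustic signals.
【０００２】
【Prior Art】
In recent years, various techniques for high-quality compression encoding and decompression decoding of acoustic signals such as speech and music have been developed. "MPEG-2 Advanced Audio Coding" (hereinafter abbreviated as "MPEG-2 AAC" or "AAC") is one such scheme (see Non-Patent Document 1).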
【０００３】
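【Non-Patent Document 1】
M. Bosi et al., "IS 13818-7 (MPEG-2 Advanced Audio Coding, AAC)", April 1997
FIG. 24 is a block diagram showing the functional configuration of an encoding device and a decoding device according to the conventional AAC scheme.
【０００４】
The encoding device 1000 compresses and encodes an input acoustic signal based on the AAC encoding scheme, and comprises an A/D converter 1050, an acoustic data input unit 1100, a conversion unit 1200, a quantization unit 1400, an encoding unit 1500, and a stream output unit 1900.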
【０００５】
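The A/D converter 1050 samples the input acoustic signal at a sampling frequency of, for example, 22.05 kHz, converting the analog acoustic signal into a digital acoustic data sequence. Each time the acoustic data input unit 1100 reads 1024 samples of the input acoustic data sequence (these 1024 samples are hereinafter called a "frame"), it cuts out a total of 2048 samples consisting of that frame overlapped with 50% (512 samples) of each of the adjacent preceding and following frames. The conversion unit 1200 uses the MDCT (Modified Discrete Cosine Transform) to convert the 2048 time-axis samples cut out by the acoustic data input unit 1100 into spectral data on the frequency axis. Half of the data obtained by the transform, namely 1024 points of spectral data, represents the reproduction band at or below 11.025 kHz (half the sampling frequency) and is classified into a plurality of groups. The groups are set so that each contains at least one point of spectral data, and they approximate the critical bands of human hearing. Each group is called a "scale factor band". The quantization unit 1400 quantizes the spectral data in each scale factor band obtained from the conversion unit 1200 to a predetermined number of bits, using one normalization coefficient per scale factor band. This normalization coefficient is called a "scale factor", and the result of quantizing each piece of spectral data with its scale factor is called a "quantized value". The encoding unit 1500 Huffman-encodes the data quantized by the quantization unit 1400, that is, each scale factor and the spectral data quantized with it. The stream output unit 1900 converts the encoded signal obtained from the encoding unit 1500 into the AAC bitstream format and outputs it. The bitstream output from the encoding device 1000 is conveyed to the decoding device 2000 via a transmission medium or a recording medium.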
【０００６】
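The decoding device 2000 decodes the bitstream encoded by the encoding device 1000, and comprises a stream input unit 2100, a decoding unit 2200, an inverse quantization unit 2300, an inverse conversion unit 2800, an acoustic data output unit 2900, and a D/A converter 2950.
【０００７】
The stream input unit 2100 receives the bitstream encoded by the encoding device 1000 via a transmission medium or a recording medium and extracts the encoded signal from it. The decoding unit 2200 decodes the Huffman-encoded signal into quantized data. The inverse quantization unit 2300 inverse-quantizes the quantized data decoded by the decoding unit 2200 using the scale factors. The inverse conversion unit 2800 uses the IMDCT (Inverse Modified Discrete Cosine Transform) to convert the 1024 points of frequency-axis spectral data obtained by the inverse quantization unit 2300 into 1024 samples of acoustic data on the time axis. The acoustic data output unit 2900 sequentially combines the 1024-sample blocks of time-axis acoustic data obtained by the inverse conversion unit 2800 and outputs them one block at a time in time order. The D/A converter 2950 converts the digital acoustic data into an analog acoustic signal at a sampling frequency of 22.05 kHz.
【０００８】
With the encoding device 1000 and decoding device 2000 conforming to the conventional AAC standard, the compression ratio can be raised to 1 bit or less per point. Moreover, because the 1024 points of low-band spectral data, which represent the reproduction band at or below 11.025 kHz (half the sampling frequency) and have high perceptual priority, are encoded, the acoustic signal can be reproduced with relatively high quality.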
【０００９】
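【Problems to be Solved by the Invention】
However, with the conventional AAC encoding device 1000 and decoding device 2000 (Conventional Example 1), the sampling frequency is 22.05 kHz, so the encoded spectral data contain no band above 11.025 kHz at all. Consequently, the demand for still higher quality, namely to also hear the band above 11.025 kHz, cannot be met.
【００１０】
To solve this problem, one could raise the sampling frequency applied to the A/D converter 1050 and the D/A converter 2950 of the encoding device 1000 and decoding device 2000 in FIG. 24 to 44.1 kHz, twice 22.05 kHz (this scheme is hereinafter also called "Conventional Example 2").
【００１１】
However, when the sampling frequency is set to 44.1 kHz, 512 points of spectral data can be encoded in the high band above 11.025 kHz while maintaining the compression ratio, but the perceptually important low-band spectral data are halved to 512 points. In other words, the sampling frequency and the number of low-band spectral points are in a trade-off, and conventional AAC cannot raise both at once. This creates another problem: the overall sound quality actually deteriorates.
【００１２】
The same applies to encoding and decoding devices of other schemes (for example, MP3 and AC3).
The present invention has been made to solve the above technical problem, and its object is to provide an encoding device, a decoding device, and the like that can realize high-quality reproduction of acoustic signals without greatly increasing the amount of information after encoding.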
【００１３】
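【Means for Solving the Problems】
To solve the above problem, an encoding device according to the present invention encodes acoustic data and comprises: cut-out means for cutting out a fixed number of consecutive pieces of acoustic data from an acoustic data sequence; conversion means for converting the cut-out acoustic data into spectral data on the frequency axis; separation means for separating the spectral data obtained by the conversion into low-band spectral data up to a frequency f1 Hz and high-band spectral data above f1 Hz; low-band encoding means for quantizing and encoding the separated low-band spectral data; auxiliary information generation means for generating, from the separated high-band spectral data, auxiliary information indicating features of the high-band frequency spectrum; high-band encoding means for encoding the generated auxiliary information; and output means for combining and outputting the code obtained by the low-band encoding means and the code obtained by the high-band encoding means. Here, f1 is no more than half of the sampling frequency f2 at which the acoustic data sequence was created. The high-band and low-band spectral data are each divided into a plurality of groups, and for each high-band group the auxiliary information generation means generates, as the auxiliary information, information identifying the low-band group whose spectrum most closely approximates the spectrum in that high-band group.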
【００１４】
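The following configurations are also possible. f1 may be f2/4, with the conversion means converting the acoustic data into spectral data covering 0 to 2×f1 Hz and the separation means separating it into low-band spectral data of 0 to f1 Hz and high-band spectral data of f1 to 2×f1 Hz. The low-band spectral data up to f1 Hz may consist of n pieces of spectral data, with the cut-out means cutting out the number of pieces of acoustic data needed to generate 2×n pieces of spectral data, the conversion means converting the cut-out acoustic data into 2×n pieces of spectral data, and the separation means separating them into n low-band and n high-band pieces. Alternatively, the cut-out means may cut out 2×n pieces of acoustic data, consisting of the n pieces corresponding to one frame (the unit of encoding) together with n/2 pieces belonging to each of the two adjacent frames, with the conversion means applying the MDCT to the 2×n cut-out pieces of acoustic data to obtain a spectrum of 0 to 2×f1 Hz consisting of 2×n pieces of spectral data.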
【００１５】
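Further, a decoding device according to the present invention decodes encoded data input via a recording medium or a transmission medium and comprises: extraction means for extracting the low-band encoded data and the high-band encoded data contained in the encoded data; low-band inverse quantization means for decoding and inverse-quantizing the extracted low-band encoded data to output low-band spectral data at or below the frequency f1; auxiliary information decoding means for decoding the extracted high-band encoded data to generate auxiliary information representing features of the high-band spectral data; high-band inverse quantization means for outputting high-band spectral data based on the auxiliary information generated by the auxiliary information decoding means; synthesis means for combining the low-band spectral data output by the low-band inverse quantization means with the high-band spectral data output by the high-band inverse quantization means; inverse conversion means for inverse-converting the combined spectral data into acoustic data on the time axis; and acoustic data output means for outputting the inverse-converted acoustic data in time order. The high-band and low-band spectral data are each divided into a plurality of groups, and the auxiliary information is information identifying, for each high-band group, the low-band group whose spectrum most closely approximates the spectrum in that group. Based on the auxiliary information, the high-band inverse quantization means generates predetermined noise in each high-band group and adds it to the spectral data to generate the high-band spectral data.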
【００１６】
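Needless to say, the present invention can also be realized as a communication system comprising the above encoding device and decoding device; as an encoding method, decoding method, or communication method whose steps are the characteristic means of the above encoding device, decoding device, and communication system; as an encoding program or decoding program that causes a CPU to execute those characteristic means or steps; or as a computer-readable recording medium on which such a program is recorded.
【００１７】
【Embodiments of the Invention】
Hereinafter, an embodiment of the present invention applied to a broadcasting system serving as an acoustic data distribution system is described in detail with reference to the drawings.
【００１８】
FIG. 1 is a block diagram showing the functional configuration of the broadcasting system according to this embodiment.
The broadcasting system 1 shown in the figure comprises an encoding device 300, installed at a broadcasting station, that encodes an input acoustic signal, and a decoding device 400, installed in a user's terminal, that decodes the bitstream encoded by the encoding device 300.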
【００１９】
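(Encoding device 300)
The encoding device 300 encodes an input acoustic signal and comprises an A/D converter 305, an acoustic data input unit 310, a conversion unit 320, a data separation unit 330, first and second quantization units 340 and 345, first and second encoding units 350 and 355, and a stream output unit 390.
【００２０】
The A/D converter 305 samples the input acoustic signal at, for example, 44.1 kHz, twice the sampling frequency of Conventional Example 1, converts the analog acoustic signal into digital acoustic data (for example, 16 bits), and generates an acoustic data sequence on the time axis.
【００２１】
The acoustic data input unit 310 operates on a cycle of one cut-out per 2048 samples (two frames) received from the A/D converter 305 (about 45.4 msec), that is, a slow cycle stretched to twice that of Conventional Example 2. In each cycle it cuts out an acoustic data sequence consisting of those 2048 samples overlapped with 1024 samples (50%) from each of the adjacent preceding and following frames, that is, twice the conventional number of samples (4096 samples). It comprises a counter 311 for detecting the cut-out timing every 2048 samples and a FIFO buffer 312 for temporarily storing the 4096-sample acoustic data sequence.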
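The cut-out performed by the acoustic data input unit 310 can be sketched as follows. This is a minimal illustration only: the function name `cut_blocks` and the zero padding at the start and end of the sequence are assumptions made here, not details given in the text.

```python
def cut_blocks(samples, frame=2048, overlap=1024):
    """Yield blocks of frame + 2*overlap samples, advancing by `frame` samples.

    Each block holds the 2048 samples of the current cycle plus 1024 samples
    from the preceding and following neighbours (4096 samples in total).
    """
    block_len = frame + 2 * overlap
    # Assumption: pad with zeros so the first and last cycles also have neighbours.
    padded = [0] * overlap + list(samples) + [0] * overlap
    for start in range(0, len(padded) - block_len + 1, frame):
        yield padded[start:start + block_len]


blocks = list(cut_blocks(range(4096)))
```

Because the hop is 2048 samples and each block is 4096 samples long, consecutive blocks share half of their contents, which is the overlap the MDCT relies on.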
【００２２】
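The conversion unit 320 converts the two frames' worth of acoustic data (4096 samples) cut out by the acoustic data input unit 310 into spectral data on the frequency axis. It comprises an MDCT 321 that converts 4096 time-axis samples of acoustic data into 4096 points of frequency-axis spectral data, and a grouping unit 322 that groups the spectral data into scale factor bands.
【００２３】
Specifically, the MDCT 321 converts 4096 time-axis samples of acoustic data into 4096 points of spectral data (16 bits). Because these form two mirror-symmetric groups, only one group of 2048 points is encoded and the other is discarded.
【００２４】
Compared with the encoding device 1000 of Conventional Example 1 described above, the A/D converter 305, acoustic data input unit 310, and conversion unit 320 of the encoding device 300 differ greatly in that the sampling frequency of the A/D converter 305 is doubled (to 44.1 kHz), the cut-out length of the acoustic data input unit 310 is doubled (to 4096 samples), and the encoding unit of the MDCT 321 in the conversion unit 320 is doubled (to 4096 samples).
【００２５】
Compared with the encoding device 1000 of Conventional Example 2 described above, the sampling frequency of the A/D converter 305 is the same, but the configuration differs greatly in that the cut-out length of the acoustic data input unit 310 is doubled (to 4096 samples) and the encoding unit of the MDCT 321 in the conversion unit 320 is doubled (to 4096 samples).
【００２６】
As a result, the conversion unit 320 outputs 1024 pieces of spectral data belonging to the low band at or below 11.025 kHz (hereinafter "low-band spectral data") and 1024 pieces belonging to the high band above 11.025 kHz (hereinafter "high-band spectral data"), 2048 pieces in total.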
【００２７】
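The grouping unit 322 of the conversion unit 320 classifies the 2048 points of spectral data selected for encoding into a plurality of scale factor bands, each containing one or more points (in practice, a multiple of 4).
【００２８】
In the AAC standard, the number of samples (spectral data) in each scale factor band is defined according to frequency: the low band is finely divided into bands of a few samples each, and the bands become wider, containing more samples, toward the high band. AAC also defines the number of scale factor bands per frame of spectral data according to the sampling frequency. For example, at a sampling frequency of 44.1 kHz, one frame contains 49 scale factor bands, which together hold 1024 samples of spectral data. On the other hand, AAC does not specify which of these scale factor bands are to be transmitted; the most suitable bands may be selected according to the transfer rate of the transmission path. For example, at a transfer rate of 96 kbps, only the 40 lowest scale factor bands of a frame (640 samples) may be selected and transmitted.
【００２９】
In this embodiment, by contrast, two frames' worth of spectral data (1024 low-band and 1024 high-band pieces) are output from the MDCT 321 on a cycle twice as long as the conventional one (about 45.4 msec). Therefore, at a transfer rate of 96 kbps, even if all low-band scale factor bands (1024 samples) are transmitted, this is fewer than the 640 × 2 = 1280 samples that conventional AAC transfers for two frames, which leaves ample margin in the transfer rate. Accordingly, this embodiment is described for the case where the grouping unit 322 classifies the converted spectral data into scale factor bands whose boundaries and number are defined independently.
【００３０】
The data separation unit 330 separates the 2048 pieces of spectral data output from the conversion unit 320 into low-band spectral data (1024 pieces) and high-band spectral data (1024 pieces). It then outputs the low-band spectral data (1024 pieces) to the first quantization unit 340 and the high-band spectral data (1024 pieces) to the second quantization unit 345.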
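The transform-then-split step can be sketched as below. This is an illustration under stated assumptions: `mdct` is the textbook direct-form MDCT (2N time samples to N coefficients, no window, O(N²) cost), which corresponds to keeping one symmetric half as the text describes; the names `mdct` and `split_bands` are hypothetical, and a small block is used so the example runs quickly.

```python
import math


def mdct(block):
    """Direct-form MDCT: 2N time samples -> N spectral coefficients (no window)."""
    two_n = len(block)
    n = two_n // 2
    n0 = n / 2 + 0.5  # standard MDCT phase offset
    return [
        sum(block[t] * math.cos(math.pi / n * (t + n0) * (k + 0.5))
            for t in range(two_n))
        for k in range(n)
    ]


def split_bands(spectrum):
    """Split at the midpoint (f1 = f2/4): lower half is low band, upper half high band."""
    half = len(spectrum) // 2
    return spectrum[:half], spectrum[half:]


# Small stand-in for a 4096-sample block: 16 samples -> 8 coefficients -> 4 + 4.
block = [math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]
spectrum = mdct(block)
low, high = split_bands(spectrum)
```

With a 4096-sample block the same functions would yield 2048 coefficients, split into 1024 low-band and 1024 high-band values; a practical implementation would use a windowed, FFT-based MDCT instead.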
【００３１】
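The first quantization unit 340 determines a scale factor for each scale factor band of the low-band spectral data transferred from the data separation unit 330, quantizes the spectrum in that band with the determined scale factor, and outputs the resulting quantized values, the first of the determined scale factors, and the differences between successive scale factors to the first encoding unit 350. It comprises a scale factor calculation unit 341. The scale factor calculation unit 341, for example, calculates according to a formula one normalization coefficient (scale factor, 8 bits) per scale factor band such that the spectral data in the band fit in a predetermined number of bits, quantizes each spectrum in the band with that scale factor, and calculates the scale factor differences.
【００３２】
The first encoding unit 350 encodes the data quantized by the first quantization unit 340, together with the scale factor of each scale factor band and related values, into a predetermined stream format. It comprises a Huffman coding table 351 for further compressing the quantized data and the scale factors. Specifically, it uses the Huffman coding table 351 to Huffman-encode the quantized data and scale factors so that they can be transmitted at a low bit rate.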
【００３３】
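The second quantization unit 345 calculates and outputs auxiliary information based on the band not quantized by the first quantization unit 340, that is, the high-band spectral data at 11.025 kHz and above output by the data separation unit 330. It comprises an auxiliary information generation unit 346 for generating the auxiliary information. Here, auxiliary information is simplified information, calculated from the high-band spectral data, that concisely represents the features of the high-band spectral data with a small amount of information. In other words, it represents the features of the high-frequency part of the spectral data obtained by converting a fixed duration of input acoustic data. One specific example is, for each high-band scale factor band, the optimal scale factor that makes the quantized value of the absolute maximum spectral data (the spectral data with the largest absolute value) in that band equal to "1", together with that quantized value.
【００３４】
The second encoding unit 355 encodes the auxiliary information output by the second quantization unit 345 into a predetermined stream format and outputs it as second encoded information. It comprises a Huffman coding table 356 for encoding the auxiliary information.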
【００３５】
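The stream output unit 390 adds header information and other side information as needed to the first encoded signal output from the first encoding unit 350, converting it into an AAC encoded bitstream as in the conventional scheme, and stores the second encoded signal output from the second encoding unit 355 in a region of that bitstream that conventional decoding devices ignore or for which their operation is undefined. Specifically, the stream output unit 390 stores the encoded signal output from the second encoding unit 355 in, for example, a Fill Element or a Data Stream Element of the AAC encoded bitstream.
【００３６】
However, the sampling-frequency information stored in the header is set to half the sampling frequency of the acoustic data. That is, when the acoustic data are sampled at 44.1 kHz, the header records 22.05 kHz, half the actual value. Information indicating the actual sampling frequency of 44.1 kHz may be stored, for example, in the region that holds the auxiliary information.
【００３７】
The bitstream output from the encoding device 300 is transmitted to the decoding device 400 via a transmission medium such as the Internet, carried over radio waves, optical links, metal cables, or the like.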
【００３８】
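Thus, when quantizing and encoding the frequency-axis spectral data obtained by the conversion unit 320, the encoding device 300 separates them with the data separation unit 330 into low-band spectral data (1024 points) and high-band spectral data (1024 points); quantizes and encodes the low-band spectral data in the conventional manner; quantizes and encodes the high-band spectral data in a different manner (generating auxiliary information and encoding it); and embeds the high-band encoded bitstream in the low-band encoded bitstream for output. In these respects it differs greatly from the conventional encoding device 1000, which quantizes and encodes the spectral data in the same manner across the whole band.
【００３９】
As a result, a high-quality acoustic signal can be encoded without the total amount of information increasing greatly compared with the conventional scheme.
In addition, because the header records a sampling frequency of 22.05 kHz, the bitstream generated by the encoding device 300 of this embodiment can also be decoded by the conventional decoding device 2000.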
【００４０】
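(Decoding device 400)
The decoding device 400 according to this embodiment reproduces an acoustic signal on the time axis (upper reproduction frequency 22.05 kHz) by processing the bitstream output from the encoding device 300 in roughly the reverse order of the encoding device 300. It comprises a stream input unit 410, first and second decoding units 420 and 425, first and second inverse quantization units 430 and 435, an inverse-quantized data synthesis unit 440, an inverse conversion unit 480, an acoustic data output unit 490, and a D/A converter 495.
【００４１】
The stream input unit 410 receives the bitstream encoded by the encoding device 300 via the transmission medium. From it, the unit extracts the first encoded signal, stored in the region used by conventional decoding devices, and the second encoded signal, stored in a region that conventional decoding devices ignore or for which their operation is undefined, and outputs them to the first decoding unit 420 and the second decoding unit 425, respectively.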
【００４２】
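The first decoding unit 420 receives the first encoded signal output by the stream input unit 410 and decodes it from the stream format into quantized data. It comprises a Huffman decoding table 421 for this decoding.
【００４３】
The first inverse quantization unit 430 inverse-quantizes the quantized data decoded by the first decoding unit 420 and outputs spectral data. It comprises a processing unit 431 that inverse-quantizes the quantized data according to a formula. The first inverse quantization unit 430 outputs 1024 samples of spectral data, which represent the reproduction band at or below 11.025 kHz.
【００４４】
The second decoding unit 425 receives the second encoded signal output by the stream input unit 410 and decodes the auxiliary information. It comprises a Huffman decoding table 426 for decoding the auxiliary information.
【００４５】
The second inverse quantization unit 435 generates high-band spectral data based on the auxiliary information and comprises a spectral data generation unit 436. The second inverse quantization unit 435 outputs 1024 samples of spectral data, which represent the reproduction band above 11.025 kHz.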
【００４６】
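The spectral data generation unit 436 generates noise by a predetermined procedure from the spectral data output by the first inverse quantization unit 430, shapes that noise according to the auxiliary information output by the second decoding unit 425, and outputs the result as high-band spectral data. Besides white noise and pink noise, this "noise" includes spectral data obtained by copying part or all of the low-band spectral data.
【００４７】
Specifically, the spectral data generation unit 436, for example, first copies the low-band spectral data output by the first inverse quantization unit 430 into the high band. Then, for each high-band scale factor band, it takes as a coefficient the ratio between the absolute maximum of the spectral data copied into the band and the value obtained by inverse-quantizing the quantized value "1" with the scale factor given for that band in the auxiliary information, and multiplies each piece of spectral data in the band by this coefficient to restore the high-band spectrum.
【００４８】
The inverse-quantized data synthesis unit 440 combines the spectral data output by the first inverse quantization unit 430 with the spectral data output by the second inverse quantization unit 435. It outputs 2048 samples of spectral data, which represent the reproduction band of 0 to 22.05 kHz.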
【００４９】
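Thus, from the bitstream encoded by the encoding device 300, the decoding device 400 separates the first (low-band) encoded signal, stored in the region used by conventional decoding devices, and the second (high-band) encoded signal, stored in a region that conventional decoding devices ignore or for which their operation is undefined. It decodes and inverse-quantizes only the first (low-band) encoded signal in the conventional manner, decodes and inverse-quantizes the second (high-band) encoded signal in a different manner, and combines the low-band and high-band spectral data for output. In these respects its configuration differs greatly from the decoding device 2000 of Conventional Examples 1 and 2, which decodes and inverse-quantizes the bitstream in the same manner across the whole band.
【００５０】
As a result, a much larger amount of information can be decoded from nearly the same small amount of information as before, making high-quality decoding of the acoustic signal possible.
The inverse conversion unit 480 uses the IMDCT to convert the frequency-axis spectral data output from the inverse-quantized data synthesis unit 440 into 2048 samples (two frames) of acoustic data on the time axis.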
【００５１】
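The acoustic data output unit 490 sequentially combines the 2048-sample blocks of time-axis acoustic data obtained by the inverse conversion unit 480 and outputs them one block at a time in time order.
【００５２】
The D/A converter 495 converts the digital acoustic data into an analog acoustic signal at a sampling frequency of 44.1 kHz.
Thus, the configuration of the decoding device 400 differs greatly from the decoding device 2000 of Conventional Example 1 in that the inverse conversion unit of the inverse conversion unit 480 is doubled (to 2048 samples), the frame length of the acoustic data output unit 490 is doubled (to 2048 samples), and the sampling frequency of the D/A converter 495 is doubled (to 44.1 kHz).
【００５３】
As a result, the D/A converter 495 outputs a wideband (0 to 22.05 kHz), high-quality acoustic signal based on the low-band spectral data at or below 11.025 kHz (1024 pieces) and the high-band spectral data (1024 pieces).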
【００５４】
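As described above, with the functional configuration of this embodiment, the low band is encoded conventionally and the high band is encoded with a very small amount of information, so that a high-quality acoustic signal can be encoded and decoded with an amount of information that does not increase greatly compared with the conventional scheme.
【００５５】
Furthermore, the encoding device 300 and decoding device 400 of this embodiment merely add the data separation unit 330, second quantization unit 345, and second encoding unit 355 to the conventional encoding device 1000, and the second decoding unit 425, second inverse quantization unit 435, and inverse-quantized data synthesis unit 440 to the conventional decoding device 2000. They can therefore be realized without greatly changing the configuration of the existing encoding device 1000 and decoding device 2000.
【００５６】
In addition, the bitstream generated by the encoding device 300 of this embodiment can also be decoded by the conventional decoding device 2000.
Next, the encoding processing of each unit of the encoding device 300 in the broadcasting system 1 is described in detail.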
【００５７】
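FIG. 2 shows how the acoustic signal changes as it is processed by the acoustic data input unit 310 and the conversion unit 320 of the encoding device 300 shown in FIG. 1. FIG. 2(a) is a waveform diagram showing the 2048 pieces of sample data on the time axis cut out by the acoustic data input unit 310 shown in FIG. 1, and FIG. 2(b) is a waveform diagram showing the frequency-axis spectral data after the time-axis sample data have been converted by the MDCT 321 of the conversion unit 320 shown in FIG. 1. In FIG. 2(a) and FIG. 2(b) the sample data and spectral data are drawn as analog waveforms, but both are in fact digital signals. The same applies to the waveform diagrams that follow.
【００５８】
Acoustic data sampled at 44.1 kHz are input to the acoustic data input unit 310. Each time 2048 samples of this acoustic data have been input, the acoustic data input unit 310 cuts them out with 1024 overlapping samples before and after and outputs the result to the conversion unit 320.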
【００５９】
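The conversion unit 320 applies the MDCT to the total of 4096 samples. Because the spectrum obtained by the MDCT is mirror-symmetric, the unit outputs spectral data corresponding to half of it, 2048 samples, as shown in FIG. 2(b).
【００６０】
In the spectral data of FIG. 2(b), the vertical axis shows the value of the frequency spectrum, that is, the amount (magnitude) of each frequency component of the acoustic data represented in FIG. 2(a) by the voltage values of 2048 samples, at 2048 points corresponding to that number of samples. Because the acoustic signal input to the encoding device 300 is A/D-converted at a sampling frequency of 44.1 kHz, the reproduction band of the spectral data is 22.05 kHz. Furthermore, the spectrum obtained by the MDCT 321 can take negative values, as shown in FIG. 2(b), so when it is encoded the positive or negative sign of each spectral value must be encoded as well. Hereinafter, the information representing the positive or negative sign of the spectral data is called "sign information".
【００６１】
The spectral data and sign information output from the conversion unit 320 are separated by the data separation unit 330 into a low band of 0 to 11.025 kHz and a high band above 11.025 kHz. The low-band spectral data and related information are output to the first quantization unit 340, and the high-band spectral data and related information to the second quantization unit 345.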
【００６２】
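FIG. 3 is a flowchart showing the operation of the first quantization unit 340 shown in FIG. 1 in the scale factor determination process.
First, the first quantization unit 340 sets, as the initial scale factor value, a scale factor common to all scale factor bands (S91). Using it, the unit quantizes all the low-band spectral data to be transmitted as one frame (1024 samples) of acoustic data, obtains the differences between successive scale factors, and Huffman-encodes the differences, the first scale factor, and the quantized values (S92). Because this quantization and encoding serve only to count bits, they are performed on the data alone to simplify processing, without adding header or other information.
【００６３】
Next, the first quantization unit 340 determines whether the number of bits after Huffman encoding exceeds a predetermined number (S93). If it does, the unit lowers the initial scale factor value (S101), redoes the quantization and Huffman encoding of the same low-band spectral data with that value (S92), and again checks whether the one-frame low-band encoded data exceed the predetermined number of bits (S93), repeating until the count is at or below the limit.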
【００６４】
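Once the low-band encoded data no longer exceed the predetermined number of bits, the first quantization unit 340 repeats the following processing for each scale factor band to determine its scale factor (S94). First, it inverse-quantizes each quantized value in the band (S95), computes the difference in absolute value between each inverse-quantized value and the corresponding original spectral data, and sums the differences (S96). It then determines whether the sum is within the permissible range (S97); if so, it repeats the processing for the next scale factor band (S94 to S98).
【００６５】
If the sum exceeds the permissible range, the unit increases the scale factor and quantizes the spectral data of that band again (S100), inverse-quantizes the quantized values (S95), and sums the absolute differences between the inverse-quantized values and the corresponding spectral data (S96). It again checks whether the sum is within the permissible range (S97); while it is not, the unit keeps increasing the scale factor (S100) and repeating the above processing (S95 to S97 and S100).
【００６６】
When the first quantization unit 340 has determined, for every scale factor band, a scale factor for which the summed absolute difference between the inverse-quantized values and the original spectral data falls within the permissible range (S98), it quantizes the one-frame low-band spectral data again with the determined scale factors, Huffman-encodes the scale factor differences, the first scale factor, and the quantized values, and determines whether the low-band encoded data exceed the predetermined number of bits (S99). If they do, the unit lowers the initial scale factor value (S101) and repeats the per-band scale factor determination (S94 to S98) until the bit count is at or below the limit. If they do not (S99), the scale factor values at that point are adopted as the scale factors of the respective scale factor bands.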
【００６７】
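Using the scale factors determined in this way, the first quantization unit 340 quantizes the low-band spectral data and outputs the quantized values, the first of the determined scale factors, and the scale factor differences, together with the sign information received from the data separation unit 330, to the first encoding unit 350.
【００６８】
Whether the summed absolute difference between the inverse-quantized values and the original spectral data in a scale factor band is within the permissible range is judged on the basis of data such as a psychoacoustic model.
【００６９】
Here the initial scale factor value is set relatively high and lowered step by step whenever the Huffman-encoded low-band data exceed the predetermined number of bits, but this is not the only possible approach. For example, the initial value may instead be set low and increased gradually; at the point where the total number of bits of the low-band encoded data first exceeds the predetermined number, the scale factor of each band is determined using the initial value that was set immediately before.
【００７０】
Likewise, the scale factor of each band was determined here so that the total number of bits of the one-frame low-band encoded data does not exceed the predetermined number, but this is not essential either. For example, the scale factor of each band may be determined so that each quantized value in the band does not exceed a predetermined number of bits. The operation of the first quantization unit 340 in this case is described below with reference to FIG. 4.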
【００７１】
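FIG. 4 is a flowchart showing the operation of the first quantization unit 340 shown in FIG. 1 in another scale factor determination process.
The first quantization unit 340 calculates a scale factor by the following procedure for every low-band scale factor band to be encoded (S1), applying the procedure to every piece of spectral data in each scale factor band (S2).
【００７２】
First, the first quantization unit 340 quantizes the spectral data according to the formula with a predetermined scale factor value (S3) and determines whether the quantized value exceeds the predetermined number of bits allotted to a quantized value, for example 4 bits (S4).
【００７３】
If the quantized value exceeds 4 bits, the unit adjusts the scale factor value (S8) and quantizes the same spectral data with the adjusted value (S3). It then checks again whether the resulting quantized value exceeds 4 bits (S4), and repeats the scale factor adjustment (S8) and the quantization with the adjusted scale factor (S3) until the quantized value of that spectral data is 4 bits or less.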
【００７４】
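If the quantized value is 4 bits or less, the unit quantizes the next piece of spectral data with the predetermined scale factor value (S3).
When the quantized values of all the spectral data in one scale factor band are 4 bits or less (S5), the first quantization unit 340 adopts the scale factor value at that point as the scale factor of that band (S6).
【００７５】
When the first quantization unit 340 has determined the scale factor for every scale factor band (S7), it ends the process.
Through the above processing, one scale factor is determined for each low-band scale factor band to be encoded. Using these scale factors, the first quantization unit 340 quantizes the low-band spectral data and outputs the resulting 4-bit quantized values, the 8-bit first scale factor, and the scale factor differences, together with the sign information received from the data separation unit 330, to the first encoding unit 350.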
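The per-band search of FIG. 4 can be sketched as follows. The quantization "formula" is not given in the text, so `quantize` below is a hypothetical stand-in (a rounding quantizer with step size 2^(sf/4)); the direction of the adjustment in S8 (here, increasing the scale factor), the starting value, and the function names are likewise assumptions for illustration.

```python
def quantize(x, sf):
    # Hypothetical quantizer with step size 2**(sf/4). NOT the patent's formula.
    return int(abs(x) / 2 ** (sf / 4) + 0.5)


def scale_factor_for_band(band, max_bits=4, sf_start=0, sf_max=255):
    """Return the first scale factor for which every quantized value fits in max_bits."""
    limit = (1 << max_bits) - 1  # 15 for 4 bits
    sf = sf_start
    # S3/S4/S8: re-quantize with an adjusted scale factor until all values fit.
    while sf < sf_max and any(quantize(x, sf) > limit for x in band):
        sf += 1
    return sf  # S6: adopted as the scale factor of this band


band = [100.0, -240.0, 30.0]
sf = scale_factor_for_band(band)
quantized = [quantize(x, sf) for x in band]
```

Signs are dropped by `quantize` on purpose; in the text they travel separately as sign information.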
【００７６】
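The quantized values, scale factors, and related values output to the first encoding unit 350 are Huffman-encoded and output to the stream output unit 390 as a first encoded signal equivalent to that of the downsampling case.
【００７７】
Meanwhile, the second quantization unit 345 generates auxiliary information based on the high-band spectral data and related information.
FIG. 5 is a spectral waveform diagram showing a specific example of the auxiliary information (scale factors) generated by the second quantization unit 345 shown in FIG. 1, and FIG. 6 is a flowchart showing the operation of the second quantization unit 345 shown in FIG. 1 in the auxiliary information (scale factor) calculation process.
【００７８】
In FIG. 5, the divisions shown on the frequency axis in the low band indicate the boundaries of the scale factor bands defined in this embodiment, and the broken-line divisions in the frequency direction in the high band indicate the boundaries of the high-band scale factor bands defined in this embodiment. The same applies to the waveform diagrams that follow.
【００７９】
Of the spectral data output from the conversion unit 320, the low band at or below 11.025 kHz, drawn as a solid waveform in FIG. 5, is output to the first quantization unit 340 and quantized conventionally. The high band from above 11.025 kHz up to 22.05 kHz, drawn as a broken waveform in FIG. 5, is represented by the auxiliary information (scale factors) calculated by the second quantization unit 345.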
【００８０】
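The procedure by which the second quantization unit 345 calculates the auxiliary information (scale factors) is described below following the flowchart of FIG. 6, using the specific example of FIG. 5.
For every scale factor band in the high band from above 11.025 kHz up to 22.05 kHz, the second quantization unit 345 calculates, by the following procedure, the optimal scale factor that makes the quantized value of the absolute maximum spectral data in that band equal to "1" (S11).
【００８１】
The second quantization unit 345 identifies the absolute maximum spectral data (peak) in the first scale factor band of the high band above 11.025 kHz (S12). In the example of FIG. 5, suppose that the peak identified in the first scale factor band is at position ① and has the value "256".
【００８２】
In the same manner as the procedure in the flowchart of FIG. 4, the second quantization unit 345 applies the peak value "256" and an initial scale factor value to the quantization formula and calculates the scale factor sf for which the formula yields the quantized value "1" (S13). In this case, a scale factor that makes the quantized value of the peak "256" equal to "1", for example sf = 24, is obtained.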
【００８３】
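Once the scale factor sf = 24 that makes the quantized value of the peak equal to "1" has been obtained for the first scale factor band (S14), the second quantization unit 345 identifies the peak of the spectral data in the next scale factor band (S12). If, for example, the peak is at position ② and has the value "312", the unit calculates the scale factor that makes the quantized value of "312" equal to "1", for example sf = 32 (S13).
【００８４】
Similarly, for the third high-band scale factor band the second quantization unit 345 calculates the scale factor that makes the quantized value of peak ③ (value "288") equal to "1", for example sf = 26, and for the fourth band the scale factor that makes the quantized value of peak ④ (value "203") equal to "1", for example sf = 18.
【００８５】
When the scale factor that makes the quantized peak value equal to "1" has been calculated for every high-band scale factor band (S14), the second quantization unit 345 outputs the scale factors obtained for the respective bands to the second encoding unit 355 as the high-band auxiliary information and ends the process.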
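The loop of FIG. 6 (S11 to S14) can be sketched as below. Since the quantization formula is not stated in the text, `quantize` is a hypothetical rounding quantizer with step size 2^(sf/4), so the resulting scale factors will not reproduce the example values sf = 24, 32, 26, 18; `band_edges` (indices into the high-band array) and the function names are also assumptions.

```python
def quantize(x, sf):
    # Hypothetical quantizer with step size 2**(sf/4). NOT the patent's formula.
    return int(abs(x) / 2 ** (sf / 4) + 0.5)


def aux_scale_factors(high_band, band_edges, sf_max=255):
    """For each band, find a scale factor that quantizes the band's peak to 1."""
    scale_factors = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        peak = max(abs(x) for x in high_band[lo:hi])  # S12: absolute maximum
        sf = 0
        while sf < sf_max and quantize(peak, sf) > 1:  # S13: search for q == 1
            sf += 1
        scale_factors.append(sf)
    return scale_factors


# Two toy bands whose peaks are 256 and 312, as in the first two bands of FIG. 5.
high_band = [10.0, -256.0, 40.0, 3.0, 312.0, -5.0, 80.0, 1.0]
sfs = aux_scale_factors(high_band, [0, 4, 8])
```

As in the text, a larger peak leads to a larger scale factor, so the list of scale factors traces the coarse envelope of the high band.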
【００８６】
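The auxiliary information (scale factors) generated in this way by the second quantization unit 345 represents the high band, originally expressed by 1024 points of spectral data, with 8 bits for each high-band scale factor band (four in this example), provided each scale factor value is expressed in the range 0 to 255. Huffman-encoding the differences between these scale factors may reduce the data amount further. By contrast, if the 1024 points of high-band spectral data were quantized and Huffman-encoded conventionally, like the low band, the data amount is predicted to be at least about 300 bits. Thus, although this auxiliary information gives only one scale factor per high-band scale factor band, it greatly reduces the data amount compared with quantizing the high band conventionally.
【００８７】
These scale factors are roughly proportional to the peak value (absolute value) in each scale factor band. Spectral data obtained by multiplying the scale factors into spectral data that take a constant value over the 1024 high-band points, or into a copy of part or all of the low-band spectral data, can therefore be regarded as a rough restoration of the spectral data derived from the input acoustic signal. The spectral data can be restored more accurately by taking, for each scale factor band, the ratio between the absolute maximum of the spectral data copied into the band and the value obtained by inverse-quantizing the quantized value "1" with the scale factor for that band, and multiplying each piece of spectral data in the band by this coefficient. Moreover, differences in the high-band waveform are not perceived as clearly as those in the low band, so auxiliary information obtained in this way is sufficient as information representing the high-band waveform.
【００８８】
Although the scale factor was calculated here so that the quantized value of the spectral data in each high-band scale factor band becomes "1", the value need not be "1" and may be set to some other value.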
【００８９】
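The auxiliary information generated by the second quantization unit 345 is Huffman-encoded by the second encoding unit 355 and stored by the stream output unit 390, as the second encoded signal, in a region of the bitstream that conventional decoding devices ignore or for which their operation is undefined.
【００９０】
FIG. 7 shows the positions in the bitstream at which the stream output unit 390 shown in FIG. 1 stores the auxiliary information. In FIG. 7, the auxiliary information representing the high-band spectrum is encoded and then stored, as the second encoded signal, in a region of the bitstream that is not recognized as an encoded acoustic signal.
【００９１】
The hatched portion in FIG. 7(a) is, for example, a region filled with "0" to adjust the data length of the bitstream (Fill Element). Even if the auxiliary information representing the high-band spectrum, that is, the second encoded signal, is stored in this region, the conventional decoding device 2000 does not recognize it as an encoded signal to be decoded and ignores it.
【００９２】
The hatched portion in FIG. 7(b) is, for example, a region called the Data Stream Element (DSE), for which the AAC standard defines only the physical structure, such as the bit length, for future extension. As with the Fill Element, even if the auxiliary information representing the high-band spectrum is stored here, the conventional decoding device 2000 either ignores it or, if the data are read, has no defined operation for them.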
【００９３】
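Although the second encoded signal is stored above in a region of the bitstream that, under the MPEG-2 AAC standard, the conventional decoding device 2000 ignores, it may instead be embedded at a predetermined position in the header information, at a predetermined position in the first encoded signal, or across both. Nor is it necessary to reserve a contiguous region in either the header or the first encoded signal in order to store the second encoded signal. That is, as in FIG. 7(c), the second encoded signal may be embedded non-contiguously within the header information and the first encoded information.
【００９４】
FIG. 8 shows another example of how the stream output unit 390 shown in FIG. 1 stores the auxiliary information. FIG. 8(a) shows stream 1, in which only the first encoded signal is stored consecutively frame by frame. FIG. 8(b) shows stream 2, in which only the second encoded signal (the encoded auxiliary information) is stored consecutively, frame by frame, in correspondence with stream 1.
【００９５】
The stream output unit 390 may store the second encoded signal in stream 2, which is entirely separate from stream 1, the bitstream holding the first encoded signal. For example, stream 1 and stream 2 are bitstreams transmitted on different channels.
【００９６】
Transmitting the first and second encoded signals in entirely separate bitstreams has the advantage that the low band, which carries the basic information of the input acoustic signal, can be transmitted or stored in advance, and the high-band information can be added later as needed.
【００９７】
In the formats shown in FIG. 7 and FIG. 8, the sampling-frequency information stored in the header records 22.05 kHz, half the actual sampling frequency. This allows even the decoding device 2000 of Conventional Example 1 to decode the 0 to 11.025 kHz portion of the bitstream and reproduce it as in the downsampling case.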
【００９８】
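Next, the difference between the scheme of the encoding device 300 according to the embodiment and that of the encoding device 1000 of Conventional Example 1 is described with reference to FIG. 9. FIG. 9 compares the scheme of the embodiment with the scheme of Conventional Example 1 (the downsampling scheme): FIG. 9(a) shows the scheme of the embodiment and FIG. 9(b) the scheme of Conventional Example 1.
【００９９】
In the scheme of this embodiment, the sampling frequency is 44.1 kHz, so an acoustic data sequence with one sample every 22.7 μsec is acquired. The 2048 samples in the frame to be encoded and 1024 samples on each side, 4096 in total, are cut out and subjected to the MDCT to obtain 2048 pieces of spectral data, whose reproduction band extends to 22.05 kHz. These 2048 pieces are separated at 11.025 kHz into low-band spectral data (1024 pieces) and high-band spectral data (1024 pieces). The low-band spectral data (1024 pieces) undergo ordinary quantization and encoding, yielding a first encoded signal of high quality and low bit rate equivalent to downsampling. In addition, 1024 pieces of spectral data are obtained for the high band. Quantizing and encoding these in the ordinary way would preclude a low bit rate. In the scheme of this embodiment, therefore, auxiliary information is generated from the 1024 pieces of high-band spectral data, and only the auxiliary information is encoded to obtain the second encoded signal. A high-quality acoustic signal can thus be encoded without the total amount of information increasing greatly.
【０１００】
In the downsampling scheme of Conventional Example 1, by contrast, the sampling frequency is 22.05 kHz, so an acoustic data sequence with one sample every 45 μsec is acquired. The 1024 samples in the frame to be encoded and 512 samples on each side, 2048 in total, are cut out and subjected to the MDCT to obtain 1024 pieces of spectral data, whose reproduction band extends to 11.025 kHz. These 1024 pieces undergo ordinary quantization and encoding. Therefore, although a high-quality encoded signal is obtained at or below 11.025 kHz, there are no spectral data at all for the high band above 11.025 kHz, and no high-band encoded signal can be obtained.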
【０１０１】
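Next, the difference between the scheme of the encoding device 300 according to the embodiment and that of the encoding device 1000 of Conventional Example 2 is described with reference to FIG. 10.
FIG. 10 compares the scheme of the embodiment with the scheme of Conventional Example 2: FIG. 10(a) shows the scheme of the embodiment and FIG. 10(b) the scheme of Conventional Example 2. The scheme of this embodiment has been described above, so its description is omitted here.
【０１０２】
In the sampling scheme of Conventional Example 2, the sampling frequency is 44.1 kHz as in this embodiment, so an acoustic data sequence with one sample every 22.7 μsec is acquired; however, the 1024 samples in the frame to be encoded and 512 samples on each side, 2048 in total, are cut out and subjected to the MDCT to obtain 1024 pieces of spectral data. The reproduction band of these spectral data extends to 22.05 kHz, and the 1024 pieces undergo ordinary quantization and encoding. That is, 1024 pieces of spectral data (512 in the low band at or below 11.025 kHz and 512 in the high band above 11.025 kHz) are acquired at half the interval of this embodiment (about 22.7 msec).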
【０１０３】
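Now suppose that the encoding device 1000 of Conventional Example 2 generated auxiliary information from the high-band spectral data of 11.025 to 22.05 kHz in the same way as the embodiment. If n is the number of bits available for quantization in each period of about 22.7 msec and m1 is the number of bits available for auxiliary information, the 512 low-band samples (0 to 11.025 kHz) must be quantized with (n − m1) bits. In the embodiment, by contrast, if 2×n is the number of bits available for quantization in each period of about 45.4 msec and m2 is the number of bits available for auxiliary information, the 1024 low-band samples (0 to 11.025 kHz) need only be quantized with (2×n − m2) bits.
【０１０４】
It is generally known that AAC does not achieve high coding efficiency unless at least a certain number of samples (a threshold) are gathered. The 512 samples of Conventional Example 2 fall short of this threshold, whereas the 1024 samples of this embodiment comfortably exceed it.
【０１０５】
Therefore, quantizing 1024 samples with (2×n − m2) bits, as in this embodiment, gives higher coding efficiency than quantizing 512 samples with (n − m1) bits, as in Conventional Example 2. Because this embodiment achieves higher coding efficiency, m2 can be made larger (m2 > 2×m1), and the sound quality of the high band can be improved.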
【０１０６】
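FIG. 11 compares the spectral data and characteristics of the encoding scheme of this embodiment with those of Conventional Examples 1 and 2.
In this embodiment, the sampling frequency is 44.1 kHz and the frame length is 2048. The spectral data therefore comprise 1024 pieces in the low band of 0 to 11.025 kHz plus auxiliary information based on the 1024 pieces in the high band of 11.025 to 22.05 kHz. In terms of bandwidth, this embodiment is thus roughly equal to Conventional Example 2 and wider than Conventional Example 1. In terms of sound quality, compared with Conventional Example 1 the low band of 0 to 11.025 kHz is of the same quality, and the high band of 11.025 to 22.05 kHz is covered by the auxiliary information, so the overall quality is higher. Compared with Conventional Example 2, the high band of 11.025 to 22.05 kHz achieves roughly the same quality thanks to the auxiliary information, and the low band of 0 to 11.025 kHz is of higher quality because it has twice as many pieces of spectral data, so the overall quality is again higher.
【０１０７】
In Conventional Example 1, by contrast, the sampling frequency is 22.05 kHz and the frame length is 1024, and 1024 pieces of spectral data are acquired in the band of 0 to 11.025 kHz. Compared with this embodiment, the bandwidth of Conventional Example 1 is halved. In terms of sound quality, the low band of 0 to 11.025 kHz is of the same quality, but the high band of 11.025 to 22.05 kHz has no spectral data at all, so its quality is poor and the overall quality is lower.
【０１０８】
In Conventional Example 2, the sampling frequency is 44.1 kHz and the frame length is 1024, and 1024 pieces of spectral data are acquired over the whole band of 0 to 22.05 kHz. Compared with this embodiment, the bandwidth is the same. In terms of sound quality, the high band of 11.025 to 22.05 kHz is of high quality because its spectral data are encoded, but the low band of 0 to 11.025 kHz has only half as many pieces of spectral data, so its quality falls and the overall quality is lower.
【０１０９】
Therefore, according to this embodiment, by encoding the low band conventionally and the high band with a very small amount of information, a high-quality acoustic signal can be encoded without the total amount of information increasing greatly compared with the conventional schemes.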
【０１１０】
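Next, the decoding processing of each unit of the decoding device 400 in the broadcasting system 1 is described in detail.
The first encoded signal output from the stream input unit 410 is decoded into quantized data and related values by the first decoding unit 420 and inverse-quantized into low-band spectral data by the first inverse quantization unit 430. Meanwhile, the second encoded signal output from the stream input unit 410 is decoded into auxiliary information by the second decoding unit 425. The second inverse quantization unit 435 generates high-band spectral data based on the auxiliary information. The processing in the second inverse quantization unit 435 is described in detail below.
【０１１１】
FIG. 12 is a flowchart showing the procedure by which the second inverse quantization unit 435 shown in FIG. 1 copies the 1024 low-band spectral values into the high band in the forward direction. This copying of the low-band spectral data is performed when generating the high-band spectral data.
【０１１２】
In FIG. 12, inv_spec1[i] denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 430, and inv_spec2[j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 435.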
【０１１３】
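First, in order to input the 0th through 1023rd spectra in the same direction, the second inverse quantization unit 435 sets the initial values of the counters i and j, which count the spectra, to "0" (S71). It then checks whether the counter i is less than "1024" (S72). If it is, the unit inputs the value of the i-th low-band spectrum of the first inverse quantization unit 430 (here, the 0th) as the value of the j-th high-band spectrum of the second inverse quantization unit 435 (here, the 0th) (S73). It then increments the counters i and j by "1" each (S74) and again checks whether i is less than "1024" (S72).
【０１１４】
The second inverse quantization unit 435 repeats the above processing while the counter i is less than "1024" and ends the process when i reaches "1024" or more.
【０１１５】
As a result, all of the 0th through 1023rd low-band spectra produced by the first inverse quantization unit 430 are copied unchanged as the high-band spectra of the second inverse quantization unit 435.
【０１１６】
The amplitude of the spectral data copied in this way is adjusted according to the auxiliary information decoded by the second decoding unit 425, that is, the scale factor values that make the peak value "1", and the result is output as the high-band spectral data. The amplitude is adjusted by taking, for each scale factor band, the ratio between the absolute maximum of the spectral data copied into the band and the value obtained by inverse-quantizing the quantized value "1" with the scale factor for that band, and multiplying each piece of spectral data in the band by this coefficient. The second inverse quantization unit 435 outputs at most 1024 samples of spectral data, which represent the reproduction band above 11.025 kHz.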
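The copy-and-scale reconstruction can be sketched as follows. The inverse quantization formula is not given in the text, so `dequantize` is a hypothetical stand-in (value × 2^(sf/4)); the gain is taken here as target peak divided by copied peak so that each band's peak lands on the dequantized "1", which is one reading of the ratio described above. `band_edges` and the function names are assumptions.

```python
def dequantize(q, sf):
    # Hypothetical inverse quantizer (step size 2**(sf/4)). NOT the patent's formula.
    return q * 2 ** (sf / 4)


def rebuild_high_band(low_band, band_edges, scale_factors):
    """Copy the low band forward (FIG. 12) and rescale each band to its coded peak."""
    high = list(low_band)  # forward copy of the low-band spectrum
    bands = zip(band_edges[:-1], band_edges[1:])
    for (lo, hi), sf in zip(bands, scale_factors):
        peak = max(abs(x) for x in high[lo:hi])
        if peak == 0:
            continue  # nothing to scale in an all-zero band
        gain = dequantize(1, sf) / peak  # assumed direction: target peak / copied peak
        for i in range(lo, hi):
            high[i] *= gain
    return high


low_band = [1.0, -2.0, 4.0, 0.5, 3.0, 3.0, -6.0, 1.0]
high_band = rebuild_high_band(low_band, [0, 4, 8], [8, 12])
```

The fine structure of each high-band group comes from the copied low band, while its level comes from the auxiliary information alone, which is why so few bits suffice.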
【０１１７】
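FIG. 12 showed the procedure for copying the 1024 low-band spectra into the high band in the forward direction along the frequency axis, but they may instead be copied in the reverse direction, as shown in FIG. 13.
【０１１８】
FIG. 13 is a flowchart showing the procedure by which the second inverse quantization unit 435 shown in FIG. 1 copies the 1024 low-band spectra into the high band in the reverse direction along the frequency axis. As in FIG. 12, inv_spec1[i] in FIG. 13 denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 430, and inv_spec2[j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 435.
【０１１９】
First, in order to input the 0th through 1023rd spectra in the reverse direction, the second inverse quantization unit 435 sets the initial value of the counter i to "0" and that of j to "1023" (S81). It then checks whether the counter i is less than "1024" (S82). If it is, the unit inputs the value of the i-th low-band spectrum of the first inverse quantization unit 430 (here, the 0th) as the value of the j-th high-band spectrum of the second inverse quantization unit 435 (here, the 1023rd) (S83). It then increments i by "1" and decrements j by "1" (S84), and again checks whether i is less than "1024" (S82).
【０１２０】
The second inverse quantization unit 435 repeats the above processing while the counter i is less than "1024" and ends the process when i reaches "1024" or more.
【０１２１】
As a result, all of the 0th through 1023rd low-band spectra produced by the first inverse quantization unit 430 are copied in the reverse direction as the 1023rd through 0th high-band spectra of the second inverse quantization unit 435.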
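The two copy loops of FIG. 12 and FIG. 13 can be written out directly, keeping the counter names i and j and the array names inv_spec1 and inv_spec2 from the flowcharts. The function names and the small array size used in the example are assumptions for illustration; the real arrays hold 1024 values.

```python
def copy_forward(inv_spec1):
    """FIG. 12: copy low-band index i to high-band index j = i (S71 to S74)."""
    n = len(inv_spec1)
    inv_spec2 = [0.0] * n
    i, j = 0, 0                      # S71
    while i < n:                     # S72
        inv_spec2[j] = inv_spec1[i]  # S73
        i += 1                       # S74
        j += 1
    return inv_spec2


def copy_reverse(inv_spec1):
    """FIG. 13: copy low-band index i to high-band index j = n - 1 - i (S81 to S84)."""
    n = len(inv_spec1)
    inv_spec2 = [0.0] * n
    i, j = 0, n - 1                  # S81
    while i < n:                     # S82
        inv_spec2[j] = inv_spec1[i]  # S83
        i += 1                       # S84
        j -= 1
    return inv_spec2


low = [0.5, -1.0, 2.0, 4.0]
forward = copy_forward(low)
mirrored = copy_reverse(low)
```

In idiomatic Python these reduce to `list(low)` and `low[::-1]`; the explicit loops are kept only to mirror the flowchart steps.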
【０１２２】
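In this case as well, the amplitude of the copied spectral data is adjusted according to the auxiliary information decoded by the second decoding unit 425, that is, the scale factor values that make the peak value "1", and the result is output as the high-band spectral data. The amplitude is adjusted by taking, for each scale factor band, the ratio between the absolute maximum of the spectral data copied into the band and the value obtained by inverse-quantizing the quantized value "1" with the scale factor for that band, and multiplying each piece of spectral data in the band by this coefficient. The second inverse quantization unit 435 outputs at most 1024 samples of spectral data, which represent the reproduction band above 11.025 kHz.
【０１２３】
Although the second inverse quantization unit 435 copied all of the low-band spectral data into the high band, only part may be copied.
Also, FIG. 12 and FIG. 13 were given as examples of copying the whole band at once, but part may be copied as in FIG. 12 and part as in FIG. 13.
【０１２４】
Part or all of the data may also be copied with the sign inverted. Furthermore, these copy procedures may be fixed in advance, changed according to the low-band data, or transmitted as auxiliary information.
【０１２５】
Although the low-band spectral data are copied here as the high-band spectral data, the high-band spectral data may instead be generated from the second encoded information alone.
【０１２６】
Furthermore, this embodiment has mainly described copying the spectral data obtained from the first inverse quantization unit 430 as the noise generation in the second inverse quantization unit 435, but the invention is not limited to this. Spectral data with a constant value within each high-band scale factor band, white noise, pink noise, or the like may be generated by the second inverse quantization unit 435 on its own or according to the auxiliary information.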
【０１２７】
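The 1024 pieces of spectral data output from the second inverse quantization unit 435 are combined in the inverse-quantized data synthesis unit 440 with the spectral data (1024 pieces) output from the first inverse quantization unit 430, inverse-converted into time-axis acoustic data by the IMDCT, and D/A-converted at a sampling frequency of 44.1 kHz to reproduce an audio signal with a reproduction band of 0 to 22.05 kHz.
【０１２８】
As described above, the present invention uses an MDCT and IMDCT with twice the conventional transform length. Of the 2048 samples of spectral data, only the first 1024 samples are encoded conventionally, the remaining 1024 samples are encoded with less information than before, and the two are combined at decoding time.
【０１２９】
Reducing the amount of information needed to encode the latter 1024 samples of spectral data allows more information to be spent on encoding the first 1024 samples, so that wideband encoding is achieved while the accuracy of the low band with respect to the original signal is improved.
【０１３０】
The bitstream generated by the encoding device of this embodiment can also be decoded by a conventional decoding device.
Next, various modifications of the auxiliary information and its decoding are described.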
【０１３１】
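FIG. 14 is a spectral waveform diagram showing a specific example of other auxiliary information (quantized values) generated by the second quantization unit 345 shown in FIG. 1. FIG. 15 is a flowchart showing the operation of the second quantization unit 345 shown in FIG. 1 in calculating this other auxiliary information (quantized values).
【０１３２】
The second quantization unit 345 fixes in advance a scale factor value common to all scale factor bands of the high band from above 11.025 kHz up to 22.05 kHz, for example "18", and uses this value to calculate, for each scale factor band, the quantized value of the absolute maximum spectral data (peak) in that band (S21).
【０１３３】
The second quantization unit 345 identifies the absolute maximum spectral data (peak) in the first scale factor band of the high band above 11.025 kHz (S22). In the example of FIG. 14, suppose that the peak identified in the first scale factor band is at position ① and has the value "256".
【０１３４】
The second quantization unit 345 applies the predetermined common scale factor value "18" and the peak value "256" to the quantization formula and calculates the quantized value (S23). In this case, for example, quantizing the peak value "256" with the scale factor value "18" yields the quantized value "6".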
【０１３５】
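Once the quantized value "6" of the peak value "256" has been obtained for the first scale factor band (S24), the second quantization unit 345 identifies the peak of the spectral data in the next scale factor band (S22). If, for example, the peak is at position ② and has the value "312", the unit calculates the quantized value of "312" with the scale factor set to "18", for example "10" (S23).
【０１３６】
Similarly, for the third high-band scale factor band the second quantization unit 345 calculates the quantized value "9" of peak ③ (value "288") with the scale factor set to "18", and for the fourth band the quantized value "5" of peak ④ (value "203") with the scale factor set to "18".
【０１３７】
When the quantized value of the peak with the scale factor fixed at "18" has been calculated for every high-band scale factor band (S24), the second quantization unit 345 outputs the quantized values obtained for the respective bands to the second encoding unit 355 as the high-band auxiliary information and ends the process.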
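The fixed-scale-factor variant of FIG. 15 (S21 to S24) can be sketched as below. `quantize` is again a hypothetical stand-in with step size 2^(sf/4), since the actual formula is not stated; the outputs therefore do not match the example values 6, 10, 9, and 5. `band_edges` and the function names are assumptions.

```python
def quantize(x, sf):
    # Hypothetical quantizer with step size 2**(sf/4). NOT the patent's formula.
    return int(abs(x) / 2 ** (sf / 4) + 0.5)


def aux_peak_values(high_band, band_edges, common_sf=18):
    """Quantize each band's absolute peak with one scale factor shared by all bands."""
    values = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        peak = max(abs(x) for x in high_band[lo:hi])  # S22: find the peak
        values.append(quantize(peak, common_sf))      # S23: quantize it
    return values


# Four toy bands with the peak values used in FIG. 14.
high_band = [256.0, 3.0, -312.0, 7.0, 1.0, 288.0, -203.0, 20.0]
aux = aux_peak_values(high_band, [0, 2, 4, 6, 8])
```

Because the scale factor is agreed in advance, only the short quantized values need to be sent, and larger peaks still map to larger values.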
【０１３８】
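The auxiliary information (quantized values) generated in this way by the second quantization unit 345 represents the high band, originally expressed by 1024 points of spectral data, with a 4-bit quantized value for each of the four scale factor bands. The auxiliary information (scale factors) described earlier represented the high band with an 8-bit scale factor for each of the four bands, so the high-band data amount is reduced further here. These quantized values roughly represent the amplitude of the peak value (absolute value) in each scale factor band. Even spectral data obtained simply by multiplying them into spectral data that take a constant value over the 1024 high-band points, or into a copy of part or all of the low-band spectral data, can be regarded as a rough restoration of the spectral data derived from the input acoustic signal. The spectral data can be restored still more accurately by taking, for each scale factor band, the ratio between the absolute maximum of the spectral data copied into the band and the value obtained by inverse-quantizing the quantized value for that band with the predetermined scale factor, and multiplying each piece of spectral data in the band by this coefficient.
【０１３９】
In this embodiment the scale factor value corresponding to the quantized values transmitted as the second encoded information is fixed in advance, but an optimal scale factor value may instead be calculated and added to the second encoded information for transmission. For example, if the scale factor is chosen so that the maximum quantized value is 7, each quantized value needs only 3 bits, so less information is needed to transmit the quantized values.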
【０１４０】
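FIG. 16 is a spectral waveform diagram showing a specific example of other auxiliary information (position information) generated by the second quantization unit 345 shown in FIG. 1. FIG. 17 is a flowchart showing the operation of the second quantization unit 345 shown in FIG. 1 in calculating this other auxiliary information (position information).
【０１４１】
For every scale factor band in the high band from above 11.025 kHz up to 22.05 kHz, the second quantization unit 345 identifies the position of the absolute maximum spectral data in that band by the following procedure (S31).
【０１４２】
The second quantization unit 345 identifies the absolute maximum spectral data (peak) in the first scale factor band of the high band above 11.025 kHz (S32). In the example of FIG. 16, suppose that the peak identified in the first scale factor band is at position ①, the 22nd piece of spectral data from the start of the band. The second quantization unit 345 holds this peak position, "the 22nd piece of spectral data from the start of the scale factor band" (S33).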
【０１４３】
最初のスケールファクターバンドについて、ピークの位置が特定され、保持されると（Ｓ３４）、第２の量子化部３４５は、次のスケールファクターバンドについて、スペクトルデータのピークを特定する（Ｓ３２）。例えば、特定されたピークの位置が▲２▼で、スケールファクターバンドの先頭から６０番目のスペクトルデータであったとする。第２の量子化部３４５は、特定されたピークの位置「スケールファクターバンドの先頭から６０番目のスペクトルデータ」を保持する（Ｓ３３）。
【０１４４】
以下、同様にして、第２の量子化部３４５は、高域部における３番目のスケールファクターバンドについて、ピーク▲３▼の位置「スケールファクターバンドの先頭のスペクトルデータ」を特定して保持するとともに、４番目のスケールファクターバンドについて、ピーク▲４▼の位置「スケールファクターバンドの先頭から２５番目のスペクトルデータ」を特定して保持する。
【０１４５】
このようにして、高域部のすべてのスケールファクターバンドについて、ピークの位置が特定され、保持されると（Ｓ３４）、第２の量子化部３４５は、保持していた各スケールファクターバンドのピークの位置を、高域部の補助情報として第２の符号化部３５５に出力し、処理を終了する。
【０１４６】
以上のようにして第２の量子化部３４５によって補助情報（位置情報）が生成されるが、この補助情報（位置情報）は、１０２４点のスペクトルデータで表されていた高域部を、４つのスケールファクターバンドにつき、それぞれ６ビットの位置情報で表している。
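Steps S31 to S34 amount to finding the largest-magnitude coefficient in each band and recording its offset. A minimal sketch follows, assuming 0-based offsets (so "the 22nd item from the head" is offset 21) and a simple fixed-width packing; the function names and the packing layout are illustrative assumptions, not the bitstream format of the patent.

```python
def peak_positions(high_band, band_sizes):
    """Return, per band, the 0-based offset of the largest-magnitude coefficient."""
    positions = []
    start = 0
    for size in band_sizes:
        band = high_band[start:start + size]
        offset = max(range(size), key=lambda i: abs(band[i]))  # S32: locate the peak
        positions.append(offset)                               # S33: hold its position
        start += size
    return positions


def pack_positions(positions, bits=6):
    """Pack the offsets into one integer, using `bits` bits per band."""
    packed = 0
    for p in positions:
        if not 0 <= p < (1 << bits):
            raise ValueError("offset does not fit in the field")
        packed = (packed << bits) | p
    return packed
```

With 6 bits per band, offsets 0 to 63 can be expressed, which covers the positions quoted in the example.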
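[0147]
In this case, in the decoding device 400, the second inverse quantization unit 435 copies part or all of the 1024 samples of low-frequency spectrum data as the 1024 samples of high-frequency data, in accordance with the auxiliary information (position information) input from the second decoding unit 425. The copy is performed by extracting, from the spectrum data output by the first inverse quantization unit 430, data that is similar on the basis of the peak information of the spectrum data in one or more scale factor bands, and copying part or all of it. The second inverse quantization unit 435 also adjusts the amplitude of the copied spectrum data as necessary. The amplitude is adjusted by multiplying each spectrum data item by a predetermined coefficient, for example "0.5". This coefficient may be a fixed value, may be changed for each band or each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 430.
[0148]
Although a predetermined coefficient is used above, the value of this coefficient may be added to the second encoded information as auxiliary information. Alternatively, a scale factor value may be added to the second encoded information as the coefficient, or the quantized value of the peak in the scale factor band may be added to the second encoded information as the coefficient. The amplitude adjustment method is not limited to these, and other methods may be used.
[0149]
In this embodiment, only the position information, or only the position information and the coefficient information, is encoded as the auxiliary information. However, the auxiliary information is not limited to these: scale factors, quantized values, spectrum sign information, a noise generation method and the like may also be encoded, and two or more of these may be encoded in combination.
[0150]
Also, although the low-frequency spectrum data is copied as the high-frequency spectrum data, the invention is not limited to this, and the high-frequency spectrum data may be generated from the second encoded information alone.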
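[0151]
FIG. 18 is a spectrum waveform diagram showing a specific example of other auxiliary information (sign information) generated by the second quantization unit 345 shown in FIG. 1. FIG. 19 is a flowchart showing the operation of the second quantization unit 345 shown in FIG. 1 when it calculates this other auxiliary information (sign information).
[0152]
For all scale factor bands in the high-frequency part, which extends from above the reproduction band of 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantization unit 345 identifies the sign information of the spectrum data at a predetermined position in each scale factor band, for example the center of the scale factor band, according to the following procedure (S41).
[0153]
The second quantization unit 345 examines the sign information of the spectrum data at the center position of the first scale factor band of the high-frequency part above 11.025 kHz (S42) and holds its value. For example, the sign of the spectrum data at the center position of the first scale factor band is "+". The second quantization unit 345 holds this sign "+" as the 1-bit value "1". If the sign is "−", it is held as "0".
[0154]
When the sign information of the spectrum data at the center position has been held for the first scale factor band (S43), the second quantization unit 345 examines the sign of the spectrum data at the center position of the next scale factor band (S42). If, for example, the examined sign is "+", the second quantization unit 345 holds "1" as the sign information of the spectrum data at the center position of the second scale factor band.
[0155]
In the same way, the second quantization unit 345 examines the sign "+" of the spectrum data at the center position of the third scale factor band in the high-frequency part and holds its sign information "1", and examines the sign "+" of the spectrum data at the center position of the fourth scale factor band and holds its sign information "1".
[0156]
When the sign information of the spectrum data at the center position has thus been held for all scale factor bands in the high-frequency part (S43), the second quantization unit 345 outputs the held sign information of each scale factor band to the second encoding unit 355 as the auxiliary information for the high-frequency part, and ends the process.
[0157]
The auxiliary information (sign information) is generated by the second quantization unit 345 as described above. This auxiliary information represents the high-frequency part, which was originally expressed by 1024 points of spectrum data, by 1 bit of sign information for each of four scale factor bands, so the high-frequency spectrum can be expressed with a very short data length.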
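Steps S41 to S43 reduce each high-frequency band to a single bit. A minimal sketch, assuming that the "center" of a band of `size` coefficients is index `size // 2` and that a zero coefficient counts as "+"; both choices and the function name are assumptions made for illustration.

```python
def center_sign_bits(high_band, band_sizes):
    """Return one sign bit per band: 1 for "+", 0 for "-", taken at the band center."""
    bits = []
    start = 0
    for size in band_sizes:
        center = high_band[start + size // 2]  # S42: coefficient at the center position
        bits.append(1 if center >= 0 else 0)   # "+" is held as 1, "-" as 0
        start += size
    return bits
```

For four bands this yields four bits in total, which is the whole of the auxiliary information in this variant.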
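[0158]
In this case, in the decoding device 400, the second inverse quantization unit 435 copies part or all of the 1024 samples of low-frequency spectrum data as the high-frequency spectrum, and determines the sign of the spectrum data at the predetermined position according to the sign information input from the second decoding unit 425.
[0159]
Here, the sign information representing the sign at the center position of each scale factor band in the high-frequency part is used as the auxiliary information (sign information). However, the position is not limited to the center of the scale factor band; for example, it may be the sign information at each peak position, the sign information at the head of the scale factor band, or that at any other predetermined position.
[0160]
Also, although the position of the spectrum data corresponding to the transmitted sign (sign information) is predetermined here, it may be changed according to the output of the first inverse quantization unit 430, or position information indicating which position the sign information of each scale factor band refers to may be added to the second encoded information and transmitted.
[0161]
The second inverse quantization unit 435 also adjusts the amplitude of the copied spectrum data as necessary. The amplitude can be adjusted by multiplying each spectrum data item by a predetermined coefficient, for example "0.5".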
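On the decoding side, the copy, the sign correction and the amplitude adjustment can be sketched together as follows. The sketch assumes that the low-frequency data is copied from its beginning, that the predetermined position is the band center (index `size // 2`), and that the coefficient is the fixed value 0.5; the function name and these choices are assumptions for illustration.

```python
def rebuild_high_band(low_band, band_sizes, sign_bits, coefficient=0.5):
    """Copy low-frequency data, scale it, and force the sign at each band center."""
    total = sum(band_sizes)
    high = [x * coefficient for x in low_band[:total]]  # copy and adjust amplitude
    start = 0
    for size, bit in zip(band_sizes, sign_bits):
        i = start + size // 2
        magnitude = abs(high[i])
        high[i] = magnitude if bit == 1 else -magnitude  # apply the transmitted sign
        start += size
    return high
```

Only the coefficient at the predetermined position is guaranteed to carry the original sign; all other signs are inherited from the copied low-frequency data.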
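[0162]
This coefficient may be a fixed value, may be changed for each band or each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 430. The amplitude adjustment method is not limited to this, and other methods may be used.
[0163]
Although a predetermined coefficient is used in this embodiment, the value of the coefficient may be added to the second encoded information as auxiliary information. A scale factor value may be added to the second encoded information as the coefficient, or a quantized value may be added to the second encoded information as the coefficient.
[0164]
Furthermore, only the sign information, only the sign information and coefficient information, only the sign information and position information, or only the sign information, position information and coefficient information is encoded as the auxiliary information. However, the auxiliary information is not limited to these: quantized values, scale factors, position information of characteristic spectra, a noise generation method and the like may also be encoded, and two or more of these may be encoded in combination.
[0165]
Also, in this embodiment the low-frequency spectrum data is copied as the high-frequency spectrum data, but the invention is not limited to this, and the high-frequency spectrum data may be generated from the second encoded information alone.
[0166]
Also, in the above, the sign "+" is represented by the 1-bit value "1" and the sign "−" by "0", but the representation of the sign in the auxiliary information (sign information) is not limited to this, and other values may be used.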
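[0167]
FIG. 20 is a spectrum waveform diagram showing an example of a method of creating other auxiliary information (copy information) generated by the second quantization unit 345 shown in FIG. 1. FIG. 20(a) is a waveform diagram showing the spectrum in the first scale factor band of the high-frequency part. FIG. 20(b) is a waveform diagram showing an example of the low-frequency spectrum waveform specified by the auxiliary information (copy information). FIG. 21 is a flowchart showing the operation of the second quantization unit 345 shown in FIG. 1 when it calculates this other auxiliary information (copy information).
[0168]
For every scale factor band in the high-frequency part, which extends from above the reproduction band of 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantization unit 345 takes the position n of the peak counted from the head of that scale factor band (the n-th item from the head) and identifies, according to the following procedure, the number N of the scale factor band in the low-frequency part for which the position of a peak relative to the head of the scale factor band is closest to n (S51).
[0169]
The second quantization unit 345 identifies the position n of the spectrum data having the maximum absolute value (the peak) in the first scale factor band of the high-frequency part above 11.025 kHz (S52). Suppose that, as a result, the identified peak is at position ① as shown in FIG. 20(a), and that it is the spectrum data item with n = 22 in this scale factor band.
[0170]
The second quantization unit 345 identifies the positions of all peaks (both positive and negative) of the spectrum in the low-frequency part, where the frequency is at or below the reproduction band of 11.025 kHz (S53).
[0171]
Next, for all peaks identified in the low-frequency part, the second quantization unit 345 searches for the scale factor band for which the distance from the peak to the head of the scale factor band is closest to n, and identifies the number N of that scale factor band, the search direction, and the sign information of the peak (S54).
[0172]
Specifically, for every identified peak (both positive and negative), starting from the peak on the low-frequency side, the second quantization unit 345 searches for the head of a scale factor band whose distance from that peak is closest to n. There are two search directions: searching from the peak toward lower frequencies (1), and searching from the peak toward higher frequencies (2). For low-frequency peaks whose sign is the opposite of that of the high-frequency peak, there are likewise two cases: searching from the peak toward lower frequencies (3), and searching from the peak toward higher frequencies (4).
[0173]
Of these, when the search direction is (2) or (4) and the low-frequency spectrum waveform is copied on the basis of this peak information, the copied waveform is, as shown in FIG. 20(b), one in which the position of the low-frequency peak is mirrored left to right (along the frequency axis) within the scale factor band relative to the position of the high-frequency peak. It is therefore necessary to attach information indicating whether the search direction is forward or reverse, for example by treating (1) and (3) as the forward direction and (2) and (4) as the reverse direction. When the search direction is (3) or (4), the copied waveform is, as shown in FIG. 20(b), inverted vertically (along the vertical axis) relative to the high-frequency peak, so it is necessary to attach information indicating whether the signs of the high-frequency peak value and the low-frequency peak value are opposite.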
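[0174]
If a peak identified in the low-frequency part takes a positive value, the second quantization unit 345 searches in directions (1) and (2); if it takes a negative value, it searches in directions (3) and (4). It thus searches in four directions in total and, from the search results, identifies the number of the scale factor band whose distance from the peak is closest to n. Here, an allowable error relative to n is set in advance to a predetermined value, for example "5", and from the four kinds of search results the scale factor band whose distance from the peak is closest to n is selected and its number N is identified. At the same time, the unit identifies sign information indicating whether the signs of the high-frequency peak value and the low-frequency peak value are opposite, and information indicating whether the search direction is forward or reverse.
[0175]
For example, suppose that in search direction (1), for a low-frequency spectrum such as that shown in (1) of FIG. 20(b), the scale factor band number N = 3 is identified with a position error of "1" from the peak. Suppose also that in search direction (2), for a low-frequency spectrum such as that shown in (2) of FIG. 20(b), the scale factor band number N = 18 is identified with a position error of "5" from the peak; that in search direction (3), for a low-frequency spectrum such as that shown in (3) of FIG. 20(b), the scale factor band number N = 12 is identified with an error of "4"; and that in search direction (4), for a low-frequency spectrum such as that shown in (4) of FIG. 20(b), the scale factor band number N = 10 is identified with an error of "2". From the four identified scale factor band numbers, the second quantization unit 345 selects N = 3, for which the position error from the peak is "1" and the distance from the peak is therefore closest to n. Together with this, it generates the sign information "1", representing the sign "+" of the low-frequency peak, and the search direction information "1", representing that the search proceeded from the peak toward lower frequencies. If the sign of the peak were "−", the sign information would be "0", and if the search had proceeded from the peak toward higher frequencies, the search direction information would be "0".
[0176]
When the scale factor band number N = 3, the sign information "1" and the search direction information "1" have been identified for the first scale factor band in the high-frequency part (S55), the second quantization unit 345 identifies, in the same way, the scale factor band number N, its sign information and its search direction information for the next scale factor band.
[0177]
When, for every scale factor band in the high-frequency part, the number N of the low-frequency scale factor band whose peak position from the band head is closest to the peak position n from the head of that high-frequency band has thus been identified together with its sign information and search direction information (S55), the second quantization unit 345 outputs the low-frequency scale factor band number N, the sign information and the search direction information corresponding to each high-frequency scale factor band to the second encoding unit 355 as the auxiliary information (copy information) for the high-frequency part, and ends the process.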
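The search in S51 to S55 can be approximated by the following sketch. It is deliberately simplified and rests on several assumptions: only the largest-magnitude peak of each low-frequency band is considered (the text considers all peaks), the forward case compares the peak's offset from the band head with n, the reverse case compares its offset from the band tail, and the sign bit follows the "+" is 1, "−" is 0 convention of the example. The function name and the `max_error` parameter are hypothetical.

```python
def copy_info_by_peak(high_band, low, low_band_sizes, max_error=5):
    """Return (N, sign_bit, direction_bit) for one high band, or None if no band is close enough.

    N is the 1-based number of the chosen low-frequency band.
    """
    n = max(range(len(high_band)), key=lambda i: abs(high_band[i]))  # S52: peak offset n
    best = None
    start = 0
    for number, size in enumerate(low_band_sizes, start=1):
        band = low[start:start + size]
        start += size
        p = max(range(size), key=lambda i: abs(band[i]))  # largest peak of this low band
        sign_bit = 1 if band[p] >= 0 else 0               # "+" is 1, "-" is 0
        # direction 1: offset measured from the band head; direction 0: from the band tail
        for direction_bit, offset in ((1, p), (0, size - 1 - p)):
            error = abs(offset - n)
            if best is None or error < best[0]:
                best = (error, number, sign_bit, direction_bit)
    if best is None or best[0] > max_error:
        return None
    return best[1:]
```

Ties are resolved in favor of the lowest band number and the forward direction, which is an arbitrary choice of this sketch.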
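[0178]
In this case, in the decoding device 400, decoding the first encoded signal according to the conventional procedure yields 1024 samples of low-frequency spectrum data. The second inverse quantization unit 435 copies part or all of the spectrum data corresponding to the scale factor band number output from the second decoding unit 425 as the high-frequency spectrum. The second inverse quantization unit 435 also adjusts the amplitude of the copied spectrum data as necessary. The amplitude can be adjusted by multiplying each spectrum data item by a predetermined coefficient, for example "0.5".
[0179]
This coefficient may be a fixed value, may be changed for each band or each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 430.
[0180]
Although a predetermined coefficient is used for the amplitude adjustment in this embodiment, the value of the coefficient may be added to the second encoded information as auxiliary information. A scale factor value may be added to the second encoded information as the coefficient, or a quantized value may be added to the second encoded information as the coefficient. The amplitude adjustment method is not limited to these, and other methods may be used.
[0181]
Also, in addition to the scale factor band number N, the sign information and the search direction information are extracted as the auxiliary information (copy information) for the high-frequency part, but the sign information and the search direction information may be omitted depending on the amount of information that can be transmitted for the high-frequency part. The sign information is "1" if the sign of the low-frequency peak is "+" and "0" if it is "−", and the search direction information is "1" if the search proceeded from the peak toward lower frequencies and "0" if it proceeded toward higher frequencies. However, the representation of the sign of the low-frequency peak in the sign information and of the search direction in the search direction information is not limited to these, and other values may be used.
[0182]
Also, the head of the scale factor band whose distance from each peak identified in the low-frequency part is closest to n is searched for here, but the search is not limited to this example; the peak whose distance from the head of each low-frequency scale factor band is closest to n may be searched for instead.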
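[0183]
FIG. 22 is a spectrum waveform diagram showing a second example of a method of creating other auxiliary information (copy information) generated by the second quantization unit 345 shown in FIG. 1. FIG. 23 is a flowchart showing the operation of the second quantization unit 345 shown in FIG. 1 in the second process of calculating this other auxiliary information (copy information).
[0184]
For every scale factor band in the high-frequency part, which extends from above the reproduction band of 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantization unit 345 identifies, according to the following procedure, the number N of the low-frequency scale factor band for which the spectral difference (energy difference) from the whole spectrum in that high-frequency scale factor band is smallest (S61). The number of spectrum data items in the low-frequency part used for the difference is equal to the number of spectrum data items in the high-frequency scale factor band, and the identified number N is the number of the scale factor band at the head of that span.
[0185]
For every scale factor band in the low-frequency part (S62), the second quantization unit 345 obtains the difference between the high-frequency spectrum and the low-frequency spectrum over a frequency width that starts at the head of that scale factor band and contains the same number of spectrum data items as the high-frequency scale factor band (S63). For example, in the waveform diagram of FIG. 22, if the first scale factor band of the high-frequency part contains 48 spectrum data items, the second quantization unit 345 sequentially obtains the differences between the high-frequency and low-frequency spectra for the 48 spectrum data items starting at the head of the low-frequency scale factor band with number N = 1.
[0186]
When the differences between the high-frequency and low-frequency spectra have been obtained for the same number of spectrum data items as in the high-frequency scale factor band (S65), the second quantization unit 345 holds the value and obtains the difference between the high-frequency spectrum and the low-frequency spectrum over the same number of spectrum data items starting at the head of the next low-frequency scale factor band (S64). For example, when the spectral difference has been obtained over the 48 spectrum data items starting at the head of the low-frequency scale factor band with number N = 1, the obtained difference value is held, and the spectral difference is obtained over the 48 spectrum data items starting at the head of the low-frequency scale factor band with number N = 2. In the same way, for the scale factor band with number N = 3, the scale factor band with number N = 4, and so on up to the scale factor band with number N = 28 (the last in the low-frequency part), that is, for all scale factor bands in the low-frequency part, the spectral difference is obtained in turn by summing the differences between the 48 pairs of high-frequency and low-frequency spectrum data items.
[0187]
When the difference between the high-frequency spectrum and the low-frequency spectrum has been obtained for all scale factor bands in the low-frequency part, over the same number of spectrum data items as in the high-frequency scale factor band starting at the head of each band (S64), the second quantization unit 345 identifies the number N of the scale factor band for which the obtained difference is smallest (S65). For example, suppose that the low-frequency scale factor band with number N = 8 is identified in the spectrum waveform diagram of FIG. 22. This means that the hatched portion of the low-frequency spectrum has the smallest difference from the hatched portion of the high-frequency spectrum, that is, the smallest energy difference between the two spectra. In other words, when the 48 spectrum data items starting at the head of the scale factor band with number N = 8 are copied into the first scale factor band of the high-frequency part, which starts at 11.025 kHz, they form the waveform shown by the dash-dot line in the high-frequency part of FIG. 22 and can approximately represent the energy in that high-frequency scale factor band relative to the original spectrum.
[0188]
When the second quantization unit 345 has identified, for a high-frequency scale factor band, the number N of the low-frequency scale factor band for which the spectral difference is smallest, it holds the identified number N and, in the same way, identifies the corresponding scale factor band number N for the next high-frequency scale factor band (S66). It repeats this process in turn for each scale factor band in the high-frequency part. When it has identified, for all high-frequency scale factor bands, the number N of the low-frequency scale factor band for which the spectral difference is smallest, it outputs the held low-frequency scale factor band numbers N to the second encoding unit 355 as the auxiliary information (copy information) for the high-frequency part, and ends the process.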
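The minimum-difference search of S61 to S66 can be sketched for one high-frequency band as follows. The text speaks only of summing differences; the sum of squared differences used here is an assumed metric, and the function name and the list-of-sizes band layout are illustrative assumptions. Spans that would run past the end of the low-frequency data are skipped.

```python
def copy_info_by_difference(high_band, low, low_band_sizes):
    """Return the 1-based number N of the low band whose leading span best matches high_band."""
    width = len(high_band)
    best_number, best_diff = None, None
    start = 0
    for number, size in enumerate(low_band_sizes, start=1):  # S62: every low band
        segment = low[start:start + width]                   # span starting at the band head
        start += size
        if len(segment) < width:
            continue                                         # span runs past the low part
        diff = sum((h - l) ** 2 for h, l in zip(high_band, segment))  # S63 (assumed metric)
        if best_diff is None or diff < best_diff:
            best_number, best_diff = number, diff            # S65: keep the minimum
    return best_number
```

Calling the function once per high-frequency band gives the list of numbers N that forms the copy information in this variant.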
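[0189]
In this case, the method of copying the low-frequency spectrum and the method of adjusting the amplitude in the decoding device 400 are the same as for the auxiliary information (copy information) described with reference to FIGS. 20 and 21.
[0190]
In the flowchart of FIG. 23, the energy difference between the high-frequency part and the low-frequency part is calculated with the same sign and in the same direction on the frequency axis. However, the encoding device of the present invention is not limited to this, and, as described with reference to FIGS. 20 and 21, any of the following three methods may be used to calculate the energy difference between the high-frequency part and the low-frequency part.
① The values of the spectrum data are taken with the same sign. For the high-frequency spectrum data selected in turn from the low-frequency side toward the high-frequency side, the same number of spectrum data items counted from the head of the low-frequency scale factor band are selected in turn from the high-frequency side toward the low-frequency side (that is, in the reverse direction on the frequency axis), and the difference is calculated.
② The sign of the low-frequency spectrum is inverted (multiplied by −1), and the calculation is performed in the same direction on the frequency axis.
③ The sign of the low-frequency spectrum is inverted (multiplied by −1), and the calculation is performed in the reverse direction on the frequency axis.
Alternatively, the calculation may be performed with all four of these methods, and the number N of the scale factor band of the low-frequency spectrum that gives the smallest energy difference among them may be used as the auxiliary information. In this case, so that the low-frequency spectrum with the smallest energy difference is copied correctly into the high-frequency part, information indicating the sign relationship between the low-frequency spectrum and the high-frequency spectrum and information indicating the direction on the frequency axis in which the low-frequency spectrum is copied into the high-frequency part are included in the auxiliary information for each scale factor band. The information indicating the sign relationship is expressed in 1 bit, for example "1" when the difference was taken with the same sign and "0" when it was taken with the opposite sign. The information indicating the direction on the frequency axis is also expressed in 1 bit, for example "1" when copying in the forward direction, that is, when the spectrum data in the high-frequency part and the low-frequency part was selected in the same direction, and "0" when copying in the reverse direction, that is, when the spectrum data was selected in opposite directions.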
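Trying all four combinations of sign and direction and keeping the best one can be sketched as follows. As before, the squared-difference metric, the function names and the band layout are assumptions; the two returned bits follow the 1-bit conventions given in the text ("1" for same sign, "1" for forward direction). `apply_copy` shows how a decoder could rebuild the band from the three values.

```python
def best_copy_variant(high_band, low, low_band_sizes):
    """Return (N, same_sign_bit, forward_bit) minimizing the squared difference, or None."""
    width = len(high_band)
    best = None
    start = 0
    for number, size in enumerate(low_band_sizes, start=1):
        segment = low[start:start + width]
        start += size
        if len(segment) < width:
            continue  # span runs past the low-frequency data
        for same_sign_bit in (1, 0):
            for forward_bit in (1, 0):
                candidate = segment if forward_bit else segment[::-1]
                if not same_sign_bit:
                    candidate = [-x for x in candidate]
                diff = sum((h - c) ** 2 for h, c in zip(high_band, candidate))
                if best is None or diff < best[0]:
                    best = (diff, number, same_sign_bit, forward_bit)
    return None if best is None else best[1:]


def apply_copy(low, low_band_sizes, number, same_sign_bit, forward_bit, width):
    """Decoder side: rebuild one high band from the copy information."""
    start = sum(low_band_sizes[:number - 1])
    segment = low[start:start + width]
    if not forward_bit:
        segment = segment[::-1]
    return segment if same_sign_bit else [-x for x in segment]
```

The extra two bits per band are what allow the decoder to undo the mirroring and the sign inversion chosen by the encoder.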
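[0191]
Although the acoustic data distribution system according to the above embodiment has been described as applied to a broadcasting system, it may also be applied to an acoustic data distribution system in which acoustic data is distributed as a bitstream from a server to terminals via a transmission medium such as the Internet, or to an acoustic data distribution system in which the bitstream output from the encoding device 300 is first recorded on a recording medium such as an optical disc (a CD, DVD or the like), a semiconductor memory or a hard disk, and is then reproduced by the decoding device 400 via that recording medium.
[0192]
Also, although LONG blocks are used in the above embodiment, SHORT blocks may be used instead, in which case the same processing as for LONG blocks is performed on the SHORT blocks.
[0193]
In the encoding process, tools such as Gain Control, TNS (Temporal Noise Shaping), a psychoacoustic model, M/S Stereo, Intensity Stereo and Prediction may be used, as may block size switching, a bit reservoir and the like.
[0194]
Also, in the above embodiment the auxiliary information is generated from the high-frequency spectrum data separated by the data separation unit 330. Alternatively, values obtained by inversely quantizing the output of the first quantization unit 340 may be used as the high-frequency spectrum data, and the auxiliary information may be generated from that high-frequency spectrum data.
[0195]
Also, the embodiment uses as the auxiliary information scale factors for which the quantized value of the spectrum data in each high-frequency scale factor band becomes "1", quantized values, position information of characteristic spectra, sign information representing the positive or negative sign of the spectrum, and so on, but a combination of two or more of these may be used as the auxiliary information. In that case, it is particularly effective to encode, within the auxiliary information, a coefficient representing an amplitude ratio, the position of the spectrum data with the maximum absolute value, or the like in combination with the scale factor. Also, in this embodiment one item of auxiliary information is encoded for each scale factor band as the second encoded signal, but one item of auxiliary information may be encoded for every two or more scale factor bands, or two or more items of auxiliary information may be encoded for one scale factor band. The auxiliary information in this embodiment may be encoded for each channel, or one item of auxiliary information may be encoded for two or more channels.
[0196]
Also, in this embodiment the encoding device 300 has two quantization units and two encoding units, but it is not limited to this and may have three or more quantization units and encoding units.
[0197]
Also, in this embodiment the decoding device 400 has two decoding units and two inverse quantization units, but it is not limited to this and may have three or more decoding units and inverse quantization units.
[0198]
The above processing can be implemented in software as well as in hardware, and a configuration in which part is implemented in hardware and the rest in software is also possible.
Also, although the above embodiment uses a sampling frequency of 44.1 kHz, 32 kHz, 48 kHz or the like may be used, and the boundary frequency at which the data separation unit 330 separates the data may be changed to a frequency other than 11.025 kHz.
[0199]
Furthermore, although the above embodiment is implemented for AAC, it may be implemented in the same way in encoding devices and decoding devices of other methods (for example, MP3 or AC3).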
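[0200]
The encoding device may also be configured as follows.
That is, the encoding device is an encoding device that encodes acoustic data, and may be characterized by comprising: cutting-out means for continuously cutting out, from an acoustic data sequence, m2 acoustic data items, a number exceeding the required number m1; converting means for converting the acoustic data cut out by the cutting-out means into spectrum data on the frequency axis; separating means for separating the m2 spectrum data items obtained by the conversion into m1 low-frequency spectrum data items and (m2 − m1) high-frequency spectrum data items; low-frequency encoding means for quantizing and encoding the separated low-frequency spectrum data; auxiliary information generating means for generating, from the separated high-frequency spectrum data, auxiliary information indicating the characteristics of the high-frequency spectrum; high-frequency encoding means for encoding the generated auxiliary information; and output means for combining and outputting the code obtained by the low-frequency encoding means and the code obtained by the high-frequency encoding means.
[0201]
In this case, the auxiliary information generating means may be configured to calculate, for the spectrum data divided into a plurality of groups, a normalization coefficient for each group in the high-frequency part such that the value obtained by quantizing the peak spectrum data becomes a constant value, and to generate the calculated normalization coefficient as the auxiliary information.
[0202]
The auxiliary information generating means may also be configured to quantize, for the spectrum data divided into a plurality of groups, the peak spectrum data in each group in the high-frequency part using a normalization coefficient common to the groups, and to generate the quantized value as the auxiliary information.
[0203]
The auxiliary information generating means may also be configured to generate as the auxiliary information, for the spectrum data divided into a plurality of groups, the frequency position of the peak spectrum data in each group in the high-frequency part.
[0204]
The spectrum data may be MDCT coefficients, and the auxiliary information generating means may be configured to generate as the auxiliary information, for the spectrum data divided into a plurality of groups, a sign indicating whether the spectrum data at a predetermined frequency position in the high-frequency part is positive or negative.
[0205]
The auxiliary information generating means may also be configured to generate as the auxiliary information, for the spectrum data divided into a plurality of groups, information specifying, for each group in the high-frequency part, the low-frequency spectrum that most closely approximates the spectrum in that group. In this case, the auxiliary information generating means may be configured to specify the low-frequency spectrum for which the difference between the distance on the frequency axis from the group boundary to the peak of the high-frequency spectrum in that group and the distance on the frequency axis from a low-frequency group boundary to the peak of the low-frequency spectrum is smallest. The auxiliary information generating means may also be configured to specify the low-frequency spectrum for which the difference value is smallest when the energy difference is taken over the same frequency width as the spectrum in the high-frequency group. The information specifying the low-frequency spectrum may be a number identifying the group of the specified low-frequency spectrum.
[0206]
The auxiliary information generating means may also be configured to generate, as the auxiliary information, a predetermined coefficient representing the amplification ratio of the amplitude of the high-frequency spectrum.
[0207]
The output means may further comprise a stream output unit that converts the data encoded by the low-frequency encoding means into an encoded acoustic stream defined in a predetermined format, and that stores the data encoded by the high-frequency encoding means in an area of the encoded acoustic stream whose use is not restricted by the encoding rules, and outputs the result. In this case, the stream output unit may be configured to write information representing f1 Hz as the sampling frequency.
[0208]
Furthermore, the output means may further comprise a second stream output unit that converts the data encoded by the low-frequency encoding means into an encoded acoustic stream defined in a predetermined format, and that stores the data encoded by the high-frequency encoding means in a stream different from the encoded acoustic stream and outputs it.
[0209]
Needless to say, this modification can be realized as a communication system consisting of the encoding device and a decoding device, as an encoding method or a communication method whose steps are the characteristic means constituting the encoding device and the communication system, as an encoding program that causes a CPU to execute the characteristic means and steps constituting the encoding device, or as a computer-readable recording medium on which such programs are recorded.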
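[0210]
[Effect of the invention]
As is clear from the above description, the encoding device according to the present invention is an encoding device that encodes acoustic data, comprising: cutting-out means for cutting out a fixed number of consecutive acoustic data items from an acoustic data sequence; converting means for converting the cut-out acoustic data into spectrum data on the frequency axis; separating means for separating the spectrum data obtained by the conversion into low-frequency spectrum data up to f1 Hz and high-frequency spectrum data above f1 Hz; low-frequency encoding means for quantizing and encoding the separated low-frequency spectrum data; auxiliary information generating means for generating, from the separated high-frequency spectrum data, auxiliary information indicating the characteristics of the high-frequency spectrum; high-frequency encoding means for encoding the generated auxiliary information; and output means for combining and outputting the code obtained by the low-frequency encoding means and the code obtained by the high-frequency encoding means, wherein f1 is half or less of the sampling frequency f2 at which the acoustic data sequence was created.
[0211]
In the encoding device of the present invention, the converting means outputs, from the acoustic data sequence cut out by the cutting-out means, a large number of low-frequency spectrum data items at or below f1 and, at the same time, high-frequency spectrum data above f1. The low-frequency spectrum data separated by the separating means is quantized and encoded. For the high-frequency spectrum data, auxiliary information representing the characteristics of the high-frequency part is generated, and the high-frequency encoding means encodes the generated auxiliary information.
[0212]
Consequently, within a range in which the total amount of information does not increase significantly, the acoustic signal can be encoded with high quality: the low-frequency part is equivalent to that obtained by downsampling, and yet the high-frequency part can also be reproduced.
[0213]
Here, f1 may be f2/4, the converting means may convert the acoustic data into spectrum data of 0 to 2 × f1 Hz, and the separating means may separate it into low-frequency spectrum data of 0 to f1 Hz and high-frequency spectrum data of f1 to 2 × f1 Hz. Alternatively, the low-frequency spectrum data up to f1 Hz may consist of n spectrum data items, the cutting-out means may cut out the number of acoustic data items needed to generate 2 × n spectrum data items, the converting means may convert the cut-out acoustic data into 2 × n spectrum data items, and the separating means may separate them into n low-frequency spectrum data items and n high-frequency spectrum data items. Alternatively, the cutting-out means may cut out 2 × n acoustic data items, consisting of the n acoustic data items corresponding to one frame, which is the unit of encoding, together with n/2 acoustic data items belonging to each of the two frames adjacent to that frame, and the converting means may perform the conversion on the 2 × n cut-out acoustic data items by MDCT, converting them into a spectrum of 0 to 2 × f1 Hz consisting of 2 × n spectrum data items.
[0214]
Furthermore, the decoding device according to the present invention is a decoding device that decodes encoded data input via a recording medium or a transmission medium, comprising: extracting means for extracting the low-frequency encoded data and the high-frequency encoded data contained in the encoded data; low-frequency inverse quantization means for decoding and inversely quantizing the low-frequency encoded data extracted by the extracting means, thereby outputting low-frequency spectrum data at or below the frequency f1; auxiliary information decoding means for decoding the high-frequency data extracted by the extracting means, thereby generating auxiliary information representing the characteristics of the high-frequency spectrum data; high-frequency inverse quantization means for outputting high-frequency spectrum data on the basis of the auxiliary information generated by the auxiliary information decoding means; combining means for combining the low-frequency spectrum data output by the low-frequency inverse quantization means and the high-frequency spectrum data output by the high-frequency inverse quantization means; inverse transform means for inversely transforming the spectrum data combined by the combining means into acoustic data on the time axis; and acoustic data output means for outputting the acoustic data inversely transformed by the inverse transform means in time order.
[0215]
In the decoding device of the present invention, the extracting means extracts the low-frequency encoded data and the high-frequency encoded data from the input encoded data, and the low-frequency inverse quantization means outputs the low-frequency spectrum data at or below the frequency f1. The auxiliary information decoding means decodes the auxiliary information, and the high-frequency inverse quantization means outputs the high-frequency spectrum data on the basis of the auxiliary information. Therefore, from a small amount of information that is almost the same as in the conventional case, a much larger amount of information than before can be decoded, and a high-quality acoustic signal can be decoded.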
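[Brief description of the drawings]
FIG. 1 is a block diagram showing the functional configuration of a broadcasting system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the state changes of an acoustic signal processed in the encoding device shown in FIG. 1.
FIG. 3 is a flowchart showing the operation of the first quantization unit shown in FIG. 1 in the scale factor determination process.
FIG. 4 is a flowchart showing another operation of the first quantization unit shown in FIG. 1 in the scale factor determination process.
FIG. 5 is a spectrum waveform diagram showing a specific example of auxiliary information (scale factors) generated by the second quantization unit shown in FIG. 1.
FIG. 6 is a flowchart showing the operation of the second quantization unit shown in FIG. 1 in the auxiliary information (scale factor) calculation process.
FIG. 7 is a diagram showing the position in the bitstream at which auxiliary information is stored by the stream output unit shown in FIG. 1.
FIG. 8 is a diagram showing another example of how the stream output unit shown in FIG. 1 stores auxiliary information.
FIG. 9 is a diagram comparing the processing of the encoding device shown in FIG. 1 with that of conventional example 1.
FIG. 10 is a diagram comparing the processing of the encoding device shown in FIG. 1 with that of conventional example 2.
FIG. 11 is a diagram comparing the spectrum data and features of the encoding device shown in FIG. 1 with those of conventional examples 1 and 2.
FIG. 12 is a flowchart showing the procedure by which the second inverse quantization unit shown in FIG. 1 copies the 1024 low-frequency spectrum data items to the high-frequency part in the forward direction.
FIG. 13 is a flowchart showing the procedure by which the second inverse quantization unit shown in FIG. 1 copies the 1024 low-frequency spectrum data items to the high-frequency part in the reverse direction on the frequency axis.
FIG. 14 is a spectrum waveform diagram showing a specific example of other auxiliary information (quantized values) generated by the second quantization unit shown in FIG. 1.
FIG. 15 is a flowchart showing the operation of the second quantization unit shown in FIG. 1 in the process of calculating other auxiliary information (quantized values).
FIG. 16 is a spectrum waveform diagram showing a specific example of other auxiliary information (position information) generated by the second quantization unit shown in FIG. 1.
FIG. 17 is a flowchart showing the operation of the second quantization unit shown in FIG. 1 in the process of calculating other auxiliary information (position information).
FIG. 18 is a spectrum waveform diagram showing a specific example of other auxiliary information (sign information) generated by the second quantization unit shown in FIG. 1.
FIG. 19 is a flowchart showing the operation of the second quantization unit shown in FIG. 1 in the process of calculating other auxiliary information (sign information).
FIG. 20 is a spectrum waveform diagram showing an example of a method of creating other auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1.
FIG. 21 is a flowchart showing the operation of the second quantization unit shown in FIG. 1 in the process of calculating other auxiliary information (copy information).
FIG. 22 is a spectrum waveform diagram showing a second example of a method of creating other auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1.
FIG. 23 is a flowchart showing the operation of the second quantization unit shown in FIG. 1 in the second process of calculating other auxiliary information (copy information).
FIG. 24 is a block diagram showing the configuration of a conventional AAC encoding device and decoding device.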
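[Explanation of symbols]
1 Broadcasting system
300 Encoding device
305 A/D converter
310 Acoustic data input unit
320 Conversion unit
330 Data separation unit
340 First quantization unit
345 Second quantization unit
350 First encoding unit
355 Second encoding unit
390 Stream output unit
400 Decoding device
410 Stream input unit
420 First decoding unit
425 Second decoding unit
430 First inverse quantization unit
435 Second inverse quantization unit
440 Inverse quantized data synthesis unit
480 Inverse transform unit
490 Acoustic data output unit
495 D/A converter
[0001]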
BACKGROUND OF THE INVENTION
The present invention relates to a high-quality compression encoding and decompression decoding technique for acoustic signals.
[0002]
[Prior art]
In recent years, various techniques for high-quality compression encoding and decompression decoding of acoustic signals such as speech and music have been developed. “MPEG-2 Advanced Audio Coding” (hereinafter abbreviated as “MPEG-2 AAC” or “AAC”) is one of these methods (see Non-Patent Document 1).
[0003]
[Non-Patent Document 1]
M. Bosi et al., “IS 13818-7 (MPEG-2 Advanced Audio Coding, AAC)”, April 1997.
FIG. 24 is a block diagram illustrating a functional configuration of a conventional AAC encoding apparatus and decoding apparatus.
[0004]
The encoding apparatus 1000 is an apparatus that compresses and encodes an input acoustic signal based on an AAC encoding scheme, and includes an A / D converter 1050, an acoustic data input unit 1100, a conversion unit 1200, a quantization unit 1400, It comprises an encoding unit 1500 and a stream output unit 1900.
[0005]
The A/D converter 1050 samples the input acoustic signal at a sampling frequency of, for example, 22.05 kHz, and converts the analog acoustic signal into a digital acoustic data sequence. Each time the acoustic data input unit 1100 reads 1024 samples of the acoustic data sequence as the input signal (these 1024 samples are hereinafter referred to as a “frame”), it cuts out a total of 2048 samples consisting of that frame and 512 samples (50%) from each of the adjacent preceding and following frames. The conversion unit 1200 converts the 2048 samples of time-axis data cut out by the acoustic data input unit 1100 into spectrum data on the frequency axis by MDCT (Modified Discrete Cosine Transform). The 1024 points of spectrum data that make up half of the data obtained by the conversion represent the reproduction band of 11.025 kHz or lower, which is half the sampling frequency, and are classified into a plurality of groups. Each group is set to contain one or more points of spectrum data and is modeled on a critical band of human hearing. Each group is called a “scale factor band”. The quantization unit 1400 quantizes the spectrum data in each scale factor band obtained from the conversion unit 1200 into a predetermined number of bits, using one normalization coefficient per scale factor band. This normalization coefficient is called a “scale factor”, and the result of quantizing each spectrum data item with its scale factor is called a “quantized value”. The encoding unit 1500 applies Huffman encoding to the data quantized by the quantization unit 1400, that is, to each scale factor and to the spectrum data quantized with that scale factor. The stream output unit 1900 converts the encoded signal obtained from the encoding unit 1500 into the stream format of an AAC bitstream and outputs it.
The bit stream output from the encoding apparatus 1000 is transmitted to the decoding apparatus 2000 via a transmission medium or a recording medium.
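The overlapped cut-out performed by the acoustic data input unit 1100 (one 1024-sample frame plus 512 samples from each neighboring frame) can be sketched as follows. The function name is hypothetical, and padding the first and last windows with zeros is an assumption, since the text does not say how the edges of the signal are handled.

```python
def cut_frames(samples, frame=1024):
    """Yield windows of 2*frame samples: one frame plus half a frame on each side."""
    half = frame // 2
    # Zero padding at both ends so the first and last frames also get full windows (assumption).
    padded = [0] * half + list(samples) + [0] * half
    for start in range(0, len(samples) - frame + 1, frame):
        yield padded[start:start + 2 * frame]
```

With the default `frame=1024` each window holds 2048 samples, as in conventional example 1; passing a different frame length changes the window length in proportion.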
[0006]
The decoding device 2000 is a device that decodes the bitstream encoded by the encoding device 1000, and includes a stream input unit 2100, a decoding unit 2200, an inverse quantization unit 2300, an inverse transform unit 2800, an acoustic data output unit 2900, and a D/A converter 2950.
[0007]
The stream input unit 2100 receives the bitstream encoded by the encoding device 1000 via a transmission medium or a recording medium, and extracts the encoded signal from the received bitstream. The decoding unit 2200 decodes the Huffman-encoded signal into quantized data. The inverse quantization unit 2300 inversely quantizes the quantized data decoded by the decoding unit 2200 using the scale factors. The inverse transform unit 2800 transforms the 1024 points of spectrum data on the frequency axis obtained by the inverse quantization unit 2300 into 1024 samples of acoustic data on the time axis using IMDCT (Inverse Modified Discrete Cosine Transform). The acoustic data output unit 2900 sequentially combines the 1024-sample blocks of time-axis acoustic data obtained by the inverse transform unit 2800 and outputs the acoustic data one sample at a time in time order. The D/A converter 2950 converts the digital acoustic data into an analog acoustic signal at a sampling frequency of 22.05 kHz.
[0008]
With the encoding device 1000 and the decoding device 2000 conforming to the conventional AAC standard, the compression rate can be raised to 1 bit or less per point. Moreover, because the 1024 points of low-frequency spectrum data, which represent the reproduction band of 11.025 kHz or lower (half the sampling frequency) and have high auditory priority, are encoded, the acoustic signal can be reproduced with relatively high sound quality.
[0009]
[Problems to be solved by the invention]
However, with the conventional AAC encoding device 1000 and decoding device 2000 (conventional example 1), the sampling frequency is 22.05 kHz, so the encoded spectrum data contains nothing at all from the band above 11.025 kHz. They therefore cannot meet the demand for still higher quality from listeners who want to hear the band above 11.025 kHz.
[0010]
To solve this problem, it is conceivable to raise the sampling frequency applied to the A/D converter 1050 and the D/A converter 2950 of the encoding device 1000 and the decoding device 2000 in FIG. 24 to 44.1 kHz (this approach is hereinafter also referred to as “conventional example 2”).
[0011]
However, if the sampling frequency is raised to 44.1 kHz while the compression rate is maintained, 512 points of spectrum data can be encoded for the high-frequency part above 11.025 kHz, but the spectrum data of the low-frequency part, which has high auditory priority, is halved to 512 points. In other words, there is a trade-off between the sampling frequency and the number of spectrum data items in the low-frequency part, and conventional AAC cannot increase both at the same time. This causes a different problem: the sound quality as a whole deteriorates.
[0012]
Such a situation is the same in encoding apparatuses and decoding apparatuses of other systems (for example, MP3, AC3, etc.).
The present invention has been made to solve the above technical problem, and its object is to provide an encoding device, a decoding device and the like that can realize high-quality reproduction of an acoustic signal without significantly increasing the amount of information after encoding.
[0013]
[Means for Solving the Problems]
In order to solve the above problems, an encoding device according to the present invention is an encoding device that encodes acoustic data, comprising: cutting-out means for cutting out a fixed number of consecutive acoustic data items from an acoustic data sequence; converting means for converting the cut-out acoustic data into spectrum data on the frequency axis; separating means for separating the spectrum data obtained by the conversion into low-frequency spectrum data up to f1 Hz and high-frequency spectrum data above f1 Hz; low-frequency encoding means for quantizing and encoding the separated low-frequency spectrum data; auxiliary information generating means for generating, from the separated high-frequency spectrum data, auxiliary information indicating the characteristics of the high-frequency spectrum; high-frequency encoding means for encoding the generated auxiliary information; and output means for combining and outputting the code obtained by the low-frequency encoding means and the code obtained by the high-frequency encoding means, wherein f1 is half or less of the sampling frequency f2 at which the acoustic data sequence was created, and wherein, for the high-frequency spectrum data and the low-frequency spectrum data each divided into a plurality of groups, the auxiliary information generating means generates as the auxiliary information, for each group in the high-frequency part, information specifying the spectrum data of the low-frequency group whose spectrum is closest to the spectrum in that group.
[0014]
Here, f1 may be f2/4, the converting means may convert the acoustic data into spectrum data of 0 to 2 × f1 Hz, and the separating means may separate it into low-frequency spectrum data of 0 to f1 Hz and high-frequency spectrum data of f1 to 2 × f1 Hz. Alternatively, the low-frequency spectrum data up to f1 Hz may consist of n spectrum data items, the cutting-out means may cut out the number of acoustic data items needed to generate 2 × n spectrum data items, the converting means may convert the cut-out acoustic data into 2 × n spectrum data items, and the separating means may separate them into n low-frequency spectrum data items and n high-frequency spectrum data items. Alternatively, the cutting-out means may cut out 2 × n acoustic data items, consisting of the n acoustic data items corresponding to one frame, which is the unit of encoding, together with n/2 acoustic data items belonging to each of the two frames adjacent to that frame, and the converting means may perform the conversion on the 2 × n cut-out acoustic data items by MDCT, converting them into a spectrum of 0 to 2 × f1 Hz consisting of 2 × n spectrum data items.
[0015]
Furthermore, a decoding device according to the present invention is a decoding device that decodes encoded data input via a recording medium or a transmission medium, comprising: extracting means for extracting the low-frequency encoded data and the high-frequency encoded data contained in the encoded data; low-frequency inverse quantization means for decoding and inversely quantizing the low-frequency encoded data extracted by the extracting means, thereby outputting low-frequency spectrum data at or below the frequency f1; auxiliary information decoding means for decoding the high-frequency data extracted by the extracting means, thereby generating auxiliary information representing the characteristics of the high-frequency spectrum data; high-frequency inverse quantization means for outputting high-frequency spectrum data on the basis of the auxiliary information generated by the auxiliary information decoding means; combining means for combining the low-frequency spectrum data output by the low-frequency inverse quantization means and the high-frequency spectrum data output by the high-frequency inverse quantization means; inverse transform means for inversely transforming the spectrum data combined by the combining means into acoustic data on the time axis; and acoustic data output means for outputting the acoustic data inversely transformed by the inverse transform means in time order, wherein, for the high-frequency spectrum data and the low-frequency spectrum data each divided into a plurality of groups, the auxiliary information is information specifying, for each group in the high-frequency part, the spectrum data of the low-frequency group whose spectrum is closest to the spectrum in that group, and wherein the high-frequency inverse quantization means generates predetermined noise in each group of the high-frequency part on the basis of the auxiliary information and adds it to the spectrum data, thereby generating the high-frequency spectrum data.
[0016]
Needless to say, the present invention can be realized as a communication system consisting of the encoding device and the decoding device; as an encoding method, a decoding method or a communication method whose steps are the characteristic means constituting the encoding device, the decoding device and the communication system; as an encoding program or a decoding program that causes a CPU to execute the characteristic means and steps constituting the encoding device and the decoding device; or as a computer-readable recording medium on which these programs are recorded.
[0017]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, a case where an embodiment of the present invention is applied to a broadcasting system as an acoustic data distribution system will be described in detail with reference to the drawings.
[0018]
FIG. 1 is a block diagram showing a functional configuration of the broadcasting system according to the present embodiment.
The broadcasting system 1 according to the present embodiment shown in the figure includes an encoding device 300 that is provided in a broadcasting station and encodes an input acoustic signal, and a decoding device 400 that is provided in a user terminal and decodes the acoustic signal from the bitstream encoded by the encoding device 300.
[0019]
(Encoding device 300)
The encoding device 300 is a device that encodes an input acoustic signal, and includes an A/D converter 305, an acoustic data input unit 310, a conversion unit 320, a data separation unit 330, first and second quantization units 340 and 345, first and second encoding units 350 and 355, and a stream output unit 390.
[0020]
The A/D converter 305 samples the input acoustic signal at a sampling frequency of, for example, 44.1 kHz, twice that of conventional example 1, converts the analog acoustic signal into digital acoustic data (for example, 16 bits), and generates an acoustic data sequence on the time axis.
[0021]
The acoustic data input unit 310 operates each time it receives 2048 samples (two frames) of the acoustic data sequence generated by the A/D converter 305, that is, at a period (about 46.4 msec) that is twice as long as that of conventional example 2. At each such period it cuts out these two frames of 2048 samples together with 1024 samples (50%) from each of the adjacent preceding and following frames, giving an acoustic data sequence of 4096 samples, twice the conventional number. The unit includes a counter 311 that detects the cut-out timing each time 2048 samples are received, and a FIFO buffer 312 that temporarily stores the 4096-sample acoustic data sequence.
[0022]
The conversion unit 320 converts the 4096 samples of acoustic data for two frames cut out by the acoustic data input unit 310 into spectrum data on the frequency axis. It includes an MDCT 321 that converts the 4096 samples of acoustic data on the time axis into 4096 points of spectrum data on the frequency axis, and a grouping unit 322 that groups the spectrum data into scale factor bands.
[0023]
Specifically, the MDCT 321 converts the 4096 samples of acoustic data on the time axis into 4096 points of spectrum data (16 bits). Because the result consists of two symmetrical groups of spectrum data, only one group of 2048 points is encoded and the other is discarded.
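As an illustration of the sample counts involved, the following sketch evaluates the standard MDCT definition directly. Under that definition a block of 2N time samples yields N coefficients, which corresponds to the retained half described above (4096 samples in, 2048 points out). The function name `mdct` is an assumption, no window function is applied, and the direct O(N²) sum is for clarity only; a practical encoder would use a windowed fast algorithm.

```python
import math


def mdct(block):
    """Direct MDCT: 2N time samples in, N spectral coefficients out."""
    two_n = len(block)
    n = two_n // 2
    shift = 0.5 + n / 2.0  # phase offset of the standard MDCT definition
    return [
        sum(block[t] * math.cos(math.pi / n * (t + shift) * (k + 0.5))
            for t in range(two_n))
        for k in range(n)
    ]
```

Calling `mdct` on a 16-sample block returns 8 coefficients; the same halving applies to the 4096-sample blocks of the embodiment, although the direct sum is slow at that size.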
[0024]
As described above, the A/D converter 305, the acoustic data input unit 310, and the conversion unit 320 of the encoding device 300 differ greatly from those of the encoding device 1000 of conventional example 1 in that the sampling frequency of the A/D converter 305 is doubled (to 44.1 kHz), the cut-out length in the acoustic data input unit 310 is doubled (to 4096 samples), and the encoding unit in the MDCT 321 of the conversion unit 320 is doubled (to 4096 samples).
[0025]
Compared with the configuration of the encoding device 1000 of conventional example 2, the sampling frequency of the A/D converter 305 is the same, but the configuration differs greatly in that the cut-out length in the acoustic data input unit 310 is doubled (to 4096 samples) and the encoding unit in the MDCT 321 of the conversion unit 320 is doubled (to 4096 samples).
[0026]
As a result, the conversion unit 320 outputs 1024 pieces of spectrum data belonging to the low frequency band of 11.025 kHz or lower (hereinafter referred to as “low-frequency spectrum data”) and 1024 pieces of spectrum data belonging to the high frequency band exceeding 11.025 kHz (hereinafter referred to as “high-frequency spectrum data”), for a total of 2048 pieces of spectrum data.
[0027]
The grouping unit 322 of the conversion unit 320 classifies the 2048 points of spectrum data to be encoded into a plurality of scale factor bands, each containing one or more samples of spectrum data (in practice, a multiple of 4).
[0028]
In the AAC standard, the number of samples (spectrum data) in each scale factor band is determined according to frequency: scale factor bands are divided finely, with few samples each, in the low frequency range, and coarsely, with many samples each, in the high frequency range. In AAC, the number of scale factor bands corresponding to one frame of spectrum data is also determined according to the sampling frequency. For example, when the sampling frequency is 44.1 kHz, one frame contains 49 scale factor bands, which together hold 1024 samples of spectrum data. On the other hand, AAC does not specify which of these scale factor bands are to be transmitted; the most suitable scale factor bands may be selected and transmitted according to the transfer rate of the transmission path. For example, when the transfer rate of the transmission path is 96 kbps, only the 40 scale factor bands on the low-frequency side of one frame (640 samples) may be selected and transmitted.
[0029]
On the other hand, in the present embodiment, the MDCT 321 outputs two frames of spectrum data (1024 low-frequency and 1024 high-frequency spectrum data) at a cycle twice as long as the conventional one (about 45.4 msec). For this reason, when the transfer rate of the transmission path is 96 kbps, even if all the scale factor bands (1024 samples) of the low-frequency part of the two frames are transmitted, there is a sufficient margin in the transfer rate compared with transferring two frames of data by conventional AAC (640 × 2 = 1280 samples). Therefore, in the present embodiment, a case will be described in which the grouping unit 322 classifies the converted spectrum data into scale factor bands using a uniquely defined division method and number of bands.
[0030]
The data separation unit 330 separates the 2048 spectrum data output from the conversion unit 320 into low-frequency spectrum data (1024 points) and high-frequency spectrum data (1024 points). The data separation unit 330 then outputs the separated low-frequency spectrum data (1024 points) to the first quantization unit 340 and the high-frequency spectrum data (1024 points) to the second quantization unit 345.
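The separation can be illustrated by mapping the boundary frequency to a coefficient index. This is a sketch under the assumption that the 2048 coefficients are spaced uniformly from 0 Hz up to half the sampling frequency; the function name `separate_spectrum` and its parameters are hypothetical.

```python
def separate_spectrum(spectrum, boundary_hz=11025.0, sampling_hz=44100.0):
    """Split spectrum data into low- and high-frequency parts at boundary_hz.

    Assumption: coefficient k covers frequencies around
    k * (sampling_hz / 2) / len(spectrum).
    """
    upper_limit_hz = sampling_hz / 2.0  # 22.05 kHz for 44.1 kHz sampling
    split = int(len(spectrum) * boundary_hz / upper_limit_hz)
    return spectrum[:split], spectrum[split:]
```

With 2048 coefficients and the values used in the embodiment, the split index is 1024, giving 1024 low-frequency and 1024 high-frequency points.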
[0031]
The first quantization unit 340 determines a scale factor for each scale factor band of the low-frequency spectrum data transferred from the data separation unit 330, quantizes the spectrum in each scale factor band with the determined scale factor, and outputs the resulting quantized values, the first scale factor, and the differences between successive scale factors to the first encoding unit 350. It includes a scale factor calculation unit 341. For each scale factor band, the scale factor calculation unit 341 calculates, for example according to a formula, one normalization coefficient (scale factor, 8 bits) such that the spectrum data in the band fits within a predetermined number of bits, quantizes each spectrum in the band using that scale factor, and calculates the scale factor differences.
[0032]
The first encoding unit 350 encodes the data quantized by the first quantization unit 340, the scale factor of each scale factor band, and the like into a predetermined stream format, and includes a Huffman coding table 351 for further compressing each piece of data and each scale factor. Specifically, it Huffman-codes each quantized value, each scale factor, and the like using the Huffman coding table 351 so that they can be transmitted at a low bit rate.
[0033]
The second quantization unit 345 calculates auxiliary information based on the spectrum data of the band not quantized by the first quantization unit 340, that is, the high-frequency data above 11.025 kHz output from the data separation unit 330, and includes an auxiliary information generation unit 346 that generates the auxiliary information. Here, the auxiliary information is simplified information that is calculated from the high-frequency spectrum data and concisely represents the characteristics of that data with a small amount of information. In other words, it represents the characteristics of the high-frequency part of the spectrum data obtained by converting the acoustic data input over a certain period of time. A specific example is, for each scale factor band of the high-frequency part, the scale factor that makes the quantized value of the absolute maximum spectrum data (the spectrum data with the largest absolute value) in that band equal to “1”, together with its quantized value.
[0034]
The second encoding unit 355 encodes the auxiliary information output from the second quantization unit 345 into a predetermined stream format and outputs it as a second encoded signal. It includes a Huffman coding table 356.
[0035]
The stream output unit 390 adds header information and other sub information as necessary to the first encoded signal output from the first encoding unit 350, and converts the first encoded signal into an AAC encoded bit stream as usual. The second encoded signal output from the second encoding unit 355 is stored in an area in the bit stream that is ignored by the conventional decoding device or whose operation is not defined. Specifically, the stream output unit 390 stores the encoded signal output from the second encoding unit 355 in a Fill Element, a Data Stream Element, or the like in an AAC encoded bit stream.
[0036]
However, the information indicating the sampling frequency of the bit stream, which is stored in the header information, is set to half the sampling frequency of the acoustic data. That is, when the sampling frequency of the acoustic data is 44.1 kHz, the header stores 22.05 kHz, half the actual value. Information indicating the actual sampling frequency of 44.1 kHz may be stored in the area where the auxiliary information is stored.
[0037]
The bit stream output from the encoding device 300 is transmitted to the decoding device 400 via a transmission medium such as the Internet configured by radio waves, light, metal, or the like.
[0038]
Thus, when quantizing and encoding the spectrum data on the frequency axis obtained by the conversion unit 320, the encoding device 300 uses the data separation unit 330 to separate it into low-frequency spectrum data (1024 points) and high-frequency spectrum data (1024 points), quantizes and encodes the low-frequency spectrum data in the same manner as before, quantizes and encodes the high-frequency spectrum data by a different method (generating and encoding auxiliary information), and outputs the high-frequency encoded bit stream combined into the low-frequency encoded bit stream. In this respect it differs significantly from the conventional encoding device 1000, which applies the same method over the entire band.
[0039]
As a result, it is possible to encode a high-quality acoustic signal within a range where the total amount of information does not increase significantly compared to the conventional case.
In addition, since the sampling frequency stored in the header is 22.05 kHz, the conventional decoding device 2000 can also decode the bit stream generated by the encoding device 300 of the present embodiment.
[0040]
(Decoding device 400)
The decoding device 400 according to the present embodiment performs processing substantially the reverse of that of the encoding device 300 on the bit stream output from the encoding device 300 to generate an acoustic signal on the time axis (reproduction upper limit frequency 22.05 kHz). It includes a stream input unit 410, first and second decoding units 420 and 425, first and second inverse quantization units 430 and 435, an inverse quantized data synthesis unit 440, an inverse transform unit 480, an acoustic data output unit 490, and a D/A converter 495.
[0041]
The stream input unit 410 receives the bit stream encoded by the encoding device 300 via the transmission medium, extracts from it the first encoded signal, stored in the area used by the conventional decoding device, and the second encoded signal, stored in an area that the conventional decoding device ignores or for which its operation is not defined, and outputs them to the first decoding unit 420 and the second decoding unit 425, respectively.
[0042]
The first decoding unit 420 receives the first encoded signal output from the stream input unit 410 and decodes it from the stream format into quantized data. It includes a Huffman decoding table 421 used for the decoding.
[0043]
The first inverse quantization unit 430 dequantizes the quantized data decoded by the first decoding unit 420 and outputs spectrum data. It includes a processing unit 431 that performs the dequantization based on a formula. Here, the number of samples of the spectrum data output from the first inverse quantization unit 430 is 1024, and these represent a reproduction band of 11.025 kHz or less.
[0044]
The second decoding unit 425 receives the second encoded signal output from the stream input unit 410 and decodes the auxiliary information. It includes a Huffman decoding table 426 used for decoding the auxiliary information.
[0045]
The second inverse quantization unit 435 generates high-frequency spectrum data based on the auxiliary information, and includes a spectrum data generation unit 436. Here, the number of samples of the spectrum data output from the second inverse quantization unit 435 is 1024, and these represent a reproduction band exceeding 11.025 kHz.
[0046]
The spectrum data generation unit 436 generates noise by a predetermined procedure based on the spectrum data output from the first inverse quantization unit 430, shapes that noise based on the auxiliary information output from the second decoding unit 425, and outputs high-frequency spectrum data. The noise here includes white noise, pink noise, and the like, as well as spectrum data obtained by copying part or all of the low-frequency spectrum data.
[0047]
Specifically, the spectrum data generation unit 436, for example, copies the low-frequency spectrum data output from the first inverse quantization unit 430 to the high-frequency part. Then, for each scale factor band of the high-frequency part, it takes as a coefficient the ratio between the absolute maximum value of the spectrum data copied into the band and the value obtained by dequantizing the quantized value “1” with the scale factor for that band given in the auxiliary information, and multiplies each spectrum data in the band by this coefficient to restore the high-frequency spectrum.
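The copy-and-scale restoration can be sketched as below. The text refers to a dequantization formula without stating it, so `dequantize` here is a hypothetical AAC-style inverse (magnitude to the power 4/3, times a gain of 2^(sf/4)); the function names and the representation of bands as (start, end) index pairs are also assumptions.

```python
def dequantize(q, sf):
    """Hypothetical AAC-style inverse quantizer (an assumption, not from the text)."""
    sign = 1.0 if q >= 0 else -1.0
    return sign * (abs(q) ** (4.0 / 3.0)) * 2.0 ** (sf / 4.0)


def restore_high_band(low_spectrum, bands, scale_factors):
    """Copy the low band upward, then rescale each high-band scale factor band.

    bands: (start, end) index pairs within the high-frequency part.
    scale_factors: one decoded scale factor per band (the auxiliary information).
    """
    high = list(low_spectrum)  # copy of the low-frequency spectrum data
    for (start, end), sf in zip(bands, scale_factors):
        peak = max(abs(x) for x in high[start:end])
        target = dequantize(1, sf)  # value that the band's peak should take
        gain = target / peak if peak > 0 else 0.0
        for i in range(start, end):
            high[i] *= gain
    return high
```

After the call, the largest absolute value in each band equals the dequantized value of “1” for that band, while the shape within the band is that of the copied low-frequency data.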
[0048]
The inverse quantized data synthesizer 440 synthesizes the spectrum data output from the first inverse quantizer 430 and the spectrum data output from the second inverse quantizer 435. Here, the number of samples of spectrum data output from the inverse quantized data combining unit 440 is 2048, which represents a reproduction band of 0 to 22.05 kHz.
[0049]
As described above, the decoding device 400 separates, from the bit stream encoded by the encoding device 300, the first (low-frequency) encoded signal, stored in the area used by the conventional decoding device, and the second (high-frequency) encoded signal, stored in an area that the conventional decoding device ignores or for which its operation is not defined. It decodes and dequantizes the first (low-frequency) encoded signal in the same manner as before, decodes and dequantizes the second (high-frequency) encoded signal by a method different from the conventional one, and combines and outputs the low-frequency and high-frequency spectrum data. In these respects its configuration differs greatly from the decoding devices 2000 of conventional examples 1 and 2, which decode and dequantize the bit stream in the same manner over the entire band.
[0050]
As a result, a high-quality acoustic signal carrying significantly more information than before can be decoded from about the same small amount of information as before.
The inverse transform unit 480 transforms the spectrum data on the frequency axis output from the inverse quantized data synthesis unit 440 into acoustic data on the time axis of 2048 samples (2 frames) using IMDCT.
[0051]
The acoustic data output unit 490 sequentially combines the acoustic data of 2048 samples on the time axis obtained by the inverse transform unit 480, and outputs the acoustic data of 2048 samples one by one in time order.
[0052]
The D/A converter 495 converts digital acoustic data into an analog acoustic signal at a sampling frequency of 44.1 kHz.
As described above, the decoding device 400 differs greatly from the decoding device 2000 of conventional example 1 in that the unit of inverse transformation in the inverse transform unit 480 is doubled (to 2048 samples), the frame length in the acoustic data output unit 490 is doubled (to 2048 samples), and the sampling frequency of the D/A converter 495 is doubled (to 44.1 kHz).
[0053]
As a result, the D/A converter 495 outputs a wide-band (0 to 22.05 kHz), high-quality acoustic signal based on the low-frequency spectrum data (1024 points) of 11.025 kHz or less and the high-frequency spectrum data (1024 points).
[0054]
As described above, according to the functional configuration of the present embodiment, the low-frequency part is encoded conventionally while the high-frequency part is encoded with an extremely small amount of information, so that a high-quality acoustic signal can be encoded and decoded with a total amount of information that does not increase significantly compared with the conventional case.
[0055]
In addition, the encoding device 300 of the present embodiment is obtained merely by adding the data separation unit 330, the second quantization unit 345, and the second encoding unit 355 to the conventional encoding device 1000, and the decoding device 400 merely by adding the second decoding unit 425, the second inverse quantization unit 435, and the inverse quantized data synthesis unit 440 to the conventional decoding device 2000. They can therefore be realized without significantly changing the configurations of the existing encoding device 1000 and decoding device 2000.
[0056]
In addition, the bit stream generated by the encoding apparatus 300 according to the present embodiment can be decoded by the conventional decoding apparatus 2000.
Next, the encoding process of each unit of the encoding device 300 in the broadcast system 1 will be specifically described.
[0057]
FIG. 2 is a diagram illustrating how the acoustic signal changes as it is processed in the acoustic data input unit 310 and the conversion unit 320 of the encoding device 300 illustrated in FIG. 1. FIG. 2(a) is a waveform diagram showing the 2048 samples of data on the time axis cut out by the acoustic data input unit 310 shown in FIG. 1, and FIG. 2(b) is a waveform diagram showing the spectrum data on the frequency axis obtained when the sample data on the time axis is converted by the MDCT 321 of the conversion unit 320 shown in FIG. 1. In FIG. 2(a) and FIG. 2(b), the sample data and spectrum data are drawn as analog waveforms, but both are in fact digital signals. The same applies to the following waveform diagrams.
[0058]
The acoustic data input unit 310 receives acoustic data sampled at 44.1 kHz. Each time 2048 samples of acoustic data are input, the acoustic data input unit 310 cuts them out together with the overlapping 1024 samples before and after them, and outputs the result to the conversion unit 320.
[0059]
The conversion unit 320 applies the MDCT to the total of 4096 samples; because the spectrum obtained by the MDCT is symmetrical, it outputs only half of it, the spectrum data corresponding to 2048 samples, as shown in FIG. 2(b).
[0060]
In the spectrum data shown in FIG. 2(b), the vertical axis indicates the frequency spectrum value, that is, the amount (magnitude) of each frequency component of the acoustic data represented by the voltage values of the 2048 samples in FIG. 2(a), at 2048 points corresponding to the number of samples. Since the acoustic signal input to the encoding device 300 is A/D converted at a sampling frequency of 44.1 kHz, the reproduction band of the spectrum data is 22.05 kHz. Furthermore, since the spectrum obtained by the MDCT 321 may take negative values as shown in FIG. 2(b), the positive or negative sign of each spectrum value must also be encoded when the spectrum is encoded. Hereinafter, to avoid confusion with the codes produced by encoding, information representing the positive and negative signs of the spectrum data is referred to as “sign information”.
[0061]
The spectrum data and sign information output from the conversion unit 320 are separated by the data separation unit 330 into a low-frequency part of 0 to 11.025 kHz and a high-frequency part exceeding 11.025 kHz. The low-frequency spectrum data is output to the first quantization unit 340, and the high-frequency spectrum data is output to the second quantization unit 345.
[0062]
FIG. 3 is a flowchart showing the operation of the first quantization unit 340 shown in FIG. 1 in the scale factor determination process.
The first quantization unit 340 first determines a scale factor common to all scale factor bands as the initial value of the scale factor (S91), quantizes with that scale factor all the low-frequency spectrum data to be transmitted as one frame of acoustic data (1024 samples), obtains the differences between successive scale factors, and Huffman-codes the differences, the first scale factor, and each quantized value (S92). Note that the quantization and encoding here are performed only to count the number of bits; to simplify processing, only the data is processed and information such as a header is not added.
[0063]
Next, the first quantization unit 340 determines whether the number of bits of the Huffman-coded data exceeds a predetermined number of bits (S93). If it does, the unit lowers the initial value of the scale factor (S101), quantizes and Huffman-codes the same low-frequency spectrum data again with that scale factor value (S92), and again determines whether the number of bits of the Huffman-coded low-frequency data for one frame exceeds the predetermined number of bits (S93). This process is repeated until the number of bits is equal to or less than the predetermined number.
[0064]
If the number of bits of the low band encoded data does not exceed the predetermined number of bits, the first quantization unit 340 repeats the following processing for each scale factor band, and determines the scale factor of each scale factor band. (S94). First, each quantized value in the scale factor band is inversely quantized (S95), and the difference between each absolute value between each inverse quantized value and the corresponding original spectrum data is obtained and summed (S96). Further, it is determined whether or not the sum of the obtained differences is within the allowable range (S97), and if it is within the allowable range, the above processing is repeated for the next scale factor band (S94 to S98). .
[0065]
On the other hand, if the allowable range is exceeded, the scale factor value is increased and the spectrum data of the scale factor band is quantized (S100), and the quantized value is inversely quantized (S95). The difference in absolute value between the value and the corresponding spectrum data is summed (S96). Further, it is determined whether or not the sum of the differences is within the allowable range (S97), and if it exceeds the allowable range, the scale factor is sequentially increased until it is within the allowable range (S100), and the above processing (S95 to S97 and S100) is repeated.
[0066]
When the first quantization unit 340 has determined, for all scale factor bands, a scale factor for which the sum of the absolute differences between the dequantized values and the original spectrum data in the band falls within the allowable range (S98), it quantizes the low-frequency spectrum data for one frame again using the determined scale factors, Huffman-codes the scale factor differences, the first scale factor, and each quantized value, and determines whether the number of bits of the low-frequency encoded data exceeds the predetermined number of bits (S99). If it does, the unit lowers the initial value of the scale factor (S101) and repeats the process of determining the scale factor of each scale factor band (S94 to S98) until the number of bits is equal to or less than the predetermined number. If the number of bits of the low-frequency encoded data does not exceed the predetermined number of bits (S99), the value of each scale factor at that time is adopted as the scale factor of the corresponding scale factor band.
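The flow of FIG. 3 as described in the text (S91 to S101) can be sketched roughly as follows. Several parts are assumptions introduced only to make the sketch runnable: the quantizer is a hypothetical one in which a larger scale factor gives a finer step, chosen so that lowering the scale factor reduces the bit count as the text describes; `estimate_bits` is a crude stand-in for the Huffman-coded size; and the `sf_min` guard, which stops the search when the two limits cannot both be met, is not part of the described flow.

```python
def quantize(x, sf):
    # Hypothetical quantizer: a larger scale factor gives a finer step.
    return round(x * 2.0 ** (sf / 4.0))


def dequantize(q, sf):
    return q * 2.0 ** (-sf / 4.0)


def estimate_bits(spectrum, bands, sfs):
    # Stand-in for the Huffman bit count: 8 bits per scale factor plus
    # magnitude bits and one sign bit per quantized value.
    bits = 8 * len(bands)
    for (start, end), sf in zip(bands, sfs):
        for x in spectrum[start:end]:
            bits += abs(quantize(x, sf)).bit_length() + 1
    return bits


def band_error(spectrum, band, sf):
    # S95, S96: dequantize and sum the absolute differences within the band.
    start, end = band
    return sum(abs(x - dequantize(quantize(x, sf), sf))
               for x in spectrum[start:end])


def determine_scale_factors(spectrum, bands, bit_budget, max_error,
                            sf_init=32, sf_min=-64):
    sf0 = sf_init  # S91: common initial scale factor
    while sf0 >= sf_min:
        # S92, S93, S101: lower the common value until the bit count fits.
        while sf0 > sf_min and estimate_bits(
                spectrum, bands, [sf0] * len(bands)) > bit_budget:
            sf0 -= 1
        sfs = [sf0] * len(bands)
        # S94 to S98, S100: raise each band's value until its error is allowed.
        for i, band in enumerate(bands):
            while band_error(spectrum, band, sfs[i]) > max_error:
                sfs[i] += 1
        # S99: accept if the re-quantized data still fits the bit budget.
        if estimate_bits(spectrum, bands, sfs) <= bit_budget:
            return sfs
        sf0 -= 1  # S101
    raise ValueError("no scale factors satisfy both limits")
```

In the embodiment the allowable error would come from data such as a psychoacoustic model; here `max_error` is simply a positive number supplied by the caller.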
[0067]
The first quantization unit 340 quantizes the low-frequency spectrum data using the scale factors determined in this way, and outputs the resulting quantized values, the first scale factor, and the differences between successive scale factors, together with the sign information received from the data separation unit 330, to the first encoding unit 350.
[0068]
Whether the sum of the absolute differences between the original spectrum data in a scale factor band and the dequantized values of its quantized values falls within the allowable range is determined based on data such as a psychoacoustic model.
[0069]
Also, here, the initial value of the scale factor is set to a relatively large value, and when the number of bits of the low band encoded data after Huffman coding exceeds a predetermined number of bits, the scale factor is sequentially changed. Although the scale factor is determined by a method of lowering the initial value, it is not always necessary to do so. For example, when the initial value of the scale factor is set to a low value in advance, the initial value is gradually increased, and the total number of bits of the low-frequency encoded data first exceeds the predetermined number of bits. Thus, the scale factor of each scale factor band may be determined using the initial value of the scale factor set immediately before.
[0070]
Furthermore, although the scale factor of each scale factor band is determined here so that the number of bits of the entire low-frequency encoded data for one frame does not exceed a predetermined number of bits, this need not necessarily be done. For example, in each scale factor band, the scale factor may be determined so that each quantized value in the scale factor band does not exceed a predetermined number of bits. Hereinafter, the operation of the first quantization unit 340 in this process will be described with reference to FIG.
[0071]
FIG. 4 is a flowchart showing the operation of the first quantization unit 340 shown in FIG. 1 in another scale factor determination process.
The first quantizing unit 340 calculates scale factors according to the following procedure for all the scale factor bands in the low frequency range to be encoded (S1). Further, the first quantization unit 340 calculates the scale factor according to the following procedure for all spectrum data in each scale factor band (S2).
[0072]
First, the first quantization unit 340 quantizes the spectrum data based on a formula using a predetermined scale factor value (S3), and determines whether the quantized value exceeds the predetermined number of bits used to represent a quantized value, for example, 4 bits (S4).
[0073]
As a result of the determination, if the quantized value exceeds 4 bits, the scale factor value is adjusted (S8), and the same spectrum data is quantized with the adjusted scale factor value (S3). The first quantization unit 340 determines whether or not the obtained quantized value exceeds 4 bits (S4), and adjusts the scale factor until the quantized value of the spectrum data becomes a value of 4 bits or less. (S8) and quantization by the scale factor after adjustment (S3) are repeated.
[0074]
As a result of the determination, if the quantized value is 4 bits or less, the next spectrum data is quantized with a value of a predetermined scale factor (S3).
When the quantized values of all the spectrum data in one scale factor band are 4 bits or less (S5), the first quantization unit 340 sets the scale factor value at that time as the scale factor of that scale factor band (S6).
[0075]
Furthermore, when the first quantizing unit 340 determines scale factors for all scale factor bands (S7), the process is terminated.
With the above processing, one scale factor is determined for each of the scale factor bands in the low-frequency part to be encoded. The first quantization unit 340 quantizes the low-frequency spectrum data using the scale factors determined in this way, and outputs the 4-bit quantized values obtained as the quantization result, the 8-bit scale factor, and the differences between the scale factors to the first encoding unit 350 together with the sign information received from the data separation unit 330.
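The per-band procedure of FIG. 4 (S1 to S8) can be sketched as follows. The text does not give the quantization formula or the direction of the adjustment in S8, so this sketch assumes an AAC-style quantizer in which a larger scale factor gives a coarser step, and adjusts by increasing the scale factor one step at a time. The 4-bit limit is applied to the magnitude only, on the assumption that the sign is carried separately as sign information. All function names are hypothetical.

```python
def quantize(x, sf):
    # Hypothetical AAC-style quantizer: larger sf means a coarser step.
    # 0.4054 is a rounding offset borrowed from AAC-style quantization.
    return int((abs(x) * 2.0 ** (-sf / 4.0)) ** 0.75 + 0.4054)


def scale_factor_for_band(band_spectrum, max_bits=4, sf_start=0):
    limit = (1 << max_bits) - 1  # largest magnitude that fits in max_bits
    sf = sf_start
    for x in band_spectrum:             # S2: every spectrum value in the band
        while quantize(x, sf) > limit:  # S3, S4: quantize and check the size
            sf += 1                     # S8: adjust the scale factor
    return sf                           # S6: scale factor of this band


def scale_factors_fig4(spectrum, bands):
    # S1, S7: repeat for every scale factor band given as (start, end).
    return [scale_factor_for_band(spectrum[start:end]) for start, end in bands]
```

Because the scale factor only grows and the quantized magnitude never increases as it grows, values checked earlier in a band remain within the limit.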
[0076]
The quantized value, scale factor, and the like output to the first encoding unit 350 are Huffman encoded and output to the stream output unit 390 as a first encoded signal equivalent to the case of downsampling.
[0077]
On the other hand, the second quantizing unit 345 generates auxiliary information based on the spectral data of the high frequency region.
FIG. 5 is a spectrum waveform diagram showing a specific example of the auxiliary information (scale factor) generated by the second quantization unit 345 shown in FIG. 1, and FIG. 6 is a flowchart showing the operation of the second quantization unit 345 shown in FIG. 1 in the auxiliary information (scale factor) calculation process.
[0078]
In FIG. 5, the delimiters shown on the low-frequency part frequency axis represent the delimiters of the scale factor bands defined in the present embodiment. In addition, a break indicated by a broken line in the frequency direction in the high frequency part indicates a scale factor band break defined in the present embodiment. The same applies to the following waveform diagrams.
[0079]
Of the spectrum data output from the conversion unit 320, the low-frequency part with a reproduction band of 11.025 kHz or less, indicated by the solid waveform in FIG. 5, is output to the first quantization unit 340 and quantized as usual. On the other hand, the high-frequency part from above 11.025 kHz up to 22.05 kHz, indicated by the broken-line waveform in FIG. 5, is represented by the auxiliary information (scale factor) calculated by the second quantization unit 345.
[0080]
Hereinafter, the procedure for calculating auxiliary information (scale factor) of the second quantizing unit 345 will be described with reference to the flowchart of FIG. 6 using the specific example of FIG.
For every scale factor band in the high-frequency part, from above 11.025 kHz up to 22.05 kHz, the second quantization unit 345 calculates, according to the following procedure, the scale factor that makes the quantized value of the absolute maximum spectrum data in that band equal to “1” (S11).
[0081]
The second quantization unit 345 identifies the absolute maximum spectrum data (peak) in the first scale factor band of the high-frequency part exceeding 11.025 kHz (S12). In the specific example of FIG. 5, the peak identified in the first scale factor band is at position (1) and its value is “256”.
[0082]
As in the procedure shown in the flowchart of FIG. 4, the second quantization unit 345 substitutes the peak value “256” and an initial scale factor value into the formula for calculating the quantized value, and calculates the value of the scale factor sf for which the quantized value obtained from the formula is “1” (S13). In this case, it calculates a scale factor value that makes the quantized value of the peak value “256” equal to “1”, for example, sf = 24.
[0083]
When the scale factor value sf = 24 for setting the peak quantized value to “1” is obtained for the first scale factor band (S14), the second quantizing unit 345 determines the spectrum for the next scale factor band. The peak of the data is specified (S12). For example, when the specified peak position is (2) and the value is “312”, the quantized value of the peak value “312” is “1”. A value of the scale factor sf, for example, sf = 32 is calculated (S13).
[0084]
Similarly, for the third scale factor band in the high-frequency part, the second quantization unit 345 calculates a scale factor value, for example sf = 26, that makes the quantized value of the value “288” of peak (3) equal to “1”, and for the fourth scale factor band it calculates a scale factor value, for example sf = 18, that makes the quantized value of the value “203” of peak (4) equal to “1”.
[0085]
In this way, when the scale factor that sets the quantized value of the peak to “1” has been calculated for all the scale factor bands in the high-frequency part (S14), the second quantization unit 345 outputs the calculated scale factor of each scale factor band to the second encoding unit 355 as auxiliary information of the high-frequency part, and the process ends.
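The loop of steps S11 to S14 can be sketched as follows. This is only an illustrative sketch: the quantization formula of FIG. 4 is not reproduced in this passage, so `quantize` below assumes an AAC-style power-law quantizer, and the function names and the search starting at `sf_init = 0` are likewise assumptions. The scale factors it produces therefore need not match the illustrative values (24, 32, 26, 18) given above.

```python
def quantize(value, sf):
    """Assumed AAC-style quantizer (not the patent's own formula)."""
    return int((abs(value) * 2.0 ** (-sf / 4.0)) ** 0.75 + 0.4054)


def scale_factor_for_unit_peak(band, sf_init=0, sf_max=255):
    """Return the smallest scale factor at which the band peak quantizes to at most 1."""
    peak = max(abs(x) for x in band)            # S12: find the absolute-maximum value
    for sf in range(sf_init, sf_max + 1):       # S13: raise sf until the peak maps to 1
        if quantize(peak, sf) <= 1:
            return sf
    raise ValueError("no scale factor in range brings the peak down to 1")


def high_band_scale_factors(bands):
    """S14: repeat for every scale factor band of the high-frequency part."""
    return [scale_factor_for_unit_peak(band) for band in bands]


# Four hypothetical bands whose peaks follow the example of FIG. 5.
bands = [[10, -256, 3], [312, 5], [-1, 288], [203, -7, 0]]
scale_factors = high_band_scale_factors(bands)
```

Only one small integer per band is kept, which is why the auxiliary information is so much smaller than the quantized spectrum itself.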
[0086]
As described above, the auxiliary information (scale factors) is generated by the second quantization unit 345. This auxiliary information represents the high-frequency part, which consists of 1024 points of spectrum data, with one scale factor for each scale factor band. If each scale factor value is represented by a value from 0 to 255, it can be expressed with 8 bits per scale factor band (four bands in this example) in the high-frequency part. If the differences between the scale factors are Huffman-encoded, the amount of data may be reduced further. By contrast, if the 1024 points of spectrum data in the high-frequency part were quantized and Huffman-encoded by the conventional method, as in the low-frequency part, the data amount is predicted to be at least about 300 bits. Thus, although this auxiliary information gives only one scale factor for each scale factor band in the high-frequency part, the amount of data is greatly reduced compared with quantizing the high-frequency part by the conventional method.
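The bit count and the differencing step mentioned above can be illustrated with a short sketch. The Huffman stage itself is omitted, and taking the first difference relative to 0 is an assumption made here for illustration; the scale factor values are the example values of FIG. 5.

```python
def delta_encode(values):
    """Replace each value by its difference from the previous one (first relative to 0)."""
    deltas, prev = [], 0
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    values, prev = [], 0
    for d in deltas:
        prev += d
        values.append(prev)
    return values


scale_factors = [24, 32, 26, 18]          # example values for the four bands
raw_bits = 8 * len(scale_factors)         # 8 bits per band without differencing
deltas = delta_encode(scale_factors)      # small differences suit a Huffman code
```

With four bands the uncoded auxiliary information occupies 32 bits, against the roughly 300 bits estimated above for conventionally coding the same 1024 points.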
[0087]
The scale factor is a value substantially proportional to the peak value (absolute value) in each scale factor band. Spectrum data obtained by multiplying the scale factor with either spectrum data that takes a constant value over the 1024 points of the high-frequency part, or a copy of part or all of the spectrum data of the low-frequency part, can therefore be said to roughly restore the spectrum data obtained from the input acoustic signal. Moreover, for each scale factor band, the spectrum data can be restored more accurately by multiplying each spectrum data in the band by a coefficient, namely the ratio between the absolute maximum value of the spectrum data copied into the band and the value obtained by dequantizing the quantized value “1” with the scale factor value for that band. Furthermore, because differences in the waveform of the high-frequency part are not perceived as clearly as those in the low-frequency part, the auxiliary information obtained in this way can be said to be sufficient as information representing the waveform of the high-frequency part.
[0088]
Here, the scale factor is calculated so that the quantized value of the absolute-maximum spectrum data in each scale factor band in the high-frequency part is “1”. However, the quantized value need not be “1” and may be set to another value.
[0089]
The auxiliary information generated by the second quantization unit 345 is Huffman-encoded by the second encoding unit 355 and is stored by the stream output unit 390, as the second encoded signal, in an area of the bit stream that a conventional decoding apparatus ignores or for which its operation is not specified.
[0090]
FIG. 7 is a diagram illustrating a position in the bit stream in which auxiliary information is stored by the stream output unit 390 illustrated in FIG. In FIG. 7, after the auxiliary information representing the spectrum of the high frequency band is encoded, it is stored as a second encoded signal in a region that is not recognized as an acoustic encoded signal in the bit stream.
[0091]
In FIG. 7A, the hatched portion is, for example, an area (Fill Element) that is filled with “0” in order to adjust the data length of the bit stream. Even if the auxiliary information, that is, the second encoded signal, is stored in this area, the conventional decoding apparatus 2000 does not recognize it as an encoded signal to be decoded and ignores it.
[0092]
In FIG. 7B, the hatched portion is, for example, an area called a Data Stream Element (DSE). For future expansion, the AAC standard defines only the physical structure of this area, such as its bit length. As with the Fill Element, even if auxiliary information representing the spectrum of the high-frequency part is stored here, the area is either ignored by the conventional decoding apparatus 2000 or, even if the data is read, the operation of the decoding apparatus 2000 on the read data is not defined.
[0093]
In the above description, the second encoded signal is stored in an area of the bit stream that is ignored by the conventional decoding apparatus 2000 conforming to the MPEG-2 AAC standard. However, the second encoded signal may instead be incorporated at a predetermined position in the header information or in the first encoded signal, or may be incorporated across both. In addition, a continuous area need not be secured in the header information or the first encoded signal in order to store the second encoded signal in the bit stream. That is, as shown in FIG. 7C, the second encoded signal may be incorporated discontinuously in the header information and the first encoded signal.
[0094]
FIG. 8 is a diagram illustrating another example when the stream output unit 390 illustrated in FIG. 1 stores auxiliary information. FIG. 8A shows a stream 1 in which only the first encoded signal is stored continuously for each frame. FIG. 8B shows the stream 2 in which only the second encoded signal in which the auxiliary information is encoded is continuously stored for each frame corresponding to the stream 1.
[0095]
The stream output unit 390 may store the second encoded signal in a stream 2 that is completely different from the stream 1 that is a bit stream in which the first encoded signal is stored. For example, stream 1 and stream 2 are bit streams transmitted on different channels.
[0096]
In this way, by transmitting the first encoded signal and the second encoded signal in completely separate bit streams, the low-frequency part, which represents the basic information of the input acoustic signal, can be transmitted or accumulated in advance, and the high-frequency information can be added later if necessary.
[0097]
In the formats shown in FIGS. 7 and 8, the sampling frequency information stored in the header of the bit stream is 22.05 kHz, which is half of the actual sampling frequency. As a result, the decoding apparatus 2000 of Conventional Example 1 can also decode the bit stream in the 0 to 11.025 kHz band and reproduce it in the same manner as in the case of downsampling.
[0098]
Next, differences between the scheme of encoding apparatus 300 according to the embodiment of the present invention and the scheme of encoding apparatus 1000 of Conventional Example 1 will be described with reference to FIG. FIG. 9 is a diagram showing a comparison between the method according to the embodiment and the method according to Conventional Example 1 (downsampling method), and particularly FIG. 9A is a diagram showing the method according to the embodiment. FIG. 9B is a diagram showing a method according to Conventional Example 1.
[0099]
In the method according to the present embodiment, the sampling frequency is set to 44.1 kHz, so acoustic data is acquired every 22.7 μsec. A total of 4096 samples, namely the 2048 samples of the frame to be encoded plus 1024 samples before and 1024 samples after it, are cut out, and 2048 pieces of spectrum data are obtained by MDCT. This spectrum data represents a reproduction band up to 22.05 kHz. The 2048 pieces of spectrum data are separated, with 11.025 kHz as the boundary, into 1024 pieces of spectrum data in the low-frequency part and 1024 pieces in the high-frequency part. The low-frequency spectrum data (1024 pieces) is quantized and encoded as usual, yielding a first encoded signal with the same high quality and low bit rate as in the case of downsampling. For the high-frequency part, 1024 pieces of spectrum data are likewise acquired, but a low bit rate cannot be achieved if they are quantized and encoded in the usual way. Therefore, in the method according to the present embodiment, auxiliary information is generated from the 1024 pieces of spectrum data in the high-frequency part, and only this auxiliary information is encoded to obtain the second encoded signal. A high-quality acoustic signal can thus be encoded without a significant increase in the total amount of information.
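The frame arithmetic of the present embodiment can be checked with a short sketch. The constant and function names are assumptions introduced here for illustration, and the MDCT itself is not implemented; only the counts and the split at the 11.025 kHz boundary are shown.

```python
SAMPLING_HZ = 44100          # sampling frequency of the embodiment
WINDOW = 4096                # 2048 frame samples plus 1024 on each side
N_COEFFS = WINDOW // 2       # an MDCT of 4096 inputs yields 2048 coefficients
BOUNDARY_HZ = 11025.0        # boundary between low and high parts

sampling_period_usec = 1e6 / SAMPLING_HZ   # about 22.7 microseconds


def split_spectrum(spectrum, sampling_hz=SAMPLING_HZ, boundary_hz=BOUNDARY_HZ):
    """Split spectrum data covering 0..sampling_hz/2 at boundary_hz."""
    nyquist = sampling_hz / 2.0
    boundary_index = int(len(spectrum) * boundary_hz / nyquist)
    return spectrum[:boundary_index], spectrum[boundary_index:]


spectrum = [0.0] * N_COEFFS                 # placeholder for real MDCT output
low_part, high_part = split_spectrum(spectrum)
```

Because the boundary is exactly half of the 22.05 kHz reproduction band, the split falls at coefficient index 1024 and both parts hold 1024 values.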
[0100]
On the other hand, in the downsampling method of Conventional Example 1, the sampling frequency is set to 22.05 kHz, so acoustic data is acquired every 45 μsec. A total of 2048 samples, namely the 1024 samples of the frame to be encoded plus 512 samples before and 512 samples after it, are cut out, and MDCT is performed to obtain 1024 pieces of spectrum data. This spectrum data represents a reproduction band up to 11.025 kHz. The 1024 pieces of spectrum data are quantized and encoded as usual. Therefore, although a high-quality encoded signal can be obtained at 11.025 kHz or less, there is no high-frequency spectrum data above 11.025 kHz, and an encoded signal for the high-frequency part cannot be obtained.
[0101]
Next, the difference between the method of encoding apparatus 300 according to the embodiment of the present invention and the method of encoding apparatus 1000 of Conventional Example 2 will be described with reference to FIG.
FIG. 10 is a diagram comparing the method according to the embodiment with the method according to Conventional Example 2; FIG. 10A shows the method according to the embodiment, and FIG. 10B shows the method according to Conventional Example 2. Since the method according to the present embodiment has been described above, its description is omitted here.
[0102]
In the sampling method of Conventional Example 2, the sampling frequency is set to 44.1 kHz as in the present embodiment, so acoustic data is acquired every 22.7 μsec. A total of 2048 samples, namely the 1024 samples of the frame to be encoded plus 512 samples before and 512 samples after it, are cut out, and MDCT is performed to obtain 1024 pieces of spectrum data. This spectrum data represents a reproduction band up to 22.05 kHz. The 1024 pieces of spectrum data are quantized and encoded as usual. That is, 1024 pieces of spectrum data (512 in the low-frequency part at or below 11.025 kHz and 512 in the high-frequency part above 11.025 kHz) are acquired at half the time interval of the present embodiment (about every 22.7 msec).
[0103]
Here, suppose that the encoding apparatus 1000 of Conventional Example 2 generates auxiliary information from the high-frequency spectrum data of 11.025 to 22.05 kHz, as in the embodiment of the present invention. In this case, if the number of bits available for quantization about every 22.7 msec is n and the number of bits used for the auxiliary information is m1, the 512 samples in the low-frequency part (0 to 11.025 kHz) must be quantized with (n−m1) bits. In the embodiment of the present invention, on the other hand, if the number of bits available for quantization about every 45.4 msec is 2 × n and the number of bits used for the auxiliary information is m2, the 1024 samples in the low-frequency part (0 to 11.025 kHz) need only be quantized with (2 × n−m2) bits.
[0104]
By the way, in AAC, it is generally known that high coding efficiency cannot be obtained unless a certain number of samples (threshold) or more are collected. In the case of 1024 samples of this embodiment, the threshold value is sufficiently exceeded.
[0105]
Therefore, quantizing 1024 samples with (2 × n−m2) bits as in the present embodiment yields higher encoding efficiency than quantizing 512 samples with (n−m1) bits as in Conventional Example 2. Moreover, because higher encoding efficiency is obtained in the present embodiment, m2 can be increased (m2 > 2 × m1), and the sound quality of the high-frequency part can be improved.
[0106]
FIG. 11 is a diagram comparing spectral data and characteristics of the coding method according to the present embodiment and the coding methods of the first and second conventional examples.
In the present embodiment, the sampling frequency is 44.1 kHz and the frame length is 2048. Accordingly, 1024 pieces of spectrum data are acquired for the low-frequency part of 0 to 11.025 kHz, and auxiliary information based on the 1024 pieces of spectrum data of the high-frequency part of 11.025 to 22.05 kHz is acquired. As a result, the bandwidth of the present embodiment is substantially the same as that of Conventional Example 2 and wider than that of Conventional Example 1. In terms of sound quality, compared with Conventional Example 1, the low-frequency part of 0 to 11.025 kHz has the same quality, while the high-frequency part of 11.025 to 22.05 kHz is covered by the auxiliary information, so the overall quality is higher. Compared with Conventional Example 2, the high-frequency part of 11.025 to 22.05 kHz has substantially the same quality because of the auxiliary information, while the low-frequency part of 0 to 11.025 kHz has higher quality because the number of spectrum data is doubled, so the overall quality is again higher.
[0107]
On the other hand, in Conventional Example 1, the sampling frequency is 22.05 kHz, the frame length is 1024, and 1024 pieces of spectrum data are acquired in the band of 0 to 11.025 kHz. As a result, the bandwidth of Conventional Example 1 is half that of the present embodiment. In terms of sound quality, the low-frequency part of 0 to 11.025 kHz has the same quality as in the present embodiment, but the high-frequency part of 11.025 to 22.05 kHz is degraded because there is no spectrum data at all, so the overall quality is lower.
[0108]
In Conventional Example 2, the sampling frequency is 44.1 kHz, the frame length is 1024, and 1024 pieces of spectrum data are acquired over the entire band from 0 to 22.05 kHz. As a result, Conventional Example 2 has the same bandwidth as the present embodiment. In terms of sound quality, the high-frequency part of 11.025 to 22.05 kHz is of high quality because its spectrum data is encoded, but the low-frequency part of 0 to 11.025 kHz is of lower quality because the number of spectrum data is halved, so the overall quality is lower.
[0109]
Therefore, according to the present embodiment, the low-frequency part is encoded conventionally and the high-frequency part is encoded with a very small amount of information, so a high-quality acoustic signal can be encoded without a significant increase in the total amount of information compared with the conventional case.
[0110]
Next, the decoding process of each unit of the decoding device 400 in the broadcast system 1 will be specifically described.
The first encoded signal output from the stream input unit 410 is decoded into quantized data and the like by the first decoding unit 420, and is dequantized by the first inverse quantization unit 430 into spectrum data of the low-frequency part. The second encoded signal output from the stream input unit 410, on the other hand, is decoded into auxiliary information by the second decoding unit 425. The second inverse quantization unit 435 generates spectrum data of the high-frequency part based on the auxiliary information. The processing in the second inverse quantization unit 435 will now be described in detail.
[0111]
FIG. 12 is a flowchart illustrating a procedure in which the 1024 spectra of the low-frequency part are copied to the high-frequency part in the forward direction by the second inverse quantization unit 435 illustrated in FIG. 1. This copy of the low-frequency spectrum data is executed when the spectrum data of the high-frequency part is generated.
[0112]
In FIG. 12, inv_spec1 [i] denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 430, and inv_spec2 [j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 435.
[0113]
First, because the 0th to 1023rd spectra are to be input in the same direction, the second inverse quantization unit 435 sets the initial values of the counters i and j, which count the spectra, to “0” (S71). Next, the second inverse quantization unit 435 checks whether the value of the counter i is less than “1024” (S72). If it is, the value of the i-th (here, the 0th) spectrum of the first inverse quantization unit 430 is input as the value of the j-th (here, the 0th) spectrum of the second inverse quantization unit 435 (S73). The second inverse quantization unit 435 then increments the counters i and j by “1” (S74) and again checks whether the value of the counter i is less than “1024” (S72).
[0114]
The second inverse quantization unit 435 repeats the above process while the value of the counter i is less than “1024”, and ends the process when the value of the counter i becomes “1024” or more.
[0115]
As a result, all of the 0th to 1023rd spectra of the low-frequency part, which are the inverse quantization result of the first inverse quantization unit 430, are copied as they are as the high-frequency spectra of the second inverse quantization unit 435.
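The forward copy of FIG. 12 can be sketched directly from steps S71 to S74. The array names follow inv_spec1 and inv_spec2 in the flowchart; the function name and the use of Python lists are assumptions made for illustration.

```python
def copy_forward(inv_spec1, n=1024):
    """Copy the n low-frequency spectra to the high-frequency part in the same order."""
    inv_spec2 = [0.0] * n
    i = 0
    j = 0                              # S71: both counters start at 0
    while i < n:                       # S72: continue while i is less than n
        inv_spec2[j] = inv_spec1[i]    # S73: i-th low value becomes j-th high value
        i += 1                         # S74: advance both counters
        j += 1
    return inv_spec2


low_part = [float(k) for k in range(1024)]   # placeholder low-frequency spectrum
high_part = copy_forward(low_part)
```

The result is an element-for-element copy; the amplitude is not yet adjusted at this stage.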
[0116]
The amplitude of the spectrum data copied in this way is adjusted according to the auxiliary information decoded by the second decoding unit 425, that is, the value of the scale factor that sets the quantized value of the peak to “1”, and the result is output as the spectrum data of the high-frequency part. The amplitude adjustment is achieved, for each scale factor band, by multiplying each spectrum data in the band by a coefficient, namely the ratio between the absolute maximum value of the spectrum data copied into the band and the value obtained by dequantizing the quantized value “1” with the scale factor value for that band. The spectrum data output from the second inverse quantization unit 435 has at most 1024 samples and represents the reproduction band above 11.025 kHz.
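The amplitude adjustment for one scale factor band can be sketched as below. Two points are assumptions: the dequantization formula, which is taken here to be an AAC-style power law because the patent's own formula is not reproduced in this passage, and the direction of the ratio, which is taken as the dequantized value divided by the copied peak so that the adjusted peak equals the dequantized value of “1”.

```python
def dequantize(q, sf):
    """Assumed AAC-style inverse quantizer (not the patent's own formula)."""
    return (abs(q) ** (4.0 / 3.0)) * 2.0 ** (sf / 4.0)


def adjust_band(copied, sf):
    """Scale a copied band so that its peak matches dequantize(1, sf)."""
    peak = max(abs(x) for x in copied)
    if peak == 0:
        return list(copied)                 # nothing to scale in an all-zero band
    coeff = dequantize(1, sf) / peak        # assumed direction of the ratio
    return [x * coeff for x in copied]


copied_band = [1.0, -4.0, 2.0]              # hypothetical copied low-frequency values
adjusted = adjust_band(copied_band, 24)     # sf = 24, as in the first band of FIG. 5
```

The shape and signs of the copied spectrum are preserved; only its level within the band is set by the transmitted scale factor.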
[0117]
In FIG. 12, the procedure for copying the low-frequency part 1024 spectrum to the high-frequency part in the forward direction in the frequency axis direction is shown, but it may be copied in the opposite direction as shown in FIG.
[0118]
FIG. 13 is a flowchart illustrating a procedure in which the 1024 spectra of the low-frequency part are copied to the high-frequency part in the direction opposite to the frequency axis by the second inverse quantization unit 435 illustrated in FIG. 1. As in FIG. 12, inv_spec1 [i] in FIG. 13 denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 430, and inv_spec2 [j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 435.
[0119]
First, because the 0th to 1023rd spectra are to be input in the reverse direction, the second inverse quantization unit 435 sets the initial value of the counter i, which counts the spectra, to “0” and the initial value of the counter j to “1023” (S81). Next, the second inverse quantization unit 435 checks whether the value of the counter i is less than “1024” (S82). If it is, the value of the i-th (here, the 0th) spectrum of the first inverse quantization unit 430 is input as the value of the j-th (here, the 1023rd) spectrum of the second inverse quantization unit 435 (S83). The second inverse quantization unit 435 then increments the counter i by “1”, decrements the counter j by “1” (S84), and again checks whether the value of the counter i is less than “1024” (S82).
[0120]
The second inverse quantization unit 435 repeats the above process while the value of the counter i is less than “1024”, and ends the process when the value of the counter i becomes “1024” or more.
[0121]
As a result, all of the 0th to 1023rd spectra of the low-frequency part, which are the inverse quantization result of the first inverse quantization unit 430, are copied in the reverse direction as the 1023rd to 0th spectra of the high-frequency part of the second inverse quantization unit 435.
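The reverse copy of FIG. 13 differs from the forward copy only in how the counter j is initialized and updated. A minimal sketch, using the flowchart's array names and an assumed function name:

```python
def copy_reverse(inv_spec1, n=1024):
    """Copy the n low-frequency spectra to the high-frequency part in mirrored order."""
    inv_spec2 = [0.0] * n
    i = 0
    j = n - 1                          # S81: i starts at 0, j at the last index
    while i < n:                       # S82: continue while i is less than n
        inv_spec2[j] = inv_spec1[i]    # S83: i-th low value becomes j-th high value
        i += 1                         # S84: i counts up, j counts down
        j -= 1
    return inv_spec2


low_part = [float(k) for k in range(1024)]   # placeholder low-frequency spectrum
high_part = copy_reverse(low_part)
```

The lowest low-frequency spectrum thus lands at the top of the high-frequency part, mirroring the spectrum about the boundary.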
[0122]
In this case as well, the amplitude of the copied spectrum data is adjusted according to the auxiliary information decoded by the second decoding unit 425, that is, the value of the scale factor that sets the quantized value of the peak to “1”, and the result is output as the spectrum data of the high-frequency part. The amplitude adjustment is achieved, for each scale factor band, by multiplying each spectrum data in the band by a coefficient, namely the ratio between the absolute maximum value of the spectrum data copied into the band and the value obtained by dequantizing the quantized value “1” with the scale factor value for that band. The spectrum data output from the second inverse quantization unit 435 has at most 1024 samples and represents the reproduction band above 11.025 kHz.
[0123]
Note that the second inverse quantization unit 435 copies all the spectral data in the low-frequency part to the high-frequency part, but may copy only a part.
In addition, FIGS. 12 and 13 are given only as examples of a procedure that copies the entire low-frequency part to the high-frequency part at once; the copy may also be performed in other ways.
[0124]
Also, some or all of them may be copied with the positive and negative signs reversed. Furthermore, these copying procedures may be determined in advance, may be changed according to the data in the low frequency region, or may be transmitted as auxiliary information.
[0125]
Further, here, the spectral data on the low frequency side is copied as the spectral data on the high frequency side, but this is not limiting, and the spectral data on the high frequency side is generated only from the second encoded information. May be.
[0126]
Furthermore, the present embodiment has mainly described the case in which the second inverse quantization unit 435 generates noise by copying spectrum data obtained from the first inverse quantization unit 430, but the present invention is not limited to this. The second inverse quantization unit 435 may independently generate spectrum data having a constant value within each scale factor band of the high-frequency part, white noise, pink noise, or the like, and may generate it in accordance with the auxiliary information.
[0127]
The 1024 pieces of spectrum data output from the second inverse quantization unit 435 are combined with the 1024 pieces of spectrum data output from the first inverse quantization unit 430 in the inverse quantization data combining unit 440, are inversely converted into acoustic data on the time axis by IMDCT, and are D/A-converted at a sampling frequency of 44.1 kHz, whereby an audio signal having a reproduction band of 0 to 22.05 kHz is reproduced.
[0128]
As described above, the present invention uses MDCT and IMDCT with twice the conversion length of the conventional method. Of the 2048 samples of spectrum data, conventional encoding is applied only to the first 1024 samples, the remaining 1024 samples are encoded with a smaller amount of information than before, and the two are combined at the time of decoding.
[0129]
Reducing the amount of information needed to encode the spectrum data of the latter 1024 samples makes it possible to increase the amount of information allotted to encoding the spectrum data of the first 1024 samples, so wideband encoding can be performed while improving fidelity to the original signal in the low-frequency part.
[0130]
In addition, the bit stream generated by the encoding apparatus according to the present embodiment can be decoded by a conventional decoding apparatus.
Next, various modifications of auxiliary information and its decoding will be described.
[0131]
FIG. 14 is a spectrum waveform diagram showing a specific example of other auxiliary information (quantized value) generated by the second quantizing unit 345 shown in FIG. FIG. 15 is a flowchart showing an operation in another auxiliary information (quantization value) calculation process of the second quantization unit 345 shown in FIG.
[0132]
The second quantization unit 345 predetermines a common scale factor value, for example “18”, for all scale factor bands in the high-frequency part, which extends above the reproduction band of 11.025 kHz up to 22.05 kHz, and uses this scale factor value “18” to calculate, for each scale factor band, the quantized value of the absolute-maximum spectrum data (peak) in that band (S21).
[0133]
The second quantization unit 345 specifies the absolute-maximum spectrum data (peak) in the first scale factor band of the high-frequency part above the reproduction band of 11.025 kHz (S22). In the specific example of FIG. 14, it is assumed that the position of the peak specified in the first scale factor band is (1) and that the peak value is “256”.
[0134]
The second quantization unit 345 applies a common scale factor value “18” and a peak value “256” to the formula for calculating the quantization value, and calculates the quantization value (S23). For example, in this case, when the peak value “256” is quantized with the scale factor value “18”, the quantized value “6” is calculated.
[0135]
When the quantized value “6” of the peak value “256” has been obtained for the first scale factor band (S24), the second quantization unit 345 specifies the peak of the spectrum data for the next scale factor band (S22). For example, if the specified peak position is (2) and its value is “312”, the quantized value of the peak value “312” with the scale factor value “18”, for example “10”, is calculated (S23).
[0136]
Similarly, for the third scale factor band in the high-frequency part, the second quantization unit 345 calculates the quantized value, for example “9”, of the value “288” of peak (3) with the scale factor value “18”, and for the fourth scale factor band it calculates the quantized value, for example “5”, of the value “203” of peak (4) with the scale factor value “18”.
[0137]
In this manner, when the quantized value of the peak value when the scale factor is fixed to “18” is calculated for all the scale factor bands in the high frequency part (S24), the second quantizing unit 345 Then, the quantized value of each scale factor band obtained by the calculation is output to the second encoding unit 355 as auxiliary information of the high frequency part, and the process is terminated.
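Steps S21 to S24 can be sketched as follows. As the quantization formula is not reproduced in this passage, `quantize` assumes an AAC-style power-law quantizer, so the values it yields need not match the illustrative values (6, 10, 9, 5) of FIG. 14; the function names are also assumptions.

```python
def quantize(value, sf):
    """Assumed AAC-style quantizer (not the patent's own formula)."""
    return int((abs(value) * 2.0 ** (-sf / 4.0)) ** 0.75 + 0.4054)


def peak_quantized_values(bands, common_sf=18):
    """Quantize the peak of every band with one common scale factor."""
    result = []
    for band in bands:                            # S24: loop over all bands
        peak = max(abs(x) for x in band)          # S22: absolute-maximum value
        result.append(quantize(peak, common_sf))  # S23: quantize with the common sf
    return result


# Four hypothetical bands whose peaks follow the example of FIG. 14.
bands = [[10, -256, 3], [312, 5], [-1, 288], [203, -7, 0]]
aux_info = peak_quantized_values(bands)
```

Because the scale factor is shared, only the small quantized peak of each band has to be transmitted.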
[0138]
As described above, the auxiliary information (quantized values) is generated by the second quantization unit 345. This auxiliary information represents the high-frequency part, which consists of 1024 points of spectrum data, with a 4-bit quantized value for each of the four scale factor bands. In the auxiliary information (scale factors) described earlier, by contrast, the high-frequency part is expressed with an 8-bit scale factor for each of the four scale factor bands, so the amount of data is reduced further here. This quantized value roughly represents the amplitude of the peak (absolute value) in each scale factor band. Even spectrum data obtained simply by multiplying the quantized value with either spectrum data that takes a constant value over the 1024 points of the high-frequency part, or a copy of part or all of the spectrum data of the low-frequency part, can be said to roughly restore the spectrum data obtained from the input acoustic signal. Moreover, for each scale factor band, the spectrum data can be restored more accurately by multiplying each spectrum data in the band by a coefficient, namely the ratio between the absolute maximum value of the spectrum data copied into the band and the value obtained by dequantizing the quantized value for that band with the predetermined scale factor value.
[0139]
In this embodiment, the scale factor value corresponding to the quantized values transmitted as the second encoded information is set in advance. However, an optimal scale factor value may be calculated and transmitted in addition to the second encoded information. For example, if the scale factor is selected so that the maximum quantized value is 7, the quantized value can be represented with 3 bits, which reduces the amount of information needed to transmit the quantized values.
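Selecting a scale factor so that the quantized values fit a given bit width can be sketched as below. The quantizer is an assumed AAC-style power law, and the exhaustive search and the function names are illustrative assumptions, not the patent's stated method.

```python
def quantize(value, sf):
    """Assumed AAC-style quantizer (not the patent's own formula)."""
    return int((abs(value) * 2.0 ** (-sf / 4.0)) ** 0.75 + 0.4054)


def bits_needed(max_value):
    """Number of bits required to represent values from 0 to max_value."""
    return max(1, max_value.bit_length())


def smallest_sf_with_limit(peaks, limit=7, sf_max=255):
    """Return the smallest common scale factor whose quantized peaks stay within limit."""
    for sf in range(sf_max + 1):
        if max(quantize(p, sf) for p in peaks) <= limit:
            return sf
    raise ValueError("no scale factor in range satisfies the limit")


peaks = [256, 312, 288, 203]                 # peak values from the example of FIG. 14
sf = smallest_sf_with_limit(peaks, limit=7)
quantized = [quantize(p, sf) for p in peaks]
```

Capping the quantized values at 7 lowers the cost from 4 bits to 3 bits per band, at the price of transmitting the chosen scale factor once.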
[0140]
FIG. 16 is a spectrum waveform diagram showing a specific example of other auxiliary information (position information) generated by the second quantization unit 345 shown in FIG. FIG. 17 is a flowchart showing an operation in another auxiliary information (position information) calculation process of the second quantization unit 345 shown in FIG.
[0141]
For all scale factor bands in the high-frequency part, which extends above the reproduction band of 11.025 kHz up to 22.05 kHz, the second quantization unit 345 specifies the position of the absolute-maximum spectrum data in each scale factor band by the following procedure (S31).
[0142]
The second quantization unit 345 specifies the absolute-maximum spectrum data (peak) in the first scale factor band of the high-frequency part above the reproduction band of 11.025 kHz (S32). In the specific example of FIG. 16, it is assumed that the position of the peak specified in the first scale factor band is (1) and that it is the 22nd spectrum data from the head of this scale factor band. The second quantization unit 345 holds the position of the specified peak, “the 22nd spectrum data from the head of the scale factor band” (S33).
[0143]
When the peak position has been specified and held for the first scale factor band (S34), the second quantization unit 345 specifies the peak of the spectrum data for the next scale factor band (S32). For example, assume that the specified peak position is (2) and that it is the 60th spectrum data from the head of the scale factor band. The second quantization unit 345 holds the position of the specified peak, “the 60th spectrum data from the head of the scale factor band” (S33).
[0144]
In the same manner, for the third scale factor band in the high-frequency part, the second quantization unit 345 specifies and holds the position of peak (3), “the first spectrum data of the scale factor band”, and for the fourth scale factor band it specifies and holds the position of peak (4), “the 25th spectrum data from the head of the scale factor band”.
[0145]
In this way, when the peak positions have been specified and held for all the scale factor bands in the high-frequency part (S34), the second quantization unit 345 outputs the held peak position of each scale factor band to the second encoding unit 355 as auxiliary information of the high-frequency part, and the process ends.
[0146]
As described above, the auxiliary information (position information) is generated by the second quantization unit 345. This auxiliary information represents the high-frequency part, which consists of 1024 points of spectrum data, with 6-bit position information for each of the four scale factor bands.
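Steps S31 to S34 amount to locating the absolute maximum in each band. The sketch below reports positions counted from 1 at the head of the band, following the wording “22nd spectrum data from the head”; the function name and the rule of taking the first position on a tie are assumptions.

```python
def peak_positions(bands):
    """Return, for each band, the 1-based position of its absolute-maximum value."""
    positions = []
    for band in bands:                                            # S34: loop over all bands
        index = max(range(len(band)), key=lambda k: abs(band[k]))  # S32: locate the peak
        positions.append(index + 1)                               # S33: hold "n-th from head"
    return positions


# Hypothetical bands: the first peaks at its 22nd value, the second at its first.
bands = [[0.0] * 21 + [-9.0] + [0.0] * 10, [5.0, 1.0, -1.0]]
aux_info = peak_positions(bands)
```

Only the position is kept here, so the sign and magnitude of the peak are not part of this form of auxiliary information.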
[0147]
In this case, in the decoding device 400, the second inverse quantization unit 435 copies part or all of the spectrum data for 1024 samples in the low frequency part as the 1024 samples of spectrum data on the high frequency side, in accordance with the auxiliary information (position information) input from the second decoding unit 425. The copying is achieved by extracting, on the basis of the peak information of the spectrum data in one or more scale factor bands, similar data from the spectrum data output from the first inverse quantization unit 430, and copying part or all of it. In addition, the second inverse quantization unit 435 adjusts the amplitude of the copied spectrum data as necessary. Amplitude adjustment is achieved by multiplying each spectrum data by a predetermined coefficient, for example “0.5”. This coefficient may be a fixed value, may be changed for each band or for each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 430.
[0148]
In the above description, a predetermined coefficient is used, but the value of this coefficient may be added to the second encoded information as auxiliary information. Alternatively, the scale factor value may be added to the second encoded information as a coefficient, or the quantized value of the peak in the scale factor band may be added to the second encoded information as the coefficient. The amplitude adjustment method is not limited to this, and other methods may be used.
[0149]
In this embodiment, only position information, or only position information and coefficient information, is encoded as auxiliary information. However, the present invention is not limited to this, and a scale factor, a quantized value, spectrum sign information, a noise generation method, or the like may be encoded. Two or more of these may also be combined and encoded.
[0150]
Moreover, although the spectrum data on the low frequency side is copied as the spectrum data on the high frequency side, the present invention is not limited to this, and the spectrum data on the high frequency side may be generated from the second encoded information alone.
[0151]
FIG. 18 is a spectrum waveform diagram showing a specific example of other auxiliary information (sign information) generated by the second quantization unit 345 shown in FIG. FIG. 19 is a flowchart showing an operation in another auxiliary information (sign information) calculation process of the second quantization unit 345 shown in FIG.
[0152]
For all the scale factor bands in the high frequency region, which extends from above 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantizing unit 345 identifies the sign information of the spectrum data at a predetermined position in each scale factor band, for example at the center of the scale factor band, according to the following procedure (S41).
[0153]
The second quantizing unit 345 checks the sign of the spectrum data at the center position of the first scale factor band in the high frequency region above 11.025 kHz (S42) and holds the value. For example, suppose the sign of the spectrum data at the center position of the first scale factor band is “+”. The second quantizing unit 345 represents this sign “+” as the 1-bit value “1” and holds it. If the sign is “−”, it is represented by “0” and held.
[0154]
When the sign information of the spectrum data at the center position has been held for the first scale factor band (S43), the second quantization unit 345 checks the sign of the spectrum data at the center position of the next scale factor band (S42). For example, if the checked sign is “+”, the second quantization unit 345 holds “1” as the sign information of the spectrum data at the center position of the second scale factor band.
[0155]
Similarly, the second quantization unit 345 checks the sign “+” of the spectrum data at the center position of the third scale factor band in the high frequency region and holds the sign information “1”, and then checks the sign “+” of the spectrum data at the center position of the fourth scale factor band and holds the sign information “1”.
[0156]
In this way, when the sign information of the spectrum data at the center position has been held for all the scale factor bands in the high frequency part (S43), the second quantizing unit 345 outputs the held sign information of each scale factor band to the second encoding unit 355 as auxiliary information of the high frequency part, and the process ends.
[0157]
As described above, auxiliary information (sign information) is generated by the second quantizing unit 345. This auxiliary information (sign information) represents the high frequency part, originally expressed by 1024 points of spectrum data, with 1 bit of sign information for each of the four scale factor bands, so the spectrum of the high frequency part can be represented with a very short data length.
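The sign-information procedure of steps S42 and S43 can be sketched in the same way. The function name `center_sign_bits`, the use of integer division to locate the center, and the treatment of a zero value as “+” are assumptions made for this illustration.

```python
from typing import List, Sequence


def center_sign_bits(high_band: Sequence[float], band_widths: Sequence[int]) -> List[int]:
    """Return one bit per scale factor band: 1 if the spectrum data at the
    center of the band is "+", 0 if it is "-"."""
    bits = []
    start = 0
    for width in band_widths:
        center = start + width // 2  # center position (assumed rounding rule)
        # A value of exactly zero is treated as "+" here (assumption).
        bits.append(1 if high_band[center] >= 0 else 0)
        start += width
    return bits


# Four hypothetical bands of 4 spectrum data each; the center is index 2 of each band.
example = [0.1, -0.2, 0.3, -0.4,
           0.5, 0.6, -0.7, 0.8,
           -0.1, -0.2, 0.9, -0.3,
           0.0, 0.0, 0.4, 0.0]
bits = center_sign_bits(example, [4, 4, 4, 4])
```

In this sketch the whole high frequency part is summarized by four bits, in line with the 1 bit per scale factor band described above.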
[0158]
In this case, in the decoding device 400, the second inverse quantization unit 435 copies part or all of the spectrum data for 1024 samples in the low frequency part as the spectrum on the high frequency side, and determines the sign of the spectrum data at the predetermined position according to the sign information input from the second decoding unit 425.
[0159]
Here, sign information representing the sign at the center position of each scale factor band in the high frequency region is used as the auxiliary information (sign information), but the position is not limited to the center of the scale factor band. The sign information may be taken at the head of the scale factor band, or at some other predetermined position.
[0160]
Here, the position of the spectrum data corresponding to the code (sign information) to be transmitted is predetermined, but this may be changed according to the output of the first inverse quantization unit 430. The position information indicating the position of the sign information of each scale factor band may be added to the second encoded information and transmitted.
[0161]
The second inverse quantization unit 435 adjusts the amplitude of the copied spectrum data as necessary. The amplitude is adjusted by multiplying each spectrum data by a predetermined coefficient, for example “0.5”.
[0162]
This coefficient may be a fixed value, may be changed for each band or for each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 430. The amplitude adjustment method is not limited to this, and other methods may be used.
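The decoder-side steps described above (copying the low frequency spectrum, forcing the sign at the predetermined position, and adjusting the amplitude) can be sketched as follows. The function name `reconstruct_high_band`, the choice of a single fixed gain, and the straight one-to-one copy from the start of the low frequency part are assumptions for illustration; the text allows other copy ranges and other coefficients.

```python
from typing import List, Sequence


def reconstruct_high_band(low_band: Sequence[float],
                          band_widths: Sequence[int],
                          sign_bits: Sequence[int],
                          gain: float = 0.5) -> List[float]:
    """Build a high frequency spectrum from the decoded low frequency spectrum.

    low_band    : decoded low frequency spectrum data
    band_widths : widths of the high frequency scale factor bands (assumed known)
    sign_bits   : 1 for "+", 0 for "-" at the center of each band
    gain        : amplitude adjustment coefficient ("0.5" in the text's example)
    """
    total = sum(band_widths)
    # Copy the low frequency data and scale it by the coefficient.
    high = [gain * x for x in low_band[:total]]
    start = 0
    for width, bit in zip(band_widths, sign_bits):
        center = start + width // 2
        magnitude = abs(high[center])
        # Impose the transmitted sign on the spectrum data at the center position.
        high[center] = magnitude if bit == 1 else -magnitude
        start += width
    return high


high = reconstruct_high_band([1, -2, 3, 4, 5, 6, -7, 8], [4, 4], [0, 1])
```

Only the data at the center of each band has its sign changed; the remaining copied data keep the signs they had in the low frequency part.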
[0163]
In this embodiment, a predetermined coefficient is used, but the value of this coefficient may be added to the second encoded information as auxiliary information. Further, a scale factor value may be added to the second encoded information as the coefficient, or a quantized value may be added to the second encoded information as a coefficient.
[0164]
In addition, only sign information, only sign information and coefficient information, only sign information and position information, or only sign information, position information, and coefficient information are encoded as auxiliary information. Alternatively, the quantization value, scale factor, characteristic spectrum position information, noise generation method, and the like may be encoded. Two or more of these may be combined and encoded.
[0165]
Further, in the present embodiment, the spectrum data on the low frequency side is copied as the spectrum data on the high frequency side. However, the present invention is not limited to this, and the spectrum data on the high frequency side may be generated from the second encoded information alone.
[0166]
In the above description, the sign “+” is represented by the 1-bit value “1” and the sign “−” by “0”. However, the representation is not limited to this, and other values may be used.
[0167]
FIG. 20 is a spectrum waveform diagram showing an example of a method of creating other auxiliary information (copy information) generated by the second quantization unit 345 shown in FIG. FIG. 20A is a waveform diagram showing a spectrum in the first scale factor band in the high frequency region. FIG. 20B is a waveform diagram showing an example of the spectrum waveform of the low frequency region specified by the auxiliary information (copy information). FIG. 21 is a flowchart showing an operation in another auxiliary information (copy information) calculation process of the second quantization unit 345 shown in FIG.
[0168]
For every scale factor band in the high frequency region, which extends from above 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantizing unit 345 takes the peak position n measured from the head of that scale factor band (the n-th spectrum data from the head) and identifies, according to the following procedure, the number N of the scale factor band in the low frequency region whose peak position measured from its head is closest to n (S51).
[0169]
The second quantizing unit 345 identifies the position n of the spectrum data having the maximum absolute value (the peak) in the first scale factor band of the high frequency region above 11.025 kHz (S52). As a result, for example, as shown in FIG. 20A, it is assumed that the identified peak is peak (1), the spectrum data at n = 22 in this scale factor band.
[0170]
The second quantizing unit 345 identifies the positions of all peaks (both positive and negative) of the spectrum in the low frequency region, where the frequency is at or below the reproduction band of 11.025 kHz (S53).
[0171]
Next, for all the peaks identified in the low frequency part, the second quantizing unit 345 searches for the scale factor band whose head lies at a distance from the peak closest to n, and identifies the number N of that scale factor band, the search direction, and the sign information of the peak (S54).
[0172]
Specifically, for each of the identified peaks (both positive and negative), taken in order from the low frequency side, the second quantizing unit 345 searches for the head of the scale factor band whose distance from the peak is closest to n. There are two search directions: searching from the peak toward lower frequencies (1), and searching from the peak toward higher frequencies (2). In addition, low-frequency peaks whose sign is the opposite of that of the high-frequency peak are also searched, again in two ways: from the peak toward lower frequencies (3), and from the peak toward higher frequencies (4).
[0173]
Of these, in search directions (2) and (4), copying the low-frequency spectrum waveform on the basis of this peak information copies a waveform in which the positions of the high-frequency peak and the low-frequency peak within the scale factor band are mirrored left to right (along the frequency axis), as shown in the figure. It is therefore necessary to attach information indicating whether the search direction is forward or reverse, taking for example (1) and (3) as the forward direction and (2) and (4) as the reverse direction. In search directions (3) and (4), as shown in FIG. 20 (b), a waveform in which the high-frequency peak and the low-frequency peak are inverted vertically (top to bottom) is copied, so it is necessary to attach information indicating whether the signs of the high-frequency peak value and the low-frequency peak value are opposite.
[0174]
For each peak identified in the low frequency part, the second quantizing unit 345 searches in directions (1) and (2) if the peak has a positive value and in directions (3) and (4) if it has a negative value, so that the search covers four directions in all, and from the search results it identifies the number of the scale factor band whose position from the peak is closest to n. In doing so, the permissible error from n is set in advance to a predetermined value, for example “5”, and from the four search results the scale factor band whose position from the peak is closest to n is selected and its number N is identified. At the same time, sign information indicating whether the signs of the peak value in the high frequency region and the peak value in the low frequency region are opposite, and information indicating whether the search direction is forward or reverse, are identified.
[0175]
For example, suppose that in search direction (1) the scale factor band number N = 3 is identified with a position error of “1” from the peak, corresponding to the low-frequency spectrum shown in (1) of FIG. 20B. Suppose likewise that in search direction (2) the scale factor band number N = 5 is identified with an error of “5”, corresponding to the low-frequency spectrum shown in (2) of the same figure; that in search direction (3) the number N = 12 is identified with an error of “4”, corresponding to the low-frequency spectrum shown in (3); and that in search direction (4) the number N = 10 is identified with an error of “2”, corresponding to the low-frequency spectrum shown in (4). From the four identified scale factor band numbers, the second quantizing unit 345 selects the number N = 3, the scale factor band whose position from the peak is closest to n. At the same time, it generates sign information “1”, indicating that the sign of the peak in the low frequency region is “+”, and search direction information “1”, indicating that the search proceeded from the peak toward lower frequencies. If the sign of the peak were “−”, the sign information would be “0”, and if the search proceeded from the peak toward higher frequencies, the search direction information would be “0”.
[0176]
When the scale factor band number N = 3, the sign information “1”, and the search direction information “1” are specified for the first scale factor band in the high frequency range (S55), the second quantization unit 345 In the same manner as described above, the scale factor band number N, its sign information, and its search direction information are specified for the next scale factor band.
[0177]
In this way, when the number N of the low-frequency scale factor band whose peak position from the head is closest to the peak position n, together with its sign information and its search direction information, has been identified for all the scale factor bands in the high frequency part (S55), the second quantizing unit 345 outputs the low-frequency scale factor band number N, the sign information, and the search direction information identified for each high-frequency scale factor band to the second encoding unit 355 as auxiliary information (copy information) of the high frequency part, and the process ends.
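The search of steps S53 and S54 can be sketched as below. This is a simplified reading of the procedure, not a definitive implementation: the function name `find_copy_source`, the definition of a peak as a value whose magnitude exceeds both neighbours, and the use of a 0-based offset (n - 1 for the "n-th spectrum data from the head") are assumptions introduced here. The sign bit and direction bit follow the encoding given in the text ("1" for a "+" low-frequency peak, "1" for a search toward lower frequencies).

```python
from typing import Optional, Sequence, Tuple


def find_copy_source(low_band: Sequence[float],
                     band_starts: Sequence[int],
                     n_offset: int,
                     max_error: int = 5) -> Optional[Tuple[int, int, int, int]]:
    """Return (N, sign_bit, direction_bit, error) for the best match, or None.

    low_band    : low frequency spectrum data
    band_starts : index of the head of each low-frequency scale factor band
    n_offset    : offset of the high-frequency peak from its band head (0-based)
    max_error   : permissible error from n ("5" in the text's example)
    """
    best = None
    for p in range(1, len(low_band) - 1):
        mag = abs(low_band[p])
        # Assumed peak definition: magnitude strictly larger than both neighbours.
        if not (mag > abs(low_band[p - 1]) and mag > abs(low_band[p + 1])):
            continue
        sign_bit = 1 if low_band[p] > 0 else 0
        for number, head in enumerate(band_starts, start=1):
            if head <= p:
                direction_bit, distance = 1, p - head  # head found toward lower frequencies
            else:
                direction_bit, distance = 0, head - p  # head found toward higher frequencies
            error = abs(distance - n_offset)
            if error <= max_error and (best is None or error < best[3]):
                best = (number, sign_bit, direction_bit, error)
    return best


# One positive peak, 5 data above the head of band 2 (bands start at 0, 8, 16, 24).
low = [0.0] * 32
low[13] = 2.0
match = find_copy_source(low, [0, 8, 16, 24], n_offset=5)
```

A direction bit of 0 signals that the decoder would have to mirror the copied data along the frequency axis, which is why the text requires the direction information to accompany N.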
[0178]
In this case, when the decoding apparatus 400 decodes the first encoded signal according to the conventional procedure, spectrum data of 1024 samples on the low frequency side is obtained. The second inverse quantization unit 435 copies part or all of the spectrum data corresponding to the scale factor band number output from the second decoding unit 425 as the spectrum on the high frequency side. The second inverse quantization unit 435 adjusts the amplitude of the copied spectrum data as necessary. The amplitude is adjusted by multiplying each spectrum data by a predetermined coefficient, for example “0.5”.
[0179]
This coefficient may be a fixed value, may be changed for each band, for each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 430.
[0180]
In this embodiment, a predetermined coefficient is used for amplitude adjustment. However, the value of this coefficient may be added to the second encoded information as auxiliary information. Further, a scale factor value may be added to the second encoded information as a coefficient, or a quantized value may be added to the second encoded information as a coefficient. The amplitude adjustment method is not limited to this, and other methods may be used.
[0181]
Here, the sign information and the search direction information are extracted, in addition to the scale factor band number N, as auxiliary information (copy information) for the high frequency part, but the sign information and the search direction information may be omitted depending on the amount of information that can be transmitted for the high frequency part. The sign information is represented as “1” if the sign of the low-frequency peak is “+” and “0” if it is “−”, and the search direction information is represented as “1” if the search proceeds from the peak toward lower frequencies and “0” if it proceeds toward higher frequencies. However, the representation of the sign of the low-frequency peak and of the search direction is not limited to these values, and other values may be used.
[0182]
In addition, in the above description, the search starts from each peak identified in the low frequency part and looks for the head of the scale factor band whose distance from the peak is closest to n. However, the present invention is not limited to this; the search may instead start from the head of each scale factor band in the low frequency part and look for the peak whose distance from that head is closest to n.
[0183]
FIG. 22 is a spectrum waveform diagram showing a second example of a method of creating other auxiliary information (copy information) generated by the second quantization unit 345 shown in FIG. FIG. 23 is a flowchart showing an operation in the second calculation process of other auxiliary information (copy information) of the second quantizing unit 345 shown in FIG.
[0184]
For every scale factor band in the high frequency region, which extends from above 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantizing unit 345 identifies, according to the following procedure, the number N of the low-frequency scale factor band that minimizes the spectral difference (energy difference) taken over all the spectrum data within the high-frequency scale factor band (S61). Here, the number of low-frequency spectrum data used for the difference equals the number of spectrum data in the high-frequency scale factor band, and the identified number N is the number of the scale factor band at whose head that run of spectrum data begins.
[0185]
For every scale factor band in the low frequency part (S62), the second quantizing unit 345 obtains the difference between the high-frequency spectrum and the low-frequency spectrum over a frequency width that starts at the head of the low-frequency scale factor band and contains the same number of spectrum data as the high-frequency scale factor band (S63). For example, in the waveform diagram shown in FIG. 22, if the first scale factor band in the high frequency part contains 48 spectrum data, the second quantizing unit 345 sequentially obtains the difference between the high-frequency and low-frequency spectra for the 48 spectrum data starting at the head of the low-frequency scale factor band with number N = 1.
[0186]
When the second quantizing unit 345 has obtained the spectrum difference between the high frequency part and the low frequency part over the same number of spectrum data as the high-frequency scale factor band (S65), it holds that value and then obtains the difference between the high-frequency spectrum and the low-frequency spectrum starting from the head of the next low-frequency scale factor band, over the same width (S64). For example, after obtaining the difference over a width of 48 spectrum data from the head of the low-frequency scale factor band with number N = 1, it holds the obtained difference and then obtains the difference over a width of 48 spectrum data from the head of the scale factor band with number N = 2. In the same way, for the low-frequency scale factor bands with numbers N = 3, N = 4, ..., N = 28 (the last scale factor band in the low frequency part), that is, for all the scale factor bands in the low frequency part, the differences between the 48 high-frequency and low-frequency spectrum data are added up in turn to obtain the spectrum difference.
[0187]
When the difference between the high-frequency spectrum and the low-frequency spectrum has been obtained for all the scale factor bands in the low frequency part, each over a width starting at the head of the scale factor band and equal to the number of spectrum data in the high-frequency scale factor band (S64), the second quantization unit 345 identifies the number N of the scale factor band for which the obtained difference is smallest (S65). For example, in the spectrum waveform diagram shown in FIG. 22, assume that the low-frequency scale factor band with number N = 8 is identified. This means that the spectrum in the hatched portion of the low frequency part has the smallest difference from the spectrum in the hatched portion of the high frequency part, that is, the smallest energy difference between the spectra. In other words, when the 48 spectrum data starting at the head of the scale factor band with number N = 8 are copied to the first scale factor band of the high frequency part, which starts at 11.025 kHz, the result is the waveform shown by the one-dot chain line in the high frequency part of the figure, which approximates the energy of the original spectrum within that high-frequency scale factor band.
[0188]
When the second quantization unit 345 specifies the number N of the low-frequency scale factor band that minimizes the difference in spectrum for the high-frequency scale factor band, the second quantization unit 345 holds the specified scale factor band number N. In the same manner as described above, the number N of the corresponding scale factor band is specified for the scale factor band of the next high frequency part (S66). Hereinafter, this process is sequentially repeated for each scale factor band of the high frequency part, and when the number N of the low frequency scale factor band having the smallest spectrum difference is specified in all the scale factor bands of the high frequency part, The stored low-band scale factor band number N is output to the second encoding unit 355 as auxiliary information (copy information) for the high band, and the process is terminated.
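The minimum-difference search of steps S62 to S65 can be sketched as follows. The text does not define the exact difference measure, so the sum of squared differences used here is an assumption, as are the function name `best_matching_band` and the rule that low-frequency bands too close to the end of the data are skipped.

```python
from typing import Optional, Sequence, Tuple


def best_matching_band(high_segment: Sequence[float],
                       low_band: Sequence[float],
                       band_starts: Sequence[int]) -> Tuple[Optional[int], Optional[float]]:
    """Return (N, difference) for the low-frequency scale factor band whose
    leading spectrum data differ least from one high-frequency band.

    high_segment : spectrum data of one high-frequency scale factor band
    low_band     : low frequency spectrum data
    band_starts  : index of the head of each low-frequency scale factor band
    """
    width = len(high_segment)
    best_number, best_diff = None, None
    for number, head in enumerate(band_starts, start=1):
        candidate = low_band[head:head + width]
        if len(candidate) < width:
            continue  # not enough low-frequency data from this head (assumed rule)
        # Assumed measure: sum of squared differences over the band width.
        diff = sum((h - l) ** 2 for h, l in zip(high_segment, candidate))
        if best_diff is None or diff < best_diff:
            best_number, best_diff = number, diff
    return best_number, best_diff


low = [0, 0, 0, 0, 1, 2, 3, 4, 9, 9, 9, 9]
result = best_matching_band([1, 2, 3, 5], low, [0, 4, 8])
```

Note that the comparison window starts at a band head but may run past the end of that low-frequency band, as in the text's example where 48 spectrum data are always compared.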
[0189]
In this case, the low-frequency side spectrum copy method and amplitude adjustment method in decoding apparatus 400 are the same as those in the case of auxiliary information (copy information) described with reference to FIGS.
[0190]
Further, in the flowchart of FIG. 23, the energy difference between the high frequency part and the low frequency part is calculated with the same sign and in the same direction on the frequency axis. However, the encoding device of the present invention is not limited to this, and, as described with reference to FIGS. 20 and 21, the energy difference between the high frequency part and the low frequency part may be calculated using any of the following three methods. (1) The signs of the spectrum data values in the high frequency part and the low frequency part are kept as they are; for high-frequency spectrum data selected sequentially from the low frequency side toward the high frequency side, low-frequency spectrum data are selected sequentially from the high frequency side toward the low frequency side (that is, in the reverse direction on the frequency axis) over the same number of spectrum data as the high frequency part, counted from the head of the low-frequency scale factor band, and the difference is calculated. (2) The sign of the low-frequency spectrum is inverted (made negative) and the calculation is performed in the same direction on the frequency axis. (3) The sign of the low-frequency spectrum is inverted (made negative) and the calculation is performed in the reverse direction on the frequency axis. Further, the calculation may be performed by all four methods, including that of FIG. 23, and the number N of the low-frequency scale factor band that gives the smallest energy difference among them may be used as auxiliary information. In this case, in order to copy the low-frequency spectrum with the smallest energy difference correctly to the high frequency part, information indicating the sign relationship between the low-frequency spectrum and the high-frequency spectrum and information indicating the direction on the frequency axis in which the low-frequency spectrum is copied are included in the auxiliary information for each scale factor band.
The information indicating the sign relationship between the low-frequency spectrum and the high-frequency spectrum is expressed in 1 bit, for example as “1” when the difference is obtained with the same sign and “0” when it is obtained with the opposite sign. The information indicating the direction on the frequency axis in which the low-frequency spectrum is copied to the high frequency part is likewise expressed in 1 bit, for example as “1” when copying in the forward direction, that is, when spectrum data are selected in the same direction in the high frequency part and the low frequency part, and “0” when copying in the reverse direction, that is, when spectrum data are selected in opposite directions in the high frequency part and the low frequency part.
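The four-way comparison described above, with its two 1-bit flags, can be sketched as follows. The function name `best_variant`, the squared-difference measure, and the tie-breaking order (same sign and forward direction are tried first) are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def best_variant(high_segment: Sequence[float],
                 low_segment: Sequence[float]) -> Tuple[int, int, float]:
    """Compare one high-frequency band with a low-frequency segment in four ways
    and return (sign_bit, direction_bit, difference) for the best one.

    sign_bit      : 1 = same sign, 0 = low-frequency sign inverted
    direction_bit : 1 = forward copy, 0 = reversed along the frequency axis
    """
    best = None
    for sign_bit in (1, 0):
        for direction_bit in (1, 0):
            segment: List[float] = list(low_segment)
            if direction_bit == 0:
                segment.reverse()                # reverse direction on the frequency axis
            if sign_bit == 0:
                segment = [-x for x in segment]  # invert the sign of the low-frequency data
            # Assumed measure: sum of squared differences.
            diff = sum((h - l) ** 2 for h, l in zip(high_segment, segment))
            if best is None or diff < best[2]:
                best = (sign_bit, direction_bit, diff)
    return best


variant = best_variant([1, 2, 3], [-3, -2, -1])
```

In this example the low-frequency segment matches the high-frequency band exactly once it is both reversed and sign-inverted, so both flags come out as 0.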
[0191]
Although the acoustic data distribution system according to the above embodiment has been described as applied to a broadcasting system, it may also be applied to an acoustic data distribution system that distributes acoustic data as a bit stream from a server to a terminal via a transmission medium such as the Internet. It may likewise be applied to an acoustic data distribution system in which the bit stream output from the encoding device 300 is first recorded on a recording medium, such as an optical disc (a CD or DVD, for example), a semiconductor memory, or a hard disk, and is then reproduced by the decoding apparatus 400 from that recording medium.
[0192]
In the above embodiment, the LONG block is used. However, the SHORT block may be used, and the SHORT block may be processed in the same way as the LONG block.
[0193]
In the encoding process, tools such as Gain Control, TNS (Temporal Noise Shaping), a psychoacoustic model, M/S Stereo, Intensity Stereo, Prediction, block size switching, and a bit reservoir may be used.
[0194]
In the above embodiment, the auxiliary information is generated based on the spectrum data of the high frequency part separated by the data separation unit 330. Alternatively, a value obtained by dequantizing the output of the first quantization unit 340 may be applied to the high frequency part, and the auxiliary information may be generated based on the resulting spectrum data of the high frequency part.
[0195]
In addition, the embodiment used, as auxiliary information, a scale factor chosen so that the quantized value of the spectrum data in each high-frequency scale factor band becomes “1”, a quantized value, position information of a characteristic spectrum, and sign information representing the positive or negative sign of the spectrum; a combination of two or more of these may also be used as auxiliary information. In that case, it is particularly effective to encode, as auxiliary information, a coefficient representing the amplitude ratio, such as the scale factor, in combination with the position of the spectrum data having the maximum absolute value. In the present embodiment, one piece of auxiliary information is encoded for each scale factor band as the second encoded signal. However, one piece of auxiliary information may be encoded for every two or more scale factor bands, or two or more pieces of auxiliary information may be encoded for one scale factor band. Moreover, the auxiliary information in this embodiment may be encoded for each channel, or one piece of auxiliary information may be encoded for two or more channels.
[0196]
In the present embodiment, the encoding apparatus 300 has two quantization units and two encoding units, but the present invention is not limited to this, and three or more quantization units and encoding units may be provided.
[0197]
Further, in the present embodiment, the decoding apparatus 400 has two decoding units and two inverse quantization units, but the present invention is not limited to this, and three or more decoding units and inverse quantization units may be provided.
[0198]
Further, the above processing can be realized not only by hardware but also by software, and a configuration in which a part is realized by hardware and the rest is realized by software may be adopted.
In the above embodiment, the sampling frequency is 44.1 kHz. However, the sampling frequency may be 32 kHz, 48 kHz, or the like, and the separation frequency in the data separation unit 330 may be changed to a frequency other than 11.025 kHz.
[0199]
Furthermore, although the above embodiment has been described with respect to AAC, the same may be applied to encoding apparatuses and decoding apparatuses of other systems (for example, MP3, AC3, etc.).
[0200]
The encoding device may be configured as follows.
That is, the encoding device encodes acoustic data and may be characterized by comprising: cutout means for cutting out, from an acoustic data string, a number m2 of consecutive acoustic data that exceeds the required number m1; conversion means for converting the acoustic data cut out by the cutout means into spectrum data on the frequency axis; separation means for separating the m2 spectrum data obtained by the conversion into m1 low-frequency spectrum data and (m2-m1) high-frequency spectrum data; low-band encoding means for quantizing and encoding the separated low-frequency spectrum data; auxiliary information generating means for generating, from the separated high-frequency spectrum data, auxiliary information indicating characteristics of the high-frequency spectrum; high-band encoding means for encoding the generated auxiliary information; and output means for combining and outputting the code obtained by the low-band encoding means and the code obtained by the high-band encoding means.
[0201]
In this case, the auxiliary information generation means calculates a normalization coefficient for the spectrum data divided into a plurality of groups, such that the value obtained by quantizing the peak spectrum data in each group in the high frequency region becomes a constant value. In addition, the calculated normalization coefficient may be generated as the auxiliary information.
[0202]
Further, the auxiliary information generation means may, for the spectrum data divided into a plurality of groups, quantize the spectrum data that is the peak in each group of the high frequency region using a normalization coefficient common to the groups, and generate the quantized value as the auxiliary information.
[0203]
Further, for the spectrum data divided into a plurality of groups, the auxiliary information generating means may generate, as the auxiliary information, the frequency position of the peak spectrum data in each group of the high-frequency region.
[0204]
Further, the spectrum data may be MDCT coefficients, and, for the spectrum data divided into a plurality of groups, the auxiliary information generating means may generate, as the auxiliary information, a sign indicating whether the spectrum data at a predetermined frequency position in the high-frequency region is positive or negative.
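The position and sign information described in the two paragraphs above can be sketched as follows. This is an illustrative Python sketch only: taking the peak position as the "predetermined frequency position", expressing positions as offsets from the start of the group, and encoding the sign as 0 (non-negative) or 1 (negative) are all assumptions, as are the function names.

```python
def split_groups(band, group_size):
    """Divide a band of spectrum values into consecutive groups."""
    return [band[i:i + group_size] for i in range(0, len(band), group_size)]


def peak_position(group):
    """Offset, from the start of the group, of the value with the largest magnitude."""
    return max(range(len(group)), key=lambda i: abs(group[i]))


def position_and_sign(group):
    """Return (peak offset, sign bit) for one group; 0 = non-negative, 1 = negative."""
    pos = peak_position(group)
    sign_bit = 1 if group[pos] < 0 else 0
    return pos, sign_bit


def position_sign_info(high_band, group_size):
    """Auxiliary information candidates for every high-band group."""
    return [position_and_sign(g) for g in split_groups(high_band, group_size)]
```

For example, `position_sign_info([0.5, -3.0, 1.0, 0.2, 2.0, 0.1, -0.4, 1.9], 4)` yields one pair per group of four coefficients.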
[0205]
In addition, for the spectrum data divided into a plurality of groups, the auxiliary information generation means may generate, as the auxiliary information, information specifying, for each group of the high-frequency region, the low-frequency spectrum that most closely approximates the spectrum in that group. In this case, the auxiliary information generating means may specify the low-frequency spectrum for which the difference between the distance on the frequency axis from the group boundary to the peak of the high-frequency spectrum in the group and the distance on the frequency axis from the low-frequency group boundary to the peak of the low-frequency spectrum is minimum. Further, the auxiliary information generating means may specify the low-frequency spectrum for which the difference value is minimum when an energy difference is taken over the same frequency width as the spectrum in the group of the high-frequency region. Further, the information specifying the low-frequency spectrum may be a number specifying the group of the specified low-frequency spectrum.
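The two selection criteria above can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the patented procedure: the low band is assumed to be divided into groups of the same width as the high-band group, the "energy difference" is interpreted here as the sum over bins of the absolute difference of squared values, ties go to the lowest group number, and all function names are hypothetical.

```python
def split_groups(band, group_size):
    """Divide a band of spectrum values into consecutive groups."""
    return [band[i:i + group_size] for i in range(0, len(band), group_size)]


def peak_offset(group):
    """Distance (in bins) from the group boundary to the peak of the group."""
    return max(range(len(group)), key=lambda i: abs(group[i]))


def best_group_by_peak_distance(high_group, low_groups):
    """Number of the low-band group whose peak offset is closest to the high group's."""
    target = peak_offset(high_group)
    return min(range(len(low_groups)),
               key=lambda g: abs(peak_offset(low_groups[g]) - target))


def energy_difference(high_group, low_group):
    """Assumed measure: per-bin difference of squared values, summed."""
    return sum(abs(h * h - l * l) for h, l in zip(high_group, low_group))


def best_group_by_energy(high_group, low_groups):
    """Number of the low-band group with the smallest energy difference."""
    return min(range(len(low_groups)),
               key=lambda g: energy_difference(high_group, low_groups[g]))
```

The two criteria need not agree: with the low band `[0, 9, 0, 0, 0, 0, 8, 0, 1, 0, 0, 0]` split into groups of four and the high group `[0, 0, 5, 1]`, the peak-distance criterion and the energy criterion pick different group numbers, which is why the choice of criterion matters for the resulting auxiliary information.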
[0206]
The auxiliary information generating means may generate a predetermined coefficient representing the amplification ratio of the amplitude of the high band spectrum as the auxiliary information.
[0207]
Further, the output means may include a stream output unit that converts the data encoded by the low-band encoding means into an encoded audio stream defined in a predetermined format, stores the data encoded by the high-band encoding means in an area of the encoded audio stream that is not restricted by the encoding protocol, and outputs the result. In this case, the stream output unit may be configured to write information representing f1 Hz as a sampling frequency.
[0208]
Further, the output means may include a second stream output unit that converts the data encoded by the low-band encoding means into an encoded audio stream defined in a predetermined format, and stores the data encoded by the high-band encoding means in a stream different from the encoded audio stream and outputs it.
[0209]
Needless to say, the present invention can be realized as a communication system comprising the encoding device and the decoding device of this modification; as an encoding method and a communication method having, as steps, the characteristic means constituting the encoding device and the communication system; as an encoding program that causes a CPU to execute the characteristic means and steps constituting the encoding device; or as a computer-readable recording medium on which these programs are recorded.
[0210]
[Effect of the invention]
As is clear from the above description, the encoding device according to the present invention is an encoding device that encodes acoustic data, comprising: cut-out means for cutting out a predetermined number of pieces of acoustic data from an acoustic data string; conversion means for converting the cut-out acoustic data into spectrum data on the frequency axis; separation means for separating the spectrum data obtained by the conversion into low-frequency spectrum data up to f1 Hz and high-frequency spectrum data higher than f1 Hz; low-band encoding means for quantizing and encoding the separated low-frequency spectrum data; auxiliary information generating means for generating, from the separated high-frequency spectrum data, auxiliary information indicating characteristics of the high-frequency spectrum; high-band encoding means for encoding the generated auxiliary information; and output means for combining and outputting the code obtained by the low-band encoding means and the code obtained by the high-band encoding means, wherein f1 is less than or equal to half the sampling frequency f2 at which the acoustic data string was created.
[0211]
In the encoding device of the present invention, the converting means outputs, from the acoustic data string cut out by the cut-out means, a large number of low-frequency spectrum data at or below f1 and, at the same time, high-frequency spectrum data exceeding f1. The low-frequency spectrum data separated by the separating means is quantized and encoded, while auxiliary information representing the characteristics of the high-frequency spectrum is generated from the high-frequency spectrum data, and the high-band encoding means encodes the generated auxiliary information.
[0212]
Therefore, within a range in which the total amount of information does not increase significantly, the low-frequency part is handled in a manner equivalent to downsampling, and the acoustic signal can be encoded with high quality such that the high-frequency part can also be reproduced.
[0213]
Here, f1 may be f2/4, the converting means may convert the acoustic data into spectrum data of 0 to 2×f1 Hz, and the separating means may separate the spectrum data into low-frequency spectrum data of 0 to f1 Hz and high-frequency spectrum data of f1 to 2×f1 Hz. Alternatively, the low-frequency spectrum data up to f1 Hz may consist of n pieces of spectrum data, the cut-out means may cut out the number of pieces of acoustic data necessary to generate 2×n pieces of spectrum data, the converting means may convert the cut-out acoustic data into 2×n pieces of spectrum data, and the separating means may separate them into n pieces of low-frequency spectrum data and n pieces of high-frequency spectrum data. Alternatively, the cut-out means may cut out 2×n pieces of acoustic data in total, combining n pieces of acoustic data corresponding to one frame, which is the unit of encoding, with n/2 pieces of acoustic data belonging to each of the two frames adjacent to that frame, and the converting means may convert the cut-out 2×n pieces of acoustic data by MDCT into a spectrum of 0 to 2×f1 Hz including 2×n pieces of spectrum data.
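The band layout for the case f1 = f2/4 can be worked through in a few lines of Python. This is a hedged sketch: the example values f2 = 44100 Hz and n = 1024, the assumption that the 2×n spectrum values are spaced uniformly over 0 to 2×f1 Hz, and the function names are illustrative choices, and the transform itself is not modeled.

```python
def band_layout(f2, n):
    """Return (f1, bin_width) when f1 = f2 / 4 and 2*n bins span 0 .. 2*f1 Hz."""
    f1 = f2 / 4
    bin_width = (2 * f1) / (2 * n)  # uniform bin spacing is assumed
    return f1, bin_width


def separate(spectrum, n):
    """Split 2*n spectrum values into n low-band (0 .. f1) and n high-band (f1 .. 2*f1) values."""
    if len(spectrum) != 2 * n:
        raise ValueError("expected exactly 2*n spectrum values")
    return list(spectrum[:n]), list(spectrum[n:])


if __name__ == "__main__":
    f1, width = band_layout(44100, 1024)
    print(f"f1 = {f1} Hz, bin width = {width:.3f} Hz")
    low, high = separate([0.0] * 2048, 1024)
    print(len(low), len(high))
```

With these example values f1 comes out at 11025 Hz, so the low band occupies the lower half of the spectrum up to the Nyquist frequency 2×f1 = f2/2 and the high band occupies the upper half.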
[0214]
Furthermore, a decoding device according to the present invention is a decoding device that decodes encoded data input via a recording medium or a transmission medium, comprising: extraction means for extracting the low-band encoded data and the high-band encoded data included in the encoded data; low-band inverse quantization means for decoding and inverse-quantizing the low-band encoded data extracted by the extraction means to output low-frequency spectrum data at or below the frequency f1; auxiliary information decoding means for decoding the high-band encoded data extracted by the extraction means to generate auxiliary information representing characteristics of the high-frequency spectrum data; high-band inverse quantization means for outputting high-frequency spectrum data based on the auxiliary information generated by the auxiliary information decoding means; synthesis means for synthesizing the low-frequency spectrum data output by the low-band inverse quantization means and the high-frequency spectrum data output by the high-band inverse quantization means; inverse transform means for inversely transforming the spectrum data synthesized by the synthesis means into acoustic data on the time axis; and acoustic data output means for outputting the acoustic data inversely transformed by the inverse transform means in time order.
[0215]
In the decoding device of the present invention, the extraction means extracts the low-band encoded data and the high-band encoded data from the input encoded data, and the low-band inverse quantization means outputs low-frequency spectrum data at or below the frequency f1. The auxiliary information decoding means decodes the auxiliary information, and the high-band inverse quantization means outputs high-frequency spectrum data based on the auxiliary information. Therefore, from the same small amount of encoded information as in the conventional method, it is possible to decode a significantly larger amount of information than before, and thus to decode a high-quality acoustic signal.
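The decoder-side flow described above (rebuild the high band from auxiliary information, then synthesize it with the low band) can be sketched in Python as follows. This is only one possible reading: it assumes the auxiliary information for each high-band group consists of a low-band group number to copy from and an amplitude gain, that groups have equal width, and that synthesis is simple concatenation along the frequency axis. The function names are hypothetical, and decoding, inverse quantization, and the inverse transform are not modeled.

```python
def reconstruct_high_band(low_band, group_numbers, gains, group_size):
    """Build the high band by copying the indicated low-band groups and scaling them."""
    high_band = []
    for number, gain in zip(group_numbers, gains):
        start = number * group_size
        source = low_band[start:start + group_size]
        if len(source) != group_size:
            raise ValueError("group number points outside the low band")
        high_band.extend(gain * x for x in source)
    return high_band


def synthesize(low_band, high_band):
    """Join low-band and high-band spectrum data along the frequency axis."""
    return list(low_band) + list(high_band)


if __name__ == "__main__":
    low = [1.0, -2.0, 3.0, 4.0]
    high = reconstruct_high_band(low, group_numbers=[1, 0],
                                 gains=[0.5, 0.25], group_size=2)
    print(synthesize(low, high))
```

Because only a group number and a gain are carried per high-band group, the side information stays small compared with quantizing every high-band coefficient.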
[Brief description of the drawings]
FIG. 1 is a block diagram showing a functional configuration of a broadcast system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a change in state of an acoustic signal processed in the encoding device shown in FIG. 1.
FIG. 3 is a flowchart showing an operation in a scale factor determination process of the first quantization unit shown in FIG. 1;
FIG. 4 is a flowchart showing another operation in the scale factor determination process of the first quantization unit shown in FIG. 1.
FIG. 5 is a spectrum waveform diagram showing a specific example of auxiliary information (scale factor) generated by the second quantizing unit shown in FIG. 1;
FIG. 6 is a flowchart showing an operation in auxiliary information (scale factor) calculation processing of the second quantization unit shown in FIG. 1.
FIG. 7 is a diagram illustrating a position in a bit stream in which auxiliary information is stored by the stream output unit illustrated in FIG. 1.
FIG. 8 is a diagram illustrating another example when the stream output unit illustrated in FIG. 1 stores auxiliary information.
FIG. 9 is a diagram showing a comparison of processing between the encoding device shown in FIG. 1 and conventional example 1.
FIG. 10 is a diagram showing a comparison of processing between the encoding device shown in FIG. 1 and conventional example 2.
FIG. 11 is a diagram showing a comparison of spectrum data and characteristics between the encoding device shown in FIG. 1 and conventional examples 1 and 2.
FIG. 12 is a flowchart showing a procedure in which the 1024 low-frequency spectrum values are copied to the high-frequency part in the forward direction of the frequency axis by the second inverse quantization unit shown in FIG. 1.
FIG. 13 is a flowchart showing a procedure in which the 1024 low-frequency spectrum values are copied to the high-frequency part in the reverse direction of the frequency axis by the second inverse quantization unit shown in FIG. 1.
FIG. 14 is a spectrum waveform diagram showing a specific example of other auxiliary information (quantized value) generated by the second quantization unit shown in FIG. 1.
FIG. 15 is a flowchart showing an operation in another auxiliary information (quantized value) calculation process of the second quantization unit shown in FIG. 1.
FIG. 16 is a spectrum waveform diagram showing a specific example of other auxiliary information (position information) generated by the second quantization unit shown in FIG. 1;
FIG. 17 is a flowchart showing an operation in another auxiliary information (position information) calculation process of the second quantization unit shown in FIG. 1;
FIG. 18 is a spectrum waveform diagram showing a specific example of other auxiliary information (sign information) generated by the second quantization unit shown in FIG. 1;
FIG. 19 is a flowchart showing an operation in another auxiliary information (sign information) calculation process of the second quantization unit shown in FIG. 1;
FIG. 20 is a spectrum waveform diagram showing an example of a method for creating other auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1.
FIG. 21 is a flowchart showing an operation in another auxiliary information (copy information) calculation process of the second quantization unit shown in FIG. 1;
FIG. 22 is a spectrum waveform diagram showing a second example of a method for creating other auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1.
FIG. 23 is a flowchart showing an operation in a second calculation process of other auxiliary information (copy information) of the second quantization unit shown in FIG. 1;
FIG. 24 is a block diagram showing a configuration of a conventional AAC encoding apparatus and decoding apparatus.
[Explanation of symbols]
1 Broadcasting system
300 Encoding device
305 A/D converter
310 Acoustic data input unit
320 Conversion unit
330 Data separation unit
340 First quantization unit
345 Second quantization unit
350 First encoding unit
355 Second encoding unit
390 Stream output unit
400 Decoding device
410 Stream input unit
420 First decoding unit
425 Second decoding unit
430 First inverse quantization unit
435 Second inverse quantization unit
440 Inverse quantized data synthesis unit
480 Inverse conversion unit
490 Acoustic data output unit
495 D/A converter

Claims

An encoding device for encoding acoustic data,
A cutting means for cutting out a fixed number of continuous acoustic data from the acoustic data string,
Conversion means for converting the cut out acoustic data into spectrum data on the frequency axis;
Separating means for separating the spectral data obtained by the conversion into low-frequency spectral data up to f1 Hz and high-frequency spectral data higher than f1 Hz;
A low frequency band encoding means for quantizing and encoding the separated low frequency band spectrum data;
Auxiliary information generating means for generating auxiliary information indicating the characteristics of the high frequency part frequency spectrum from the separated high frequency part spectrum data,
High frequency band encoding means for encoding the generated auxiliary information;
Output means for combining and outputting the code obtained by the low frequency band encoding means and the code obtained by the high frequency band encoding means,
The f1 is less than or equal to half the sampling frequency f2 when the acoustic data string is created,
An encoding device characterized in that, with the high-frequency spectrum data and the low-frequency spectrum data each divided into a plurality of groups, the auxiliary information generating means generates, as the auxiliary information, for each group of the high-frequency band, information specifying the group of low-frequency spectrum data having the spectrum that most closely approximates the spectrum in that group.

The encoding device according to claim 1, wherein the auxiliary information generating means specifies the low-frequency spectrum for which the difference between the distance on the frequency axis from the group boundary to the peak of the high-frequency spectrum in the group and the distance on the frequency axis from the low-frequency group boundary to the peak of the low-frequency spectrum is minimum.

The encoding device according to claim 1, wherein the auxiliary information generating means specifies the low-frequency spectrum for which the difference value is minimum when an energy difference is taken over the same frequency width as the spectrum in the group of the high-frequency band.

The encoding device according to claim 1, wherein the information for specifying the spectrum of the low-frequency part is a number for specifying the group of the specified low-frequency part spectrum.

A decoding device that decodes encoded data input via a recording medium or a transmission medium,
Extraction means for extracting low-band encoded data and high-band encoded data included in the encoded data,
Low-frequency band inverse quantization means for decoding low-frequency band encoded data extracted by the extraction means and dequantizing to output low-frequency spectrum data having a frequency of f1 Hz or less;
Auxiliary information decoding means for generating auxiliary information representing the characteristics of the high frequency band spectrum data by decoding the high frequency band data extracted by the extraction means,
High-band inverse quantization means for outputting high-band spectrum data based on the auxiliary information generated by the auxiliary information decoding means;
Combining means for synthesizing the low-frequency part spectrum data output by the low-frequency part inverse quantization means and the high-frequency part spectral data output by the high-frequency part inverse quantization means;
Inverse conversion means for inversely converting spectral data synthesized by the synthesis means into acoustic data on a time axis;
Acoustic data output means for outputting the acoustic data inversely transformed by the inverse transformation means in order of time,
The auxiliary information is, with the high-frequency spectrum data and the low-frequency spectrum data each divided into a plurality of groups, information specifying, for each group of the high-frequency band, the group of low-frequency spectrum data having the spectrum that most closely approximates the spectrum in that group,
A decoding device characterized in that the high-band inverse quantization means generates predetermined noise in each group of the high-frequency band based on the auxiliary information, and adds the generated noise to the spectrum data to generate the high-frequency spectrum data.

An acoustic data distribution system for distributing acoustic data compression-encoded into a low bit rate bit stream via a recording medium or a transmission medium,
An acoustic data distribution system comprising: the encoding device according to claim 1; and the decoding device according to claim 5.

An encoding method for encoding acoustic data, comprising:
A step of cutting out a certain number of continuous acoustic data from the acoustic data sequence;
A conversion step of converting the cut out acoustic data into spectrum data on the frequency axis;
A separation step of separating the spectral data obtained by the conversion into low-frequency spectral data up to f1 Hz and high-frequency spectral data higher than f1 Hz;
A low band encoding step for quantizing and encoding the separated low band spectrum data;
Auxiliary information generating step for generating auxiliary information indicating the characteristics of the high frequency part frequency spectrum from the separated high frequency part spectrum data,
A high band encoding step for encoding the generated auxiliary information;
An output step of combining and outputting the code obtained in the low frequency band encoding step and the code obtained in the high frequency band encoding step;
The f1 is less than or equal to half the sampling frequency f2 when the acoustic data string is created,
An encoding method characterized in that, in the auxiliary information generating step, with the high-frequency spectrum data and the low-frequency spectrum data each divided into a plurality of groups, information specifying, for each group of the high-frequency band, the group of low-frequency spectrum data having the spectrum that most closely approximates the spectrum in that group is generated as the auxiliary information.

A decoding method for decoding encoded data input via a recording medium or a transmission medium,
An extraction step for extracting low-band encoded data and high-band encoded data included in the encoded data,
A low-band inverse quantization step for outputting low-band spectrum data having a frequency of f1 Hz or less by decoding and inverse-quantizing the low-band encoded data extracted by the extraction step;
An auxiliary information decoding step for generating auxiliary information representing the characteristics of the high-frequency spectrum data by decoding the high-frequency data extracted in the extraction step;
A high-band inverse quantization step for outputting high-band spectral data based on the auxiliary information generated by the auxiliary information decoding step;
A synthesis step of synthesizing the low-frequency part spectrum data output by the low-frequency part inverse quantization step and the high-frequency part spectrum data output by the high-frequency part inverse quantization step;
An inverse transformation step for inversely transforming the spectral data synthesized by the synthesis step into acoustic data on a time axis;
An acoustic data output step for outputting the acoustic data inversely transformed by the inverse transformation step in time order,
The auxiliary information is, with the high-frequency spectrum data and the low-frequency spectrum data each divided into a plurality of groups, information specifying, for each group of the high-frequency band, the group of low-frequency spectrum data having the spectrum that most closely approximates the spectrum in that group,
A decoding method characterized in that, in the high-band inverse quantization step, predetermined noise is generated in each group of the high-frequency band based on the auxiliary information and is added to the spectrum data to generate the high-frequency spectrum data.

A program used in an encoding device that encodes an input acoustic signal,
A program that causes a computer to function as each unit included in the encoding device according to claim 1.

A program used in a decoding device for decoding encoded data input via a recording medium or a transmission medium,
A program that causes a computer to function as each unit included in the decoding device according to claim 5.

A computer-readable recording medium on which the program according to claim 9 is recorded.

A computer-readable recording medium on which the program according to claim 10 is recorded.