JP3923783B2

JP3923783B2 - Encoding device and decoding device

Info

Publication number: JP3923783B2
Application number: JP2001337869A
Authority: JP
Inventors: 孝祐西尾; 峰生津島; 直也田中; 武志則松
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2001-11-02
Filing date: 2001-11-02
Publication date: 2007-06-06
Anticipated expiration: 2021-11-02
Also published as: JP2003140692A

Abstract

PROBLEM TO BE SOLVED: To provide an encoding device and a decoding device that realize encoding and decoding of wideband digital acoustic signals. SOLUTION: The encoding device 100 is provided with: a 1st coding part 132 for coding the low frequency region of the spectral data obtained by converting an input acoustic signal of a fixed duration; a 2nd quantization part 133 for generating auxiliary information expressing the features of the high frequency region of the same spectral data; a 2nd coding part 134 for coding the generated auxiliary information; and a stream output part 140 for outputting the data coded by the 1st coding part 132 and the data coded by the 2nd coding part 134.

Description

【０００１】
【Technical Field of the Invention】
The present invention relates to high-quality encoding and decoding techniques for digital audio data.
【０００２】
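【Prior Art】
Various audio compression schemes for compressing and encoding audio data have been developed. MPEG-2 Advanced Audio Coding (hereinafter, AAC) is one of them. AAC is specified in detail in the standard "ISO/IEC 13818-7 (MPEG-2 Advanced Audio Coding, AAC)".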
【０００３】
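First, the conventional encoding and decoding procedures are described with reference to FIG. 22. FIG. 22 is a block diagram showing the configuration of an encoding device 300 and a decoding device 400 based on the conventional MPEG-2 AAC scheme. The encoding device 300 compresses and encodes an input acoustic signal according to the MPEG-2 AAC encoding scheme, and comprises an acoustic signal input unit 310, a transform unit 320, a quantization unit 331, an encoding unit 332 and a stream output unit 340.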
【０００４】
The acoustic signal input unit 310 cuts the input digital audio data, sampled at 44.1 kHz for example, into blocks of 1024 consecutive samples. This 1024-sample encoding unit is called a "frame".
【０００５】
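The transform unit 320 converts the time-domain sample data cut out by the acoustic signal input unit 310 into frequency-domain spectral data by the MDCT. The 1024 samples of spectral data obtained at this point are divided into a plurality of groups, each set to contain one or more samples of spectral data. Each group approximates a critical band of human hearing and is called a "scale factor band".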
【０００６】
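The quantization unit 331 quantizes the spectral data obtained from the transform unit 320 with a predetermined number of bits. In MPEG-2 AAC, the spectral data within each scale factor band is quantized using one normalization coefficient per band. This normalization coefficient is called a "scale factor", and the result of quantizing each spectral datum with its scale factor is called a "quantized value". The encoding unit 332 Huffman-encodes the data quantized by the quantization unit 331, namely each scale factor and the spectral data quantized with it, into the stream format. In doing so, the encoding unit 332 takes the difference between the scale factors of adjacent scale factor bands within a frame, and Huffman-encodes these differences together with the scale factor of the first scale factor band.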
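The differential step applied to the scale factors before Huffman coding can be sketched as follows. This is a minimal illustration, not the encoder defined by the standard: the function names `scalefactor_deltas` and `scalefactors_from_deltas` are hypothetical, and the Huffman stage itself is omitted.

```python
def scalefactor_deltas(scalefactors):
    """Split per-band scale factors into (first value, adjacent differences).

    Hypothetical helper: only the differencing step is shown; the values
    returned here are what would then be Huffman-encoded.
    """
    if not scalefactors:
        raise ValueError("at least one scale factor band is required")
    first = scalefactors[0]
    deltas = [cur - prev for prev, cur in zip(scalefactors, scalefactors[1:])]
    return first, deltas


def scalefactors_from_deltas(first, deltas):
    """Inverse operation, as a decoder would perform it."""
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out


first, deltas = scalefactor_deltas([24, 32, 26, 18])
restored = scalefactors_from_deltas(first, deltas)
```

Because neighbouring bands tend to have similar scale factors, the differences are usually small numbers, which is what makes a variable-length code effective on them.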
【０００７】
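The stream output unit 340 converts the encoded signal obtained from the encoding unit 332 into an MPEG-2 AAC bitstream and outputs it. The bitstream output from the encoding device 300 is transmitted to the decoding device 400 via a transmission medium, or recorded on a recording medium such as an optical disc (CD, DVD, etc.), a semiconductor memory, or a hard disk.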
【０００８】
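The decoding device 400 decodes the bitstream encoded by the encoding device 300, and comprises a stream input unit 410, a decoding unit 421, an inverse quantization unit 422, an inverse transform unit 430 and an acoustic signal output unit 440.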
【０００９】
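The stream input unit 410 receives the bitstream encoded by the encoding device 300, either via a transmission medium or by playback from a recording medium, and extracts the encoded signal from it. The decoding unit 421 decodes the extracted encoded signal from the stream format into quantized data; in MPEG-2 AAC, this means decoding the Huffman-coded data.
【００１０】
The inverse quantization unit 422 inversely quantizes the quantized data decoded by the decoding unit 421. The inverse transform unit 430 converts the frequency-domain spectral data obtained by the inverse quantization unit 422 into time-domain sample data; MPEG-2 AAC uses the IMDCT (Inverse Modified Discrete Cosine Transform) for this. The acoustic signal output unit 440 sequentially combines the time-domain sample data obtained by the inverse transform unit 430 and outputs it as digital audio data.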
【００１１】
【発明が解決しようとする課題】
上記方式においては、音響データの符号化において、入力された音響データの音質がどの程度保持されるかを表す１つの目安として、符号化後の再生帯域がある。例えば入力信号のサンプリング周波数が４４．１ｋＨｚの時、再生帯域は２２．０５ｋＨｚとなり、この２２．０５ｋＨｚ分、又は２２．０５ｋＨｚに近い広帯域なデータを劣化させることなく効率的に符号化し、かつ符号化結果の全データを転送レートの範囲内で復号化装置に転送しきることによって、復号化装置において高音質な音響信号を得ることができる。すなわち、符号化装置側では高音質な符号化を達成することができる。しかし、再生帯域の広さはスペクトルデータの数に影響し、スペクトルデータの数は情報量に影響する。例えば入力信号のサンプリング周波数が４４．１ｋＨｚの時、１０２４サンプルのスペクトルデータが２２．０５ｋＨｚ分のデータに対応し、２２．０５ｋＨｚの再生帯域を確保するためには、１０２４サンプルのスペクトルデータ全てを伝送する必要がある。
【００１２】
ところが、携帯電話等の低転送レートの伝送路を考慮すると、実際に１０２４サンプルのスペクトルデータ全てを伝送することは、データ量が大きすぎて現実的ではない。つまり、転送レートに合わせたデータ量で、この再生帯域の全スペクトルデータを転送しようとすると、各周波数帯域に割り当てることができる情報量がわずかとなり、その結果、量子化ノイズによる影響が大きくなり、符号化による音質劣化を招く。
【００１３】
このため、ＭＰＥＧ−２ＡＡＣに限らず、多くの音響信号符号化方式においては、スペクトルデータに聴覚的重み付けを行い、優先度の低いデータは伝送しないことにより、効率的な音響信号の伝送を実現している。これに従えば、再生帯域に関しては、聴覚的に優先度の高い低域部の符号化精度を向上させるため、低域部の符号化情報に十分なデータ量を割り当て、優先度の低い高域部は伝送対象外とされる確率が高い。
【００１４】
しかしながら、ＭＰＥＧ−２ＡＡＣ方式においてはこのような工夫がなされているにもかかわらず、音響信号の符号化に対して、さらなる高品質化、圧縮効率の向上が求められている。つまり、低転送レートであっても、高域部の音響信号を伝送することの要望が高まってきている。
【００１５】
本発明の目的は、符号化後の情報量を大幅に増加させることなく音響信号の高音質な符号化及びその復号化を実現できる符号化装置及び復号化装置を提供することである。
【００１６】
【課題を解決するための手段】
上記目的を達成するために本発明の符号化装置は、入力された音響信号を符号化する符号化装置であって、一定時間分の入力音響信号を変換して得られる複数のグループに分けられたスペクトルデータから、前記各グループ内のスペクトルデータを正規化する正規化係数と、前記正規化係数を用いて前記各グループ内の前記各スペクトルデータを量子化して得られる量子化値と、前記各スペクトルデータの正負を表す正または負の符号と、前記各スペクトルデータの周波数軸上の位置とを含む４種類の情報で表された周波数の低域部データを符号化する第１符号化手段と、周波数高域部の前記各グループにおける前記スペクトルデータに近似した低域部スペクトルデータを特定する情報と、特定された前記低域部スペクトルデータを整形するための情報として、高域部スペクトルデータの特徴を、前記４種類の情報のうち、１種類以上３種類以下の情報で表した整形のための情報とを含む補助情報を生成する補助情報生成手段と、生成された前記補助情報を符号化する第２符号化手段と、前記第１符号化手段によって符号化されたデータと、前記第２符号化手段によって符号化されたデータとを出力する出力手段とを備えることを特徴とする。本発明の上記符号化装置において、補助情報生成手段は、一定時間分の入力音響信号を変換して得られる複数のグループに分けられたスペクトルデータのうち、周波数の高域部の特徴を、低域部より少ない情報で表した補助情報を生成し、第２符号化手段は、生成された前記補助情報を符号化する。
【００１７】
上記目的を達成するために本発明の復号化装置は、一定時間分の入力音響信号を変換して得られる複数のグループに分けられたスペクトルデータから、前記各グループ内のスペクトルデータを正規化するための正規化係数と、前記正規化係数を用いて前記各グループの前記各スペクトルデータを量子化して得られる量子化値と、前記各スペクトルデータの正負を表す正または負の符号と、前記各スペクトルデータの周波数軸上の位置とを含む４種類の情報で表された、周波数の低域部データを符号化して得られた第１符号化データと、周波数高域部の前記各グループにおける前記スペクトルデータに近似した低域部スペクトルデータを特定する情報と、特定された前記低域部スペクトルデータを整形するための情報として、高域部スペクトルデータの特徴を、前記４種類の情報のうち、１種類以上３種類以下の情報で表した整形のための情報とを含む補助情報を符号化して得られた第２符号化データとを含む符号化データを入力し、復号化する復号化装置であって、入力符号化データから前記第２符号化データを分離する符号化データ分離手段と、入力符号化データ中の前記第１符号化データを復号化し、周波数の低域部を表すスペクトルデータを出力する第１復号化手段と、入力された符号化データから分離された前記第２符号化データを復号化し、前記補助情報中の前記低域部スペクトルデータを特定する情報に基づいて、前記第１復号化手段によって出力された前記スペクトルデータのうち、特定された低域部スペクトルデータを高域部の前記各グループにコピーし、前記補助情報中の前記整形のための情報に基づいて、コピーされたスペクトルデータを整形することによって周波数の高域部を表すスペクトルデータを生成し、出力する第２復号化手段と、前記第１復号化手段によって出力されたスペクトルデータと、前記第２復号化手段によって出力されたスペクトルデータとを合成して変換し、時間軸上の音響信号として出力する音響信号出力手段とを備えることを特徴とする。本発明の上記復号化装置において、符号化データ分離手段は、入力符号化データから前記第２符号化データを分離し、第２復号化手段は、分離された前記第２符号化データを復号化して前記低域部スペクトルデータを特定する情報と整形のための情報とを含む前記補助情報を生成し、生成された前記補助情報に基づいて周波数の高域部を表すスペクトルデータを生成し、出力する。
【００１８】
なお、本発明は、本発明の符号化装置を備える送信装置と本発明の復号化装置を備える受信装置とからなる放送システムとして実現したり、それら符号化装置及び復号化装置の特徴的な構成要素を処理ステップとする符号化方法及び復号化方法として実現したり、それらステップをコンピュータに実行させるプログラムとして実現したりすることもできる。そして、そのプログラムをＣＤ−ＲＯＭ等のコンピュータ読み取り可能な記録媒体や通信路等の伝送媒体を介して流通させることができることは言うまでもない。
【００１９】
【発明の実施の形態】
以下、図面を参照しながら本発明の実施の形態における符号化装置１００及び復号化装置２００について詳細に説明する。また本発明の実施の形態においては、従来方式としてＭＰＥＧ−２ＡＡＣを例にとって説明を行う。図１は、本発明の実施の形態における符号化装置１００及び復号化装置２００の構成を示すブロック図である。
＜符号化装置１００＞
【００２０】
符号化装置１００は、入力された音響信号の低域部をＭＰＥＧ−２ＡＡＣ符号化方式に基づいて圧縮符号化するとともに、高域部の音響信号の特徴を表す補助情報を生成し、それを圧縮符号化して、前記低域部の符号化ビットストリームに組み込んで出力する装置であって、音響信号入力部１１０、変換部１２０、第１の量子化部１３１、第１の符号化部１３２、第２の量子化部１３３、第２の符号化部１３４及びストリーム出力部１４０から構成される。
【００２１】
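The acoustic signal input unit 110 takes digital audio data sampled at 44.1 kHz, the same kind of input signal as in MPEG-2 AAC, and cuts it out in cycles of approximately 23.2 msec (every 1024 samples), overlapping each block with the 512 samples before and after it.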
【００２２】
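As in the conventional device, the transform unit 120 converts the time-domain sample data cut out by the acoustic signal input unit 110 into frequency-domain spectral data. MPEG-2 AAC uses the MDCT (Modified Discrete Cosine Transform): the 1024 input points are overlapped with the 512 samples before and after them to form 2048 samples of time-domain data, which are transformed into 2048 samples of spectral data. Because the MDCT yields symmetric spectral data, only one half, 1024 samples, needs to be encoded.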
【００２３】
さらに、変換部１２０は、変換された１０２４サンプルのスペクトルデータを、それぞれ１サンプル以上（実用的には４の倍数）のスペクトルデータを含む複数のスケールファクターバンドに分類する。このスケールファクターバンドは、この規格において、各スケールファクターバンドに含まれるサンプル（スペクトルデータ）数が周波数に応じて定められており、低域部においては少数のサンプルごとに細かく区切られ、高域部になるほど多数のサンプルを含むよう大きく区切られている。ＭＰＥＧ−２ＡＡＣにおいては、１フレームのスペクトルデータに対応するスケールファクターバンドの数もサンプリング周波数に応じて定められている。例えば、サンプリング周波数が４４．１ｋＨｚの場合は、１フレームに含まれるスケールファクターバンドの数は４９個であり、４９個のスケールファクターバンドの中に１０２４サンプルのスペクトルデータが含まれている。一方、このように定められたスケールファクターバンドのうち、どのスケールファクターバンドを伝送するかは特に規定されておらず、伝送路の転送レートに応じて、最も好ましいスケールファクターバンドを選択して伝送すればよい。例えば、伝送路の転送レートが９６ｋｂｐｓの場合、１フレームのうちの低域部４０スケールファクターバンド（６４０サンプル）のみを選択して伝送するようにしてもよい。
【００２４】
なお、本実施の形態においては、変換部１２０が、変換後のスペクトルデータを、独自に定めた区切り方及び数のスケールファクターバンドに分類した場合について説明する。
【００２５】
第１の量子化部１３１は、変換部１２０の出力するスペクトルデータを入力し、入力されたスペクトルデータの低域部の各スケールファクターバンドにつき、それぞれスケールファクターを決定するとともに、決定したスケールファクターでそのスケールファクターバンド内のスペクトルを量子化し、量子化結果である量子化値を第１の符号化部１３２に出力する。例えばこの場合、入力信号のサンプリング周波数が４４．１ｋＨｚであるから、再生帯域は２２．０５ｋＨｚとなるが、このうちの低域部、例えば１１．０２５ｋＨｚ以下の帯域について、各スケールファクターにおいてスペクトルデータから得られる量子化値が例えば４ビット以下の数値で表されるように、スケールファクターを計算し、そのスケールファクターを用いてスケールファクターバンド内の各スペクトルを正規化した後に量子化する。
【００２６】
第１の符号化部１３２は、第１の量子化部１３１で量子化されたデータ、すなわち、全スペクトルデータのうち、低域部側の５１２サンプルに対応する各スケールファクターバンド内の量子化値及びその量子化に用いられたスケールファクターなどを、第１の符号化信号としてハフマン符号化して所定のストリーム用のフォーマットに変換する。
【００２７】
第２の量子化部１３３は、変換部１２０の出力するスペクトルデータを入力し、第１の量子化部１３１において量子化されない帯域、すなわち１１．０２５ｋＨｚを超える高域部の補助情報のみを計算して出力する。
【００２８】
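Auxiliary information is simplified information, calculated from the high-band spectral data, that represents the high-band acoustic signal which a conventional scheme would not transmit. That is, it is information representing the features of the high-frequency part of the spectral data obtained by transforming a fixed duration of the input acoustic signal. Specifically, it may be: a scale factor for each high-band scale factor band, chosen so that the quantized value of the absolute-maximum spectral datum (the spectral datum with the largest absolute value) in that band becomes 1, together with that quantized value; the position of the absolute-maximum spectral datum within each scale factor band; the quantized value of the absolute-maximum spectral datum of each scale factor band when a scale factor common to all high-band scale factor bands is defined; a sign indicating the polarity of the spectrum at a predetermined position in the high band; and information indicating the copy method when the high-band spectrum is represented by copying a similar low-band spectrum. Furthermore, noise information indicating the amplitude of white noise or the like mixed in across the low and high bands, not only the high band, may be added to the auxiliary information.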
【００２９】
第２の符号化部１３４は、第２の量子化部１３３が出力した補助情報を所定のストリーム用のフォーマットにハフマン符号化し、第２の符号化情報として出力する。
【００３０】
ストリーム出力部１４０は、第１の符号化部１３２から出力される第１の符号化信号にヘッダ情報及びその他必要に応じた副情報を付加してＭＰＥＧ−２ＡＡＣの符号化ビットストリームに変換し、かつ第２の符号化部１３４から出力された第２の符号化信号を、上記ビットストリーム中の従来の復号化装置では無視される又はその動作が規定されていない領域に格納する。
【００３１】
具体的には、ストリーム出力部１４０は、第２の符号化部１３４から出力される符号化信号を、ＭＰＥＧ−２ＡＡＣの符号化ビットストリームにおけるＦｉｌｌＥｌｅｍｅｎｔやＤａｔａＳｔｒｅａｍＥｌｅｍｅｎｔに格納する。符号化装置１００から出力されたビットストリームは伝送媒体を介して復号化装置２００に伝送されたり、ＣＤやＤＶＤ等の光ディスク、半導体、ハードディスク等の記録媒体に記録されたりする。
【００３２】
なお、ＭＰＥＧ−２ＡＡＣでは入力の音響信号に応じて、ＭＤＣＴの変換長を変更することができる。変換長が２０４８サンプルのものをＬＯＮＧブロック、変換長が２５６サンプルのものをＳＨＯＲＴブロックといい、これらをまとめてブロックサイズという。本説明は特に断りのない限りＬＯＮＧブロックについて行うが、ＳＨＯＲＴブロックにおいても同様の処理が行える。
【００３３】
なおまた実際のＭＰＥＧ−２ＡＡＣの符号化処理では、ＧａｉｎＣｏｎｔｒｏｌやＴＮＳ（ＴＥＭＰＯＲＡＬＮＯＩＳＥＳＨＡＰＩＮＧ）、聴覚心理モデル、Ｍ／ＳＳｔｅｒｅｏ、ＩｎｔｅｎｓｉｔｙＳｔｅｒｅｏ、Ｐｒｅｄｉｃｔｉｏｎ等のツール利用、及びブロックサイズの切り替え、ビットリザーバー等を使用する場合がある。
＜復号化装置２００＞
【００３４】
復号化装置２００は、入力された符号化ビットストリームから前記補助情報に基づいて高域部の付加された広帯域の音響データを復元する装置であって、ストリーム入力部２１０、第１の復号化部２２１、第１の逆量子化部２２２、第２の復号化部２２３、第２の逆量子化部２２４、逆量子化データ合成部２２５、逆変換部２３０及び音響信号出力部２４０から構成される。
【００３５】
ストリーム入力部２１０は、伝送媒体を介したり、記録媒体から再生したりして符号化装置１００において生成されたビットストリームを入力し、従来の復号化装置が復号するべき領域に格納されている第１の符号化信号と、従来の復号化装置が無視するかまたはその情報に対する動作が規定されていない領域に格納されている第２の符号化信号とを取り出して、それぞれ第１の復号化部２２１と第２の復号化部２２３とに出力する。
【００３６】
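The first decoding unit 221 receives the first encoded signal output by the stream input unit 210 and decodes the Huffman-coded data from the stream format into quantized data. The first inverse quantization unit 222 inversely quantizes the quantized data decoded by the first decoding unit 221 and outputs the low-band spectral data. Here, the first inverse quantization unit 222 outputs 512 samples of spectral data (the maximum is 1024), which represent a reproduction band of 11.025 kHz (the maximum reproduction band being 22.05 kHz).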
【００３７】
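The second decoding unit 223 receives the second encoded signal output by the stream input unit 210, decodes it, and outputs the auxiliary information. The second inverse quantization unit 224 generates noise by a predetermined procedure based on the spectral data output from the first inverse quantization unit 222, for example a copy of part or all of the low-band spectral data, or white noise or pink noise. It then shapes this noise according to the auxiliary information output by the second decoding unit 223 and outputs the high-band spectral data.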
【００３８】
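Specifically, the second inverse quantization unit 224 copies the low-band spectral data output by the first inverse quantization unit 222 into the high band. Then, for each high-band scale factor band, it takes as a coefficient the ratio between the absolute maximum of the spectral data copied into the band and the value obtained by inversely quantizing the quantized value "1" with the scale factor value given for that band in the auxiliary information, and multiplies each spectral datum in the band by this coefficient to restore the high-band spectrum. In addition, the second inverse quantization unit 224 generates white noise of a predetermined amplitude in advance, adjusts its amplitude according to the noise information in the auxiliary information, adds it to the restored spectrum, and outputs the high-band spectral data.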
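The shaping step described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the text does not give the inverse quantization formula, so `dequantize` assumes an AAC-style law (magnitude to the power 4/3, gain 2^(sf/4), no offset), and the names `shape_band`, `noise` and `noise_gain` are hypothetical.

```python
def dequantize(q, sf):
    """Assumed AAC-style inverse quantizer; the formula is NOT given in the text."""
    sign = -1.0 if q < 0 else 1.0
    return sign * abs(q) ** (4.0 / 3.0) * 2.0 ** (sf / 4.0)


def shape_band(copied, sf, noise=None, noise_gain=0.0):
    """Scale a copied low-band segment so its peak matches dequantize(1, sf).

    copied     -- spectral data copied from the low band into one high band
    sf         -- scale factor carried for this band in the auxiliary information
    noise      -- optional pre-generated noise samples (hypothetical parameter)
    noise_gain -- amplitude taken from the noise information (hypothetical)
    """
    peak = max(abs(x) for x in copied)
    target = dequantize(1, sf)
    coeff = target / peak if peak > 0 else 0.0
    shaped = [x * coeff for x in copied]
    if noise is not None:
        shaped = [s + noise_gain * n for s, n in zip(shaped, noise)]
    return shaped


shaped = shape_band([0.5, -2.0, 1.0], sf=32)
```

The coefficient is taken here as target peak divided by copied peak, so that after scaling the largest magnitude in the band equals the value that the transmitted scale factor stands for.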
【００３９】
逆量子化データ合成部２２５は、第１の逆量子化部２２２の出力するスペクトルデータと第２の逆量子化部２２４の出力するスペクトルデータとを合成する。逆変換部２３０は、ＭＰＥＧ−２ＡＡＣに従って、逆量子化データ合成部２２５から出力された周波数軸上のスペクトルデータを、ＩＭＤＣＴを用いて時間軸上の１０２４サンプルのサンプルデータに変換する。音響信号出力部２４０は、逆変換部２３０で得られた時間軸上のサンプルデータを順次組み合わせ、デジタル音響データとして出力する。
【００４０】
以上のように本実施の形態によれば、低域部は従来の符号化を行い、高域部を極めて少ない情報量で符号化を行うことにより、情報量の合計が、従来と比べて大幅に増加しない範囲で高品質の音響信号を符号化することができる。
【００４１】
また本実施の形態における符号化装置１００及び復号化装置２００の構成は、従来の符号化装置３００に第２の量子化部１３３及び第２の符号化部１３４を追加し、かつ従来の復号化装置４００に第２の復号化部２２３及び第２の逆量子化部２２４を追加しただけであるため、既存の符号化装置３００及び復号化装置４００の構成を大幅に変更することなく実現できるという効果がある。
【００４２】
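In addition, the bitstream generated by the encoding device 100 of this embodiment can also be decoded by the conventional decoding device 400.
Although this embodiment has been described using MPEG-2 AAC as an example, it is clearly also applicable to other audio coding schemes, including new schemes that do not yet exist.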
【００４３】
また本実施の形態においては、第２の量子化部１３３における入力データは、変換部１２０から出力されるスペクトルデータのみとしたが、これに限ったものでなくてもよく、第１の量子化部１３１の出力を逆量子化した値を別途入力してもよい。
図２は、本実施の形態の他の構成例である符号化装置１０１及び復号化装置２００の構成を示すブロック図である。なお、図１と同様の構成についてはすでに説明しているので、図１と同一の符号を付し、説明を省略する。
【００４４】
符号化装置１０１が符号化装置１００と異なる点は、新たに、逆量子化部１５２を備えることである。この符号化装置１０１において、第１の量子化部１５１は、変換部１２０によって出力された１０２４点のスペクトルデータすべてを量子化し、その量子化結果を逆量子化部１５２に出力するとともに、そのうちの低域部５１２点の量子化結果を第１の符号化部１３２に出力する。
【００４５】
逆量子化部１５２は、第１の量子化部１５１によって一旦、量子化された量子化値を逆量子化し、逆量子化結果であるスペクトルデータを第２の量子化部１５３に出力する。
第２の量子化部１５３は変換部１２０からのスペクトルデータを入力せず、逆量子化部１５２の逆量子化結果であるスペクトルデータを入力し、入力されたスペクトルデータに基づいて高域部の補助情報を生成する。
【００４６】
なお、ここでは第２の量子化部１５３は変換部１２０からのスペクトルデータを入力せず、逆量子化部１５２からのスペクトルデータに基づいて高域部の補助情報を生成するとしたが、本発明はこの例に限定されず、第２の量子化部１５３は、ある部分については変換部１２０からのスペクトルデータを入力し、ある部分については逆量子化部１５２からのスペクトルデータを入力するとしてもよい。
【００４７】
図３は、図１に示した符号化装置１００において処理される音響信号の状態変化を示す図である。図３（ａ）は、図１に示した音響信号入力部１１０によって切り出される時間軸上の１０２４のサンプルデータを示す波形図である。図３（ｂ）は、時間軸上のサンプルデータが図１に示した変換部１２０のＭＤＣＴによって変換された後の周波数軸上のスペクトルデータを示す波形図である。なお、図３（ａ）及び図３（ｂ）において、サンプルデータ及びスペクトルデータはアナログ波形で示されているが、実際には、いずれもデジタル信号である。以下の波形図においても同様である。
【００４８】
音響信号入力部１１０には、４４．１ｋＨｚでサンプリングされたデジタル音響信号が入力される。音響信号入力部１１０は、この入力信号から毎１０２４サンプルを切り出すタイミングでその前後５１２サンプルをオーバーラップさせて切り出し、変換部１２０に出力する。変換部１２０は、合計２０４８サンプルのデータをＭＤＣＴするが、ＭＤＣＴによって得られるスペクトルが左右対称の波形となるため、その半分の１０２４サンプルに対応する図３（ｂ）に示すようなスペクトルデータを生成する。
【００４９】
図３（ｂ）に示すスペクトルデータは、縦軸に、周波数スペクトルの値、すなわち、図３（ａ）において１０２４サンプルの電圧値で表されていた音響信号の周波数成分の量（大きさ）を、前記サンプル数に対応する１０２４点で表している。また、符号化装置１００に入力されるデジタル音響信号のサンプリング周波数が４４．１ｋＨｚであるので、スペクトルデータの再生帯域は、２２．０５ｋＨｚとなっている。さらに、ＭＤＣＴによって得られるスペクトルは図３（ｂ）に示すように負の値をとる場合があるので、ＭＤＣＴによって得られたスペクトルを符号化する場合には、スペクトルの正負の符号も合わせて符号化する必要がある。以下では、符号化の符号との混同を避けるため、スペクトルデータの正負の符号を表す情報を「サイン情報」という。
【００５０】
図４は、図１に示したストリーム出力部１４０によって補助情報が格納されるビットストリーム中の位置を示す図である。図において、高域部のスペクトルを表す補助情報は、符号化された後、第２の符号化信号としてビットストリーム中の音響符号化信号として認識されない領域に格納される。
【００５１】
図４（ａ）において斜線で示す部分は、例えば、ビットストリームのデータ長を合わせるために「０」で埋められる領域（ＦｉｌｌＥｌｅｍｅｎｔ）であって、この領域に、高域部のスペクトルを表す補助情報、すなわち第２の符号化信号が格納されていても、従来の復号化装置４００では復号化すべき符号化信号とは認識されず、無視される。
【００５２】
また、図４（ｂ）において斜線で示す部分は、例えば、ＤａｔａＳｔｒｅａｍＥｌｅｍｅｎｔ（ＤＳＥ）という領域であって、この領域は、将来の拡張のためＭＰＥＧ−２ＡＡＣの規格によってビット長などの物理的構造だけが規定された領域である。この領域は、ＦｉｌｌＥｌｅｍｅｎｔと同様、ここに高域部のスペクトルを表す補助情報が格納されていても、従来の復号化装置４００では無視されるか又はそのデータが読み取られたとしても読み取られたデータに対する復号化装置４００の動作が規定されていない領域である。
【００５３】
また、以上ではＭＰＥＧ−２ＡＡＣの規格によって従来の復号化装置４００では無視されるようなビットストリーム中の領域に第２の符号化信号を格納するとしたが、それ以外にも、ヘッダ情報の所定の位置に組み込んでもよいし、第１の符号化信号中の所定の位置に第２の符号化信号を組み込んでもよいし、その両方にまたがって組み込んでもよい。またビットストリーム中に第２の符号化信号を格納するために、ヘッダにおいても第１の符号化信号においても、連続した領域を確保しなくてもよい。すなわち、図４（ｃ）のように、ヘッダ情報と第１の符号化情報との中に、非連続に第２の符号化信号を組み込んでもよい。
【００５４】
図５は、図１に示したストリーム出力部１４０が補助情報を格納する場合の他の例を示す図である。図５（ａ）は、第１の符号化信号のみがフレームごとに連続して格納されているストリーム１を示している。図５（ｂ）は、補助情報が符号化された第２の符号化信号のみが、ストリーム１に対応するフレームごとに連続して格納されているストリーム２を示している。
【００５５】
ストリーム出力部１４０は、第２の符号化信号を、第１の符号化信号を格納したビットストリームであるストリーム１とは全く別のストリーム２に格納してもよい。例えば、ストリーム１とストリーム２とは、異なるチャンネルで伝送されるビットストリームである。
【００５６】
このように、第１の符号化信号と第２の符号化信号をまったく異なるビットストリームで伝送することにより、入力音響信号の基本的な情報を表す低域部分をあらかじめ伝送又は蓄積しておき、必要に応じて高域部情報を後から付加することができるという効果がある。
【００５７】
以上のように構成された符号化装置１００及び復号化装置２００の動作について、以下、図６、図７、図９、図１１、図１３、図１５、図１７及び図１９〜図２１のフローチャートを用いて説明する。
【００５８】
図６は、図１に示した第１の量子化部のスケールファクター決定処理における動作を示すフローチャートである。第１の量子化部１３１は、まず、スケールファクターの初期値として、各スケールファクターバンドに共通のスケールファクターを定め（Ｓ９１）、そのスケールファクターを用いて１フレーム分の音響データとして伝送されるべき低域部スペクトルデータをすべて量子化するとともに、求められたスケールファクターの前後の差分を求め、その差分と先頭のスケールファクターと各量子化値とをハフマン符号化する（Ｓ９２）。なお、ここでの量子化及び符号化は、ビット数のカウントのためだけに行うので、処理を簡略化するため、データのみについて行い、ヘッダなどの情報は付加しないものとする。次いで、第１の量子化部１３１は、ハフマン符号化後のデータのビット数が所定のビット数を超えたか否かを判断し（Ｓ９３）、超えていれば、スケールファクターの初期値を下げ（Ｓ１０１）、そのスケールファクターの値を用いて、同じ低域部スペクトルデータにつき、量子化とハフマン符号化とをやり直した上（Ｓ９２）、ハフマン符号化後の１フレーム分の低域部符号化データのビット数が所定のビット数を超えたか否かを判断して（Ｓ９３）、所定ビット数以下になるまでこの処理を繰り返す。
【００５９】
第１の量子化部１３１は、低域部符号化データのビット数が所定のビット数を超えていなければ、スケールファクターバンドごとに以下の処理を繰り返し、各スケールファクターバンドのスケールファクターを決定する（Ｓ９４）。まず、スケールファクターバンド内の各量子化値を逆量子化し（Ｓ９５）、それぞれの逆量子化値とそれに対応する元のスペクトルデータとの各絶対値の差分を求めて合計する（Ｓ９６）。さらに、求められた差分の合計が許容範囲内の値であるか否かを判断し（Ｓ９７）、許容範囲内であれば、次のスケールファクターバンドにつき、上記の処理を繰り返す（Ｓ９４〜Ｓ９８）。一方、許容範囲を超えていれば、スケールファクターの値を大きくして当該スケールファクターバンドのスペクトルデータを量子化するとともに（Ｓ１００）、その量子化値を逆量子化して（Ｓ９５）、逆量子化値と対応するスペクトルデータとの絶対値の差分を合計する（Ｓ９６）。さらに、差分の合計が許容範囲内かどうかを判断して（Ｓ９７）許容範囲を超えていれば、許容範囲内となるまでスケールファクターを順次大きくし（Ｓ１００）、上記の処理（Ｓ９５〜Ｓ９７及びＳ１００）を繰り返す。
【００６０】
第１の量子化部１３１は、すべてのスケールファクターバンドにつき、スケールファクターバンド内の量子化値を逆量子化した値と元のスペクトルデータとの絶対値の差分の合計が許容範囲となるようなスケールファクターを決定すると（Ｓ９８）、決定されたスケールファクターを用いて、再度、１フレーム分の低域部スペクトルデータを量子化し、各スケールファクターの差分と先頭のスケールファクターと各量子化値とをハフマン符号化し、低域部符号化データのビット数が所定のビット数を超えているか否かを判定する（Ｓ９９）。低域部符号化データのビット数が所定のビット数を超えていれば、それが所定のビット数以下になるまでスケールファクターの初期値を下げた後（Ｓ１０１）、各スケールファクターバンド内のスケールファクターを決定する処理（Ｓ９４〜Ｓ９８）を繰り返す。低域部符号化データのビット数が所定のビット数を超えていなければ（Ｓ９９）、そのときの各スケールファクターの値を、各スケールファクターバンドのスケールファクターに決定する。
【００６１】
なお、スケールファクターバンド内の量子化値を逆量子化した値と元のスペクトルデータとの絶対値の差分の合計が許容範囲となるかどうかの判断は、聴覚心理モデルなどのデータに基づいて行われる。
【００６２】
また、ここではスケールファクターの初期値を比較的大きな数値に設定し、ハフマン符号化後の低域部符号化データのビット数が、所定のビット数を超えた場合には、順次、スケールファクターの初期値を下げていく方法でスケールファクターを決定しているが、必ずしもこのようにする必要はない。例えば、あらかじめスケールファクターの初期値を低い値に設定しておき、その初期値を徐々に増加していき、低域部符号化データの全体のビット数が所定のビット数を最初に超えた段階で、直前に設定されていたスケールファクターの初期値を用いて各スケールファクターバンドのスケールファクターを決定するようにしてもよい。
【００６３】
さらに、ここでは１フレーム分の低域部符号化データ全体のビット数が所定のビット数を超えないように各スケールファクターバンドのスケールファクターを決定したが、必ずしもこのようにしなくてよい。例えば、各スケールファクターバンドにおいて、スケールファクターバンド内の各量子化値が所定のビット数を超えないようスケールファクターを決定するようにしてもよい。以下に、図７を用いて、この処理における第１の量子化部１３１の動作を説明する。
【００６４】
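FIG. 7 is a flowchart showing the operation of the first quantization unit 131 shown in FIG. 1 in another scale factor determination process. The first quantization unit 131 calculates a scale factor by the following procedure for every scale factor band in the low band to be encoded (S1), and within each scale factor band applies the procedure to every spectral datum (S2).
【００６５】
First, the first quantization unit 131 quantizes the spectral datum according to the quantization formula using a given scale factor value (S3), and determines whether the quantized value exceeds the predetermined number of bits allotted to a quantized value, for example 4 bits (S4).
【００６６】
If the quantized value exceeds 4 bits, the unit adjusts the scale factor value (S8) and quantizes the same spectral datum with the adjusted value (S3). It again determines whether the resulting quantized value exceeds 4 bits (S4), and repeats the adjustment (S8) and the quantization with the adjusted scale factor (S3) until the quantized value of that spectral datum fits in 4 bits.
【００６７】
If the quantized value fits in 4 bits, the unit quantizes the next spectral datum with the given scale factor value (S3).
Once the quantized values of all spectral data in one scale factor band fit in 4 bits (S5), the first quantization unit 131 adopts the scale factor value at that time as the scale factor of that band (S6).
【００６８】
When a scale factor has been determined for every scale factor band (S7), the first quantization unit 131 ends the process.
Through the above process, one scale factor is determined for each low-band scale factor band to be encoded. The first quantization unit 131 quantizes the low-band spectral data with the scale factors thus determined, and outputs the resulting 4-bit quantized values and the 8-bit scale factors to the first encoding unit 132.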
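The loop of FIG. 7 can be sketched as below. The quantization formula is not spelled out in the text, so this sketch assumes a simple uniform quantizer with step size 2^(sf/4) and a magnitude limit of 15 (4 bits, sign carried separately); both are assumptions, as are the names `quantize` and `choose_scalefactor`.

```python
def quantize(x, sf):
    """Assumed quantizer: uniform step 2**(sf/4), rounded to nearest."""
    step = 2.0 ** (sf / 4.0)
    q = int(abs(x) / step + 0.5)
    return -q if x < 0 else q


def choose_scalefactor(band, max_q=15, sf_start=0, sf_limit=255):
    """Return the first scale factor for which every value in the band fits.

    Mirrors S3/S4/S8: quantize, test against the bit limit, adjust, retry.
    sf_limit=255 reflects the 8-bit scale factor mentioned in the text.
    """
    sf = sf_start
    while sf <= sf_limit:
        if all(abs(quantize(x, sf)) <= max_q for x in band):
            return sf
        sf += 1  # S8: coarser step under the assumed formula
    raise ValueError("no scale factor in range fits the band")


band = [256.0, -100.0, 3.0]
sf = choose_scalefactor(band)
quantized = [quantize(x, sf) for x in band]
```

Under this assumed formula a larger scale factor means a coarser step; a codec that uses the opposite convention would adjust in the other direction at S8.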
【００６９】
図８は、図１に示した第２の量子化部１３３によって生成される補助情報（スケールファクター）の具体例を示すスペクトル波形図である。なお、図８において、低域部の周波数軸上に示す区切りは、それぞれ本実施の形態において定めたスケールファクターバンドの区切りを示している。また、高域部において周波数方向に破線で示す区切りは、本実施の形態において定めた高域部のスケールファクターバンドの区切りを示している。以下の波形図においても同様である。
【００７０】
変換部１２０から出力されるスペクトルデータのうち、図８に実線の波形で示す再生帯域１１．０２５ｋＨｚ以下の低域部は、第１の量子化部１３１に出力され、従来どおり量子化される。一方、図８に破線の波形で示す再生帯域１１．０２５ｋＨｚを超える再生帯域２２．０５ｋＨｚまでの高域部は、第２の量子化部１３３によって計算される補助情報（スケールファクター）によって表される。以下、図８の具体例を用い、図９のフローチャートに従って第２の量子化部１３３の補助情報（スケールファクター）の計算手順を説明する。
【００７１】
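FIG. 9 is a flowchart showing the operation of the second quantization unit 133 shown in FIG. 1 in the auxiliary information (scale factor) calculation process.
For every scale factor band in the high band, from above 11.025 kHz up to 22.05 kHz, the second quantization unit 133 calculates, by the following procedure, the optimal scale factor that makes the quantized value of the absolute-maximum spectral datum in that band equal to "1" (S11).
【００７２】
The second quantization unit 133 identifies the absolute-maximum spectral datum (peak) in the first scale factor band of the high band above 11.025 kHz (S12). In the example of FIG. 8, suppose the peak identified in the first scale factor band is at position ① and has the value "256".
【００７３】
In the same manner as in the flowchart of FIG. 7, the second quantization unit 133 applies the peak value "256" and an initial scale factor value to the quantization formula, and calculates the scale factor value sf for which the formula yields a quantized value of "1" (S13). In this case, a value such as sf = 24 is obtained as the scale factor that quantizes the peak value "256" to "1".
【００７４】
Once the scale factor sf = 24 that quantizes the peak to "1" has been obtained for the first scale factor band (S14), the second quantization unit 133 identifies the peak of the spectral data in the next scale factor band (S12). If, for example, the peak is at position ② with the value "312", it calculates the scale factor for which the peak value "312" quantizes to "1", for example sf = 32 (S13).
【００７５】
Likewise, for the third high-band scale factor band it calculates the scale factor that quantizes the value "288" of peak ③ to "1", for example sf = 26, and for the fourth scale factor band the scale factor that quantizes the value "203" of peak ④ to "1", for example sf = 18.
【００７６】
When the scale factor that quantizes the peak value to "1" has been calculated for every high-band scale factor band (S14), the second quantization unit 133 outputs the scale factors obtained for the bands to the second encoding unit 134 as the auxiliary information of the high band, and ends the process.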
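The per-band search of FIG. 9 might look like the following sketch. It assumes the same hypothetical uniform quantizer with step 2^(sf/4) as a stand-in for the unspecified quantization formula, so the scale factors it produces depend on that assumption and need not match the illustrative values (24, 32, 26, 18) quoted in the text. The name `peak_scalefactors` is hypothetical.

```python
def quantize(x, sf):
    """Assumed quantizer: uniform step 2**(sf/4), rounded to nearest."""
    step = 2.0 ** (sf / 4.0)
    q = int(abs(x) / step + 0.5)
    return -q if x < 0 else q


def peak_scalefactors(hf_bands, sf_limit=255):
    """For each high band, find the smallest sf that quantizes its peak to 1."""
    result = []
    for band in hf_bands:
        peak = max(band, key=abs)           # S12: absolute-maximum datum
        for sf in range(sf_limit + 1):      # S13: search for quantized value 1
            if abs(quantize(peak, sf)) == 1:
                result.append(sf)
                break
        else:
            # e.g. an all-zero band can never quantize to 1
            raise ValueError("no scale factor maps the peak to 1")
    return result


aux_scalefactors = peak_scalefactors([[10.0, -256.0, 3.0], [312.0, 1.0]])
```

Only one small integer per band is produced, which is the point of the scheme: the band's envelope is summarized by a single value instead of every spectral coefficient.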
【００７７】
以上のようにして第２の量子化部１３３によって補助情報（スケールファクター）が生成されるが、この補助情報（スケールファクター）は、５１２点のスペクトルデータで表されていた高域部を、各スケールファクターの値を０〜２５５までの値で表せば、高域部における各スケールファクターバンド（ここでは４つ）につき、それぞれ８ビットで表すことができる。また、この各スケールファクターの差分をハフマン符号化するようにすれば、データ量をさらに低減できる可能性がある。これに対し、この高域部の５１２点のスペクトルデータを低域部と同様に従来の方法で量子化及びハフマン符号化したとすると、最低でも１５０ビット程度のデータ量になると予測される。従って、この補助情報は、高域部の各スケールファクターバンドにつき１つのスケールファクターを示しているに過ぎないが、従来の方法に従って高域部を量子化する場合に比べて、データ量が大きく低減されていることがわかる。
【００７８】
また、このスケールファクターは、各スケールファクターバンドにおけるピーク値（絶対値）にほぼ比例した値を示しており、高域部における５１２点で一定値をとるスペクトルデータあるいは低域部のスペクトルデータの一部または全部のコピーにスケールファクターを乗算して得られるスペクトルデータは、入力音響信号に基づいて得られたスペクトルデータを大まかに復元しているといえる。また、スケールファクターバンド毎に、バンド内にコピーされたスペクトルデータの絶対最大値と、そのバンドに対応するスケールファクター値を用いて量子化値「１」を逆量子化した値との比率を係数として、バンド内の各スペクトルデータに乗じることにより、より精度良くスペクトルデータを復元することができる。さらに、高域部の波形の相違は、低域部ほど聴覚的にはっきり識別されるものではないので、このようにして得られた補助情報は、高域部の波形を表す情報として十分であるといえる。
【００７９】
なお、ここでは、高域部の各スケールファクターバンド内のスペクトルデータの量子化値が「１」となるようスケールファクターを計算したが、必ずしも「１」である必要はなく、他の値に定めておいてもよい。
【００８０】
またここでは、補助情報としてスケールファクターのみを符号化したが、これに限ったものでなく、量子化値、特徴的なスペクトルの位置情報、スペクトルの正負の符号を表すサイン情報及びノイズ生成方法等を併せて符号化してもよい。またこれらを２つ以上組み合わせて符号化してもよい。この場合、補助情報内に、振幅の比率を表す係数や絶対最大スペクトルデータの位置などを前記スケールファクターと組み合わせて符号化すれば、特に有効である。
【００８１】
図１０は、図１に示した第２の量子化部１３３によって生成される補助情報（量子化値）の具体例を示すスペクトル波形図である。また、図１１は、図１に示した第２の量子化部１３３の補助情報（量子化値）計算処理における動作を示すフローチャートである。
【００８２】
第２の量子化部１３３は、再生帯域１１．０２５ｋＨｚを超える再生帯域２２．０５ｋＨｚまでの高域部のすべてのスケールファクターバンドにつき共通のスケールファクター値、例えば「１８」をあらかじめ定めておき、そのスケールファクター値「１８」を用いて、スケールファクターバンドごとに、そのスケールファクターバンドにおける絶対最大値スペクトルデータ（ピーク）の量子化値を計算する（Ｓ２１）。
【００８３】
第２の量子化部１３３は、再生帯域１１．０２５ｋＨｚを超える高域部の最初のスケールファクターバンドにおける絶対最大スペクトルデータ（ピーク）を特定する（Ｓ２２）。図１０の具体例において、最初のスケールファクターバンド内で特定されたピークの位置が▲１▼で、そのときのピークの値が「２５６」であったとする。
【００８４】
第２の量子化部１３３は、量子化値を計算する公式に、あらかじめ定めた共通のスケールファクター値「１８」とピークの値「２５６」とをあてはめ、量子化値を計算する（Ｓ２３）。例えば、この場合、ピーク値「２５６」をスケールファクター値「１８」で量子化すると、量子化値「６」が算出される。
【００８５】
最初のスケールファクターバンドについて、ピーク値「２５６」の量子化値「６」が求められると（Ｓ２４）、第２の量子化部１３３は、次のスケールファクターバンドについて、スペクトルデータのピークを特定し（Ｓ２２）、例えば、特定されたピークの位置が▲２▼で、その値が「３１２」であった場合、スケールファクターの値を「１８」とするピーク値「３１２」の量子化値、例えば「１０」を計算する（Ｓ２３）。
【００８６】
同様にして、第２の量子化部１３３は、高域部における３番目のスケールファクターバンドについて、スケールファクターの値を「１８」とするピーク▲３▼の値「２８８」の量子化値「９」を計算し、４番目のスケールファクターバンドについて、スケールファクターの値を「１８」とするピーク▲４▼の値「２０３」の量子化値「５」を計算する。
【００８７】
このようにして、高域部のすべてのスケールファクターバンドについて、スケールファクターを「１８」に固定した場合のピーク値の量子化値が計算されると（Ｓ２４）、第２の量子化部１３３は、計算によって得られた各スケールファクターバンドの量子化値を、高域部の補助情報として第２の符号化部１３４に出力し、処理を終了する。
【００８８】
以上のようにして第２の量子化部１３３によって補助情報（量子化値）が生成されるが、この補助情報（量子化値）は、５１２点のスペクトルデータで表されていた高域部を、４つのスケールファクターバンドにつき、それぞれ４ビットの量子化値で表している。これに対し、前述の補助情報（スケールファクター）では、高域部を、４つのスケールファクターバンドにつき、それぞれ８ビットのスケールファクターで表していたので、これと比較すると、高域部のデータ量がより低減されている。また、この量子化値は、各スケールファクターバンドにおけるピーク値（絶対値）の振幅を大まかに表しており、高域部における５１２点で一定値をとるスペクトルデータあるいは低域部のスペクトルデータの一部または全部のコピーに、これを単純に乗算して得られるスペクトルデータであっても、入力音響信号に基づいて得られたスペクトルデータを大まかに復元しているといえる。また、スケールファクターバンド毎に、バンド内にコピーされたスペクトルデータの絶対最大値と、あらかじめ定められていたスケールファクター値を用いてそのバンドに対応する量子化値を逆量子化した値との比率を係数として、バンド内の各スペクトルデータに乗じることにより、さらに精度良くスペクトルデータを復元することができる。
【００８９】
なおここでは、第２の符号化情報として伝送される量子化値に対応するスケールファクター値は、あらかじめ定めたものにしたが、最適なスケールファクター値を計算し、第２の符号化情報に付加して伝送してもよい。例えば、量子化値の最大値が７となるようにスケールファクターを選択すれば、量子化値を表すビット数が３ビットですむので、量子化値の伝送に必要な情報量はより少なくて済む。
【００９０】
なお、補助情報として量子化値のみ、または量子化値とスケールファクターのみを符号化したが、これに限ったものでなくてよく、スケールファクター、特徴的なスペクトルの位置情報、スペクトルデータのサイン情報及びノイズ生成方法等を符号化してもよい。またこれらを２つ以上組み合わせて符号化してもよい。
【００９１】
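FIG. 12 is a spectral waveform diagram showing a specific example of the auxiliary information (position information) generated by the second quantization unit 133 shown in FIG. 1. FIG. 13 is a flowchart showing the operation of the second quantization unit 133 shown in FIG. 1 in the auxiliary information (position information) calculation process.
【００９２】
For every scale factor band in the high band, from above 11.025 kHz up to 22.05 kHz, the second quantization unit 133 identifies the position of the absolute-maximum spectral datum in that band by the following procedure (S31).
【００９３】
The second quantization unit 133 identifies the absolute-maximum spectral datum (peak) in the first scale factor band of the high band above 11.025 kHz (S32). In the example of FIG. 12, suppose the peak identified in the first scale factor band is at position ① and is the 22nd spectral datum from the start of the band. The second quantization unit 133 holds this peak position, "the 22nd spectral datum from the start of the scale factor band" (S33).
【００９４】
Once the peak position has been identified and held for the first scale factor band (S34), the second quantization unit 133 identifies the peak of the spectral data in the next scale factor band (S32). Suppose, for example, that the peak is at position ② and is the 60th spectral datum from the start of the band. The second quantization unit 133 holds this peak position, "the 60th spectral datum from the start of the scale factor band" (S33).
【００９５】
In the same way, for the third high-band scale factor band the second quantization unit 133 identifies and holds the position of peak ③, "the first spectral datum of the scale factor band", and for the fourth scale factor band the position of peak ④, "the 25th spectral datum from the start of the scale factor band".
【００９６】
When the peak position has been identified and held for every high-band scale factor band (S34), the second quantization unit 133 outputs the held peak positions of the bands to the second encoding unit 134 as the auxiliary information of the high band, and ends the process.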
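The position search of FIG. 13 reduces to an arg-max of the absolute value within each band. The sketch below is illustrative only; the function name `peak_positions` is hypothetical, and positions are returned as 0-based offsets from the band start, whereas the text counts from 1 ("the 22nd spectral datum").

```python
def peak_positions(hf_bands):
    """Return, per high band, the 0-based offset of the absolute-maximum datum.

    If several data share the maximum magnitude, the first one is reported;
    the text does not say how ties are resolved, so this is an assumption.
    """
    positions = []
    for band in hf_bands:
        if not band:
            raise ValueError("scale factor band must not be empty")
        positions.append(max(range(len(band)), key=lambda i: abs(band[i])))
    return positions


aux_positions = peak_positions([[1.0, -5.0, 2.0], [0.0, 0.0, 3.0, -1.0]])
```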
【００９７】
以上のようにして第２の量子化部１３３によって補助情報（位置情報）が生成されるが、この補助情報（位置情報）は、５１２点のスペクトルデータで表されていた高域部を、４つのスケールファクターバンドにつき、それぞれ６ビットの位置情報で表している。
【００９８】
この場合、復号化装置２００において、第２の逆量子化部２２４は、低域部の５１２サンプル分のスペクトルデータの一部または全部を、第２の復号化部２２３から入力された補助情報（位置情報）に応じて高域部側の５１２サンプルデータとしてコピーする。
コピーの手順は、１つ以上のスケールファクターバンドにおけるスペクトルデータのピーク情報を元に、類似したデータを第１の逆量子化部２２２より出力されたスペクトルデータより抽出し、その一部又は全部をコピーすることで達成される。
【００９９】
また第２の逆量子化部２２４においては、必要に応じてコピーしたスペクトルデータの振幅を調整する。振幅の調整は各スペクトルデータにあらかじめ定められた係数、例えば「０．５」として、この係数を乗じることで達成する。この係数は固定値でもよいし、帯域ごと、あるいはスケールファクターバンドごとに変更してもよいし、第１の逆量子化部２２２より出力されるスペクトルデータに応じて変更してもよい。
【０１００】
また、上記ではあらかじめ定めた係数を用いるが、補助情報として、この係数の値を第２の符号化情報内に付加してもよい。または係数としてスケールファクター値を第２の符号化情報に付加してもよいし、係数としてスケールファクターバンド内のピークの量子化値を第２の符号化情報に付加してもよい。また振幅調整方法はこれに限ったものではなく、他の方法を用いてもよい。
【０１０１】
なおここでは、補助情報として位置情報のみ、または位置情報と係数情報のみを符号化したが、これに限ったものでなくてよく、スケールファクター、量子化値、スペクトルのサイン情報及びノイズ生成方法等を符号化してもよい。またこれらを２つ以上組み合わせて符号化してもよい。
また、低域部側のスペクトルデータを高域部側のスペクトルデータとしてコピーしているが、これに限らず、高域部側のスペクトルデータは第２の符号化情報のみから生成してもよい。
【０１０２】
図１４は、図１に示した第２の量子化部１３３によって生成される補助情報（サイン情報）の具体例を示すスペクトル波形図である。また、図１５は、図１に示した第２の量子化部１３３の補助情報（サイン情報）計算処理における動作を示すフローチャートである。
【０１０３】
第２の量子化部１３３は、再生帯域１１．０２５ｋＨｚを超える再生帯域２２．０５ｋＨｚまでの高域部のすべてのスケールファクターバンドにつき、各スケールファクターバンドのあらかじめ定めた位置、例えばスケールファクターバンド中央におけるスペクトルデータのサイン情報を以下の手順に従って特定する（Ｓ４１）。
【０１０４】
第２の量子化部１３３は、再生帯域１１．０２５ｋＨｚを超える高域部の最初のスケールファクターバンドの中央位置におけるスペクトルデータのサイン情報を調べ（Ｓ４２）、その値を保持する。例えば、最初のスケールファクターバンドの中央位置におけるスペクトルデータのサイン符号は、「＋」である。第２の量子化部１３３は、この符号「＋」を１ビットの値「１」で表して保持する。また、この符号が「−」であった場合は、「０」で表して保持する。
【０１０５】
最初のスケールファクターバンドについて、スケールファクターバンドの中央位置におけるスペクトルデータのサイン情報が保持されると（Ｓ４３）、第２の量子化部１３３は、次のスケールファクターバンドについて、中央位置におけるスペクトルデータの符号を調べる（Ｓ４２）。例えば、調べられた符号が「＋」であったとすると、第２の量子化部１３３は、２番めのスケールファクターバンドの中央位置におけるスペクトルデータのサイン情報として「１」を保持する。
【０１０６】
以下同様にして、第２の量子化部１３３は、高域部における３番目のスケールファクターバンド中央位置におけるスペクトルデータの符号「＋」を調べ、そのサイン情報「１」を保持するとともに、４番目のスケールファクターバンド中央位置におけるスペクトルデータの符号「＋」を調べ、そのサイン情報「１」を保持する。
【０１０７】
このようにして、高域部のすべてのスケールファクターバンドについて、中央位置のスペクトルデータのサイン情報が保持されると（Ｓ４３）、第２の量子化部１３３は、保持していた各スケールファクターバンドのサイン情報を、高域部の補助情報として第２の符号化部１３４に出力し、処理を終了する。
【０１０８】
以上のようにして第２の量子化部１３３によって補助情報（サイン情報）が生成されるが、この補助情報（サイン情報）は、５１２点のスペクトルデータで表されていた高域部を、４つのスケールファクターバンドにつき、それぞれ１ビットのサイン情報で表しており、非常に短いデータ長で高域部のスペクトルを表すことができる。
【０１０９】
この場合、復号化装置２００において、第２の逆量子化部２２４は、低域部の５１２サンプル分のスペクトルデータの一部または全部を高域部側スペクトルとしてコピーし、第２の復号化部２２３から入力されたサイン情報に応じて、あらかじめ定められた位置のスペクトルデータの符号を決定する。
【０１１０】
なお、ここでは、高域部の各スケールファクターバンド中央位置の符号を表したサイン情報を補助情報（サイン情報）としたが、スケールファクターバンド中央の位置に限定されず、例えば、各ピーク位置のサイン情報であっても良いし、スケールファクターバンド先頭のサイン情報であっても良いし、それ以外の所定の位置でもよい。
【０１１１】
またここでは、伝送する符号（サイン情報）に対応するスペクトルデータの位置はあらかじめ定めたものになっているが、これは第１の逆量子化部２２２の出力に応じて変更してもよいし、各スケールファクターバンドのサイン情報がどの位置のサイン情報であるかを示す位置情報を、第２の符号化情報に付加して伝送してもよい。
【０１１２】
また第２の逆量子化部２２４においては、必要に応じてコピーしたスペクトルデータの振幅を調整する。振幅の調整は、各スペクトルデータにあらかじめ決められた係数、例えばその値を「０．５」として、その係数を乗じることで達成できる。
この係数は固定値でもよいし、帯域ごとに、あるいはスケールファクターバンドごとに変更してもよいし、第１の逆量子化部２２２より出力されるスペクトルデータに応じて変更してもよい。また振幅調整方法はこれに限ったものではなく、他の方法を用いてもよい。
【０１１３】
なおここでは、あらかじめ定めた係数を用いたが、この係数の値を補助情報として第２の符号化情報に付加してもよい。また、その係数としてスケールファクター値を第２の符号化情報に付加してもよいし、係数として量子化値を第２の符号化情報に付加してもよい。
【０１１４】
さらにここでは、補助情報としてサイン情報のみ、またはサイン情報と係数情報とのみ、またはサイン情報と位置情報とのみ、またはサイン情報と位置情報と係数情報とのみを符号化したが、これに限ったものでなく、量子化値、スケールファクター、特徴的なスペクトルの位置情報、及びノイズ生成方法等を符号化してもよい。またこれらを２つ以上組み合わせて符号化してもよい。
【０１１５】
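In this embodiment the low-band spectral data is copied as the high-band spectral data, but this is not a limitation; the high-band spectral data may be generated from the second encoded information alone.
In the above, the sign "+" is represented by the 1-bit value "1" and the sign "-" by "0", but the representation of signs in the auxiliary information (sign information) is not limited to this, and other values may be used.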
【０１１６】
図１６は、図１に示した第２の量子化部１３３によって生成される補助情報（コピー情報）の作成方法の一例を示すスペクトル波形図である。図１６（ａ）は、高域部の最初のスケールファクターバンドにおけるスペクトルを示す波形図である。図１６（ｂ）は、補助情報（コピー情報）によって特定される低域部のスペクトル波形の一例を示す波形図である。また、図１７は、図１に示した第２の量子化部１３３の補助情報（コピー情報）計算処理における動作を示すフローチャートである。
【０１１７】
第２の量子化部１３３は、再生帯域１１．０２５ｋＨｚを超える再生帯域２２．０５ｋＨｚまでの高域部のすべてのスケールファクターバンドにつき、そのスケールファクターバンド先頭からのピークの位置ｎ（先頭からｎ番目）に対し、低域部においてスケールファクターバンド先頭からのピークの位置がｎに最も近い値となるスケールファクターバンドの番号Ｎを、以下の手順に従って特定する（Ｓ５１）。
【０１１８】
第２の量子化部１３３は、再生帯域１１．０２５ｋＨｚを超える高域部の最初のスケールファクターバンドにおける絶対最大スペクトルデータ（ピーク）の位置ｎを特定する（Ｓ５２）。その結果、例えば、図１６（ａ）に示すように、特定されたピークの位置が▲１▼で、そのスペクトルがこのスケールファクターバンドのｎ＝２２のスペクトルデータであったとする。
【０１１９】
第２の量子化部１３３は、スペクトルの周波数が再生帯域１１．０２５ｋＨｚ以下の低域部におけるスペクトルのすべての（正負の両方を含む）ピークの位置を特定する（Ｓ５３）。
次いで、第２の量子化部１３３は、低域部で特定されたすべてのピークについて、ピークからスケールファクターバンドの先頭までの位置がｎに最も近いスケールファクターバンドをサーチし、そのスケールファクターバンドの番号Ｎと、そのサーチの方向とピークのサイン情報とを特定する（Ｓ５４）。
【０１２０】
具体的には、第２の量子化部１３３は、特定された（正負の両方を含む）全ピークにつき、低周波側のピークから順次、そのピークからの位置がｎに最も近いスケールファクターバンドの先頭をサーチする。サーチの方向は、ピークからさらに低周波の方向に向かってサーチする場合（１）と、ピークからさらに高周波の方向に向かってサーチする場合（２）との２通りがある。また、高域部のピークと正負の符号が反転している低域部のピークについても、ピークからさらに低周波の方向に向かってサーチする場合（３）と、ピークからさらに高周波の方向に向かってサーチする場合（４）との２通りがある。
【０１２１】
これらのうち、サーチ方向が（２）と（４）の場合には、このピーク情報に基づいて低域部のスペクトル波形をコピーした場合には、図１６（ｂ）に示すように高域部のピークの位置と低域部のピークの位置とがスケールファクターバンド内で左右（周波数軸方向）に反転した波形がコピーされるため、例えば（１）と（３）とのサーチ方向を順方向とし、（２）と（４）とを逆方向として、サーチ方向の順逆を表す情報を添付することが必要である。また、サーチ方向が（３）と（４）との場合は、図１６（ｂ）に示すように高域部のピークの位置と低域部のピークの位置とが上下（縦軸方向）に反転した波形がコピーされるため、高域部のピーク値と低域部のピーク値との正負の符号が反転しているか否かを示す情報を添付することが必要である。
【０１２２】
第２の量子化部１３３は、低域部で特定されたピークが正の値をとるピークであれば（１）と（２）とのサーチ方向で、低域部で特定されたピークが負の値をとるピークであれば（３）と（４）との合わせて４通りの方向についてサーチを行い、そのサーチ結果のうち、ピークからの位置がｎに最も近いスケールファクターバンドの番号を特定する。この場合、あらかじめｎとの誤差範囲を所定の値、例えば「５」に設定しておき、前記４通りのサーチ結果のうちから、ピークからの位置がｎに最も近いスケールファクターバンドを選択して、そのスケールファクターバンドの番号Ｎを特定する。併せて、高域部のピーク値と低域部のピーク値との正負の符号が反転しているか否かを示すサイン情報と、サーチ方向の順逆を表す情報とを特定する。
【０１２３】
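For example, suppose that search direction (1) yields scale factor band number N = 3 with a position error of "1" from the peak, for the low-band spectrum shown at (1) in FIG. 16(b). Suppose likewise that search direction (2) yields N = 18 with an error of "5" for the spectrum shown at (2) in FIG. 16(b), search direction (3) yields N = 12 with an error of "4" for the spectrum shown at (3) in FIG. 16(b), and search direction (4) yields N = 10 with an error of "2" for the spectrum shown at (4) in FIG. 16(b). Of the four band numbers identified, the second quantization unit 133 selects N = 3, whose error of "1" makes its distance from the peak closest to n. It also generates sign information "1", representing the sign "+" of the low-band peak, and search direction information "1", representing a search from the peak toward lower frequencies. If the sign of the peak is "-", the sign information is "0"; if the search runs from the peak toward higher frequencies, the search direction information is "0".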
【０１２４】
高域部の最初のスケールファクターバンドについて、スケールファクターバンドの番号Ｎ＝３とサイン情報「１」とサーチ方向情報「１」とが特定されると（Ｓ５５）、第２の量子化部１３３は、上記と同様にして次のスケールファクターバンドについて、スケールファクターバンドの番号Ｎとそのサイン情報とそのサーチ方向情報とを特定する。
【０１２５】
このようにして、高域部のすべてのスケールファクターバンドについて、そのスケールファクターバンドにおける先頭からのピークの位置ｎに対し、スケールファクターバンド先頭からのピークの位置がｎに最も近い値となる低域部のスケールファクターバンドの番号Ｎとそのサイン情報とそのサーチ方向情報とが特定されると（Ｓ５５）、第２の量子化部１３３は、特定された高域部の各スケールファクターバンドに対応する低域部のスケールファクターバンドの番号Ｎとサイン情報とサーチ方向情報とを高域部の補助情報（コピー情報）として第２の符号化部１３４に出力し、処理を終了する。
【０１２６】
この場合、復号化装置２００において、第１の符号化信号を従来の手順に従って復号化すると、低域部側の５１２サンプルのスペクトルデータが得られる。第２の逆量子化部２２４では、第２の復号化部２２３から出力されたスケールファクターバンド番号に該当するスペクトルデータの一部または全部を高域部側スペクトルとしてコピーする。また第２の逆量子化部２２４においては、必要に応じてコピーしたスペクトルデータの振幅を調整する。振幅の調整は、各スペクトルにあらかじめ決められた係数、例えばその値を「０．５」として、その係数を乗じることで達成できる。
【０１２７】
この係数は固定値でもよいし、帯域ごと、スケールファクターバンドごとに変更してもよいし、第１の逆量子化部２２２より出力されるスペクトルデータに応じて変更してもよい。
【０１２８】
なおここでは、振幅の調整に、あらかじめ定めた係数を用いたが、この係数の値を補助情報として第２の符号化情報に付加してもよい。また係数としてスケールファクター値を第２の符号化情報に付加してもよいし、係数として量子化値を第２の符号化情報に付加してもよい。また振幅調整方法はこれに限ったものではなく、他の方法を用いてもよい。
【０１２９】
なお、ここでは、高域部の補助情報（コピー情報）としてスケールファクターバンドの番号Ｎのほかにそのサイン情報とサーチ方向情報とを抽出したが、高域部について伝送可能な情報量に応じて、サイン情報とサーチ方向情報とは省略してもよい。また、サイン情報は、低域部のピークの符号が「＋」であれば「１」、「−」であれば「０」とし、サーチ方向情報は、ピークからさらに低周波の方向に向かってサーチした場合は「１」、ピークからさらに高周波の方向に向かってサーチした場合は「０」として表したが、サイン情報における低域部のピークの符号及びサーチ方向情報のサーチ方向の表し方は、それぞれこれらに限定されず、他の値で表してもよい。
【０１３０】
また、ここでは、低域部において特定された各ピークの位置からその距離がｎに最も近い値となるスケールファクターバンドの先頭をサーチしたが、本発明はこの例に限定されず、低域部の各スケールファクターバンド先頭からその距離がｎに最も近い値となるピークをサーチしてもよい。
【０１３１】
図１８は、図１に示した第２の量子化部１３３によって生成される補助情報（コピー情報）の作成方法の第２の例を示すスペクトル波形図である。図１９は、図１に示した第２の量子化部１３３の補助情報（コピー情報）の第２の計算処理における動作を示すフローチャートである。
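[0132]
For every scale factor band in the high-frequency part, which extends from above the 11.025 kHz reproduction band up to the 22.05 kHz reproduction band, the second quantization unit 133 identifies, by the following procedure, the number N of the low-frequency scale factor band whose spectral difference (energy difference) from all the spectra in that high-frequency scale factor band is smallest (S61). Here, the number of low-frequency spectra compared with the high-frequency part is equal to the number of spectra in the high-frequency scale factor band, and the identified number N is the number of the scale factor band at which those spectra begin.
[0133]
For every scale factor band in the low-frequency part (S62), the second quantization unit 133 obtains the difference between the high-frequency spectrum and the low-frequency spectrum over a frequency width that starts at the head of that scale factor band and contains the same number of spectral data as the high-frequency scale factor band (S63). For example, if the first scale factor band of the high-frequency part in the waveform diagram of FIG. 18 contains 48 spectral data, the second quantization unit 133 obtains, one after another, the differences between the high-frequency and low-frequency spectra for the 48 spectral data starting at the head of the low-frequency scale factor band numbered N = 1.
[0134]
When the differences between the high-frequency and low-frequency spectra have been obtained for as many spectra as the high-frequency scale factor band contains (S65), the second quantization unit 133 holds that value and then obtains the difference between the high-frequency spectrum and the low-frequency spectrum over a width of the same number of spectral data, starting at the head of the next low-frequency scale factor band (S64). For example, once the spectral difference has been obtained over the 48 spectral data starting at the head of the low-frequency scale factor band numbered N = 1, the obtained value is held, and the spectral difference is then obtained over the 48 spectral data starting at the head of the low-frequency scale factor band numbered N = 2. In the same way, for the low-frequency scale factor bands numbered N = 3, N = 4, ..., N = 28 (the last band of the low-frequency part), that is, for every scale factor band of the low-frequency part, the differences between the 48 pairs of high-frequency and low-frequency spectral data are summed in turn to obtain the spectral difference.
[0135]
When, for every scale factor band of the low-frequency part, the difference between the high-frequency and low-frequency spectra has been obtained over a width of the same number of spectral data as the high-frequency scale factor band, starting at the head of that low-frequency band (S64), the second quantization unit 133 identifies the number N of the scale factor band for which the obtained difference is smallest (S65). For example, suppose that the low-frequency scale factor band numbered N = 8 is identified in the spectrum waveform diagram shown in FIG. 18. This means that the hatched portion of the low-frequency spectrum has the smallest difference from the hatched portion of the high-frequency spectrum, that is, the energy difference between the two spectra is smallest. In other words, when the 48 spectral data starting at the head of the scale factor band numbered N = 8 are copied into the first scale factor band of the high-frequency part, which begins at 11.025 kHz, they form the waveform drawn with a dash-dot line in the high-frequency part of FIG. 18 and approximately represent, relative to the original spectrum, the energy in that high-frequency scale factor band.
[0136]
Having identified, for one high-frequency scale factor band, the number N of the low-frequency scale factor band with the smallest spectral difference, the second quantization unit 133 holds the identified number N and, in the same manner as above, identifies the corresponding number N for the next high-frequency scale factor band (S66). It repeats this process in turn for each high-frequency scale factor band and, once the number N of the low-frequency scale factor band with the smallest spectral difference has been identified for all high-frequency scale factor bands, outputs the held low-frequency scale factor band numbers N to the second encoding unit 134 as the auxiliary information (copy information) for the high-frequency part and ends the process.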
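The search of steps S61 to S66 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the offset-list representation of the scale factor bands, and the use of a sum of squared differences as the "energy difference" are all assumptions, as is the rule that a candidate window must lie entirely inside the low-frequency part.

```python
def find_copy_source_bands(spectrum, low_sfb_offsets, high_sfb_offsets):
    """For each high-band scale factor band, return the 1-based number N of the
    low-band scale factor band whose leading lines differ least from it.

    spectrum         : list of spectral values (low band followed by high band)
    low_sfb_offsets  : start indices of the low-band bands, plus the end index
    high_sfb_offsets : start indices of the high-band bands, plus the end index
    """
    low_end = low_sfb_offsets[-1]
    result = []
    for hb in range(len(high_sfb_offsets) - 1):
        h_start, h_end = high_sfb_offsets[hb], high_sfb_offsets[hb + 1]
        width = h_end - h_start
        target = spectrum[h_start:h_end]
        best_n, best_diff = None, None
        for n in range(len(low_sfb_offsets) - 1):
            l_start = low_sfb_offsets[n]
            if l_start + width > low_end:
                break  # assumption: the compared window stays inside the low band
            candidate = spectrum[l_start:l_start + width]
            # assumption: "energy difference" taken as a sum of squared differences
            diff = sum((h - l) ** 2 for h, l in zip(target, candidate))
            if best_diff is None or diff < best_diff:
                best_n, best_diff = n + 1, diff
        result.append(best_n)  # None if no window fits
    return result
```

With 48-line high bands as in the example above, each returned entry would correspond to one value of N held for output to the second encoding unit 134.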
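[0137]
In this case, the method by which the decoding device 200 copies the low-frequency spectrum and adjusts its amplitude is the same as for the auxiliary information (copy information) described with reference to FIGS. 16 and 17.
[0138]
In the flowchart of FIG. 19, the energy difference between the high-frequency part and the low-frequency part is calculated with the same sign and in the same direction on the frequency axis; however, the encoding device of the present invention is not limited to this and, as described with reference to FIGS. 16 and 17, the energy difference may be calculated by any of the following three methods. (1) The spectral data of the high-frequency and low-frequency parts are taken with the same sign, and, for high-frequency spectral data selected in order from the low-frequency side toward the high-frequency side, the same number of spectral data starting at the head of the low-frequency scale factor band are selected in order from the high-frequency side toward the low-frequency side (that is, in the reverse direction on the frequency axis), and the difference is calculated. (2) The sign of the low-frequency spectrum is inverted (multiplied by -1) and the calculation is performed in the same direction on the frequency axis. (3) The sign of the low-frequency spectrum is inverted (multiplied by -1) and the calculation is performed in the reverse direction on the frequency axis. Alternatively, the calculation may be performed by all four methods (the method of FIG. 19 and the three above), and the number N of the scale factor band of the low-frequency spectrum giving the smallest energy difference among them may be used as the auxiliary information. In that case, so that the low-frequency spectrum with the smallest energy difference is copied correctly into the high-frequency part, information indicating the sign relationship between the low-frequency and high-frequency spectra and information indicating the direction on the frequency axis in which the low-frequency spectrum is copied into the high-frequency part are included in the auxiliary information for each scale factor band. The sign-relationship information is expressed in 1 bit, for example as “1” when the difference was taken with the same sign and “0” when it was taken with the opposite sign. The direction information is likewise expressed in 1 bit, for example as “1” for a forward copy, that is, when the spectral data in the high-frequency and low-frequency parts were selected in the same direction, and “0” for a reverse copy, that is, when they were selected in opposite directions.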
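The four-way comparison described above (same or inverted sign, forward or reverse direction) can be sketched for a single high-frequency scale factor band as follows. The function name, the argument layout, and the squared-difference measure are assumptions made for illustration; the bit conventions follow the example values given in the text.

```python
def best_copy_info(high, low_spectrum, low_sfb_offsets):
    """Return (N, sign_bit, direction_bit) giving the smallest difference.

    sign_bit      : 1 = same sign, 0 = low-band sign inverted
    direction_bit : 1 = forward copy, 0 = reverse copy
    N is the 1-based low-band scale factor band number; None if nothing fits.
    """
    width = len(high)
    best = None
    for n in range(len(low_sfb_offsets) - 1):
        start = low_sfb_offsets[n]
        if start + width > len(low_spectrum):
            break  # assumption: the window must stay inside the low band
        window = low_spectrum[start:start + width]
        for direction_bit in (1, 0):
            candidate = window if direction_bit == 1 else window[::-1]
            for sign_bit in (1, 0):
                s = 1 if sign_bit == 1 else -1
                # assumption: energy difference as a sum of squared differences
                diff = sum((h - s * l) ** 2 for h, l in zip(high, candidate))
                if best is None or diff < best[0]:
                    best = (diff, n + 1, sign_bit, direction_bit)
    return None if best is None else best[1:]
```

Under these conventions the extra cost is two bits per high-frequency scale factor band on top of the band number N.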
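[0139]
FIG. 20 is a flowchart showing the procedure by which the second inverse quantization unit 224 shown in FIG. 1 copies the 512 low-frequency spectra to the high-frequency part in the forward direction. In FIG. 20, inv_spec1[i] denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 222, and inv_spec2[j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 224.
[0140]
First, in order to input the 0th through 511th spectra in the same direction, the second inverse quantization unit 224 sets the initial values of the counters i and j, which count the spectra, to “0” (S71). Next, the second inverse quantization unit 224 checks whether the value of counter i is less than “512” (S72) and, if so, inputs the value of the i-th (in this case, 0th) low-frequency spectrum of the first inverse quantization unit 222 as the value of the j-th (in this case, 0th) high-frequency spectrum of the second inverse quantization unit 224 (S73). The second inverse quantization unit 224 then increments counters i and j by “1” each (S74) and checks again whether the value of counter i is less than “512” (S72).
[0141]
The second inverse quantization unit 224 repeats the above process while the value of counter i is less than “512” and ends the process when the value of counter i reaches “512” or more. As a result, all of the 0th through 511th low-frequency spectra produced by the inverse quantization in the first inverse quantization unit 222 are copied as they are as the high-frequency spectra of the second inverse quantization unit 224.
[0142]
FIG. 21 is a flowchart showing the procedure by which the second inverse quantization unit 224 shown in FIG. 1 copies the 512 low-frequency spectra to the high-frequency part in the reverse direction along the frequency axis. As in FIG. 20, in FIG. 21 inv_spec1[i] denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 222, and inv_spec2[j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 224.
[0143]
First, in order to input the 0th through 511th spectra in the reverse direction, the second inverse quantization unit 224 sets the initial value of counter i to “0” and the initial value of counter j to “511” (S81). Next, the second inverse quantization unit 224 checks whether the value of counter i is less than “512” (S82) and, if so, inputs the value of the i-th (in this case, 0th) low-frequency spectrum of the first inverse quantization unit 222 as the value of the j-th (in this case, 511th) high-frequency spectrum of the second inverse quantization unit 224 (S83). The second inverse quantization unit 224 then increments counter i by “1” and decrements counter j by “1” (S84), and checks again whether the value of counter i is less than “512” (S82).
[0144]
The second inverse quantization unit 224 repeats the above process while the value of counter i is less than “512” and ends the process when the value of counter i reaches “512” or more. As a result, all of the 0th through 511th low-frequency spectra produced by the inverse quantization in the first inverse quantization unit 222 are copied in the reverse direction as the 511th through 0th high-frequency spectra of the second inverse quantization unit 224.
[0145]
Here the second inverse quantization unit 224 copies all of the spectral data in the low-frequency part to the high-frequency part, but it may copy only part of it. Also, FIGS. 20 and 21 were given as examples of procedures that copy the whole band at once, but part may be copied as in FIG. 20 and part as in FIG. 21. Furthermore, some or all of the data may be copied with its sign inverted.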
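The two copy loops of FIGS. 20 and 21, together with the optional sign inversion, can be written as one small routine. The array names inv_spec1 and inv_spec2 and the counters i and j follow the text; the function name and its two flags are assumptions introduced for illustration.

```python
def copy_low_to_high(inv_spec1, reverse=False, invert_sign=False):
    """Copy the low-band spectrum inv_spec1 into a new high-band array inv_spec2.

    reverse=False follows the forward loop (i and j both count up from 0);
    reverse=True follows the reverse loop (j counts down from the last index).
    invert_sign is a hypothetical flag for the sign-inverted copy.
    """
    n = len(inv_spec1)              # 512 in the embodiment
    inv_spec2 = [0.0] * n
    i = 0
    j = n - 1 if reverse else 0     # initial counter values
    while i < n:                    # loop test on counter i
        value = inv_spec1[i]
        inv_spec2[j] = -value if invert_sign else value
        i += 1
        j += -1 if reverse else 1   # counter update
    return inv_spec2
```

A partial copy, or a mix of forward and reverse copies, could be obtained by calling the routine on slices of the low-band spectrum.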
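[0146]
These copy procedures may be determined in advance, may be changed according to the low-frequency data, or may be transmitted as auxiliary information.
Here the low-frequency spectral data is copied as the high-frequency spectral data, but the invention is not limited to this, and the high-frequency spectral data may be generated from the second encoded information alone.
[0147]
In the present embodiment, of all the spectral data, the 512 samples on the low-frequency side are encoded as the first encoded signal and the remainder as the second encoded signal, but the allocation is not limited to this.
In the present embodiment, noise generation in the second inverse quantization unit 224 was described mainly for the case of copying the spectral data obtained from the first inverse quantization unit 222; however, it is not limited to this, and spectral data having a constant value within each high-frequency scale factor band, white noise, pink noise, and the like may be generated independently by the second inverse quantization unit 224 or generated according to the auxiliary information.
[0148]
In the present embodiment, one piece of auxiliary information is encoded for each scale factor band as the second encoded signal, but one piece of auxiliary information may be encoded for every two or more scale factor bands, or two or more pieces may be encoded for one scale factor band.
The auxiliary information in the present embodiment may be encoded for each channel, or one piece of auxiliary information may be encoded for two or more channels.
[0149]
In the present embodiment, the encoding device 100 has two quantization units and two encoding units, but it is not limited to this and may have three or more quantization units and encoding units.
In the present embodiment, the decoding device 200 has two decoding units and two inverse quantization units, but it is not limited to this and may have three or more decoding units and inverse quantization units.
[0150]
In the present embodiment, the case was described in which the conversion unit 120 classifies the converted spectral data into scale factor bands whose boundaries and number are defined independently; however, the encoding device of the present invention is not limited to this, and the conversion unit may classify the converted spectral data into scale factor bands conforming to the MPEG-2 AAC standard. Classifying the data into standard-conforming scale factor bands allows even the conventional decoding device 400 to decode the bitstream encoded by the encoding device 100 of the present invention without difficulty and to obtain the same digital acoustic output data as before.
[0151]
The above processing can be implemented in software as well as in hardware, or partly in hardware and the rest in software.
In the present embodiment, the sampling frequency was 44.1 kHz and one frame was 1024 samples of digital acoustic data, but the encoding device and decoding device of the present invention are not limited to this, and the sampling frequency may be any value.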
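[0152]
The encoding device of the present invention is an encoding device that encodes an input acoustic signal, comprising: first encoding means for encoding low-frequency data which, from spectral data divided into a plurality of groups obtained by transforming the input acoustic signal for a fixed period of time, is represented by four types of information, namely a normalization coefficient for normalizing the spectral data in each group, a quantized value obtained by quantizing each spectral datum in each group using the normalization coefficient, a positive or negative sign indicating the polarity of each spectral datum, and the position of each spectral datum on the frequency axis; auxiliary information generating means for generating auxiliary information that includes information identifying low-frequency spectral data that approximates the spectral data in each group of the high-frequency part and, as information for shaping the identified low-frequency spectral data, shaping information that represents a feature of the high-frequency spectral data using at least one and at most three of the four types of information; second encoding means for encoding the generated auxiliary information; and output means for outputting the data encoded by the first encoding means and the data encoded by the second encoding means. In this encoding device, the auxiliary information generating means generates auxiliary information that represents the features of the high-frequency part of the spectral data, divided into a plurality of groups and obtained by transforming the input acoustic signal for a fixed period of time, with less information than the low-frequency part, and the second encoding means encodes the generated auxiliary information.
[0153]
Therefore, according to the encoding device of the present invention, the high-frequency spectral data is not quantized and encoded as it is; instead, auxiliary information that represents the features of the high-frequency part with fewer parameters than the low-frequency part is encoded, so the high-frequency spectrum can be encoded with a far smaller amount of data than the low-frequency part. Moreover, in conventional MPEG-2 AAC the acoustic signal was encoded in the same way in the low-frequency and high-frequency parts across the whole band, which made it difficult to transmit the high-frequency part at low transfer rates; according to the encoding device of the present invention, high-frequency information can be transmitted without greatly increasing the amount of information after encoding, so a decoding device that decodes it can reproduce a high-quality acoustic signal with a richer high-frequency part than a conventional decoding device.
[0154]
In the encoding device of the present invention, the auxiliary information generating means may generate, as the shaping information, for the spectral data divided into a plurality of groups, a normalization coefficient calculated for each group of the high-frequency part so that the quantized value of the peak spectral datum in that group becomes a fixed value.
Alternatively, the auxiliary information generating means may quantize the peak spectral datum in each group of the high-frequency part using a normalization coefficient common to the groups, and generate the resulting quantized value as the shaping information.
[0155]
Therefore, according to the encoding device of the present invention, one normalization coefficient or one quantized value of the peak spectral datum is generated as auxiliary information for each group (scale factor band) of the high-frequency part, so even if a certain number of bits, for example 8 bits, is allotted to represent one normalization coefficient or quantized value, the amount of auxiliary information is small. The approximate maximum amplitude of the spectral data can thus be represented for each high-frequency group with a small amount of data. As a result, even over a low-rate transmission path, information for generating a high-frequency acoustic signal that retains the characteristics of the original sound can be transmitted with only a slight increase in transmitted data compared with the conventional case, so a decoding device that decodes it can restore an acoustic signal more faithful to the original sound.
[0156]
In the encoding device of the present invention, the auxiliary information generating means may generate, as the shaping information, for the spectral data divided into a plurality of groups, the frequency position of the peak spectral datum in each group belonging to the high-frequency part.
Also, the spectral data may be MDCT coefficients, and the auxiliary information generating means may generate, as the shaping information, a sign indicating the polarity of the spectral datum at a predetermined frequency position in the high-frequency part.
[0157]
Therefore, according to the encoding device of the present invention, the rough shape of the spectrum in each group (scale factor band) of the high-frequency part can be represented with a small amount of data by the frequency position of the peak spectral datum or by the sign of the spectral datum at a predetermined frequency position in the high-frequency part, so the copied spectral data can be shaped to approximate the high-frequency spectrum more accurately.
[0158]
In the encoding device of the present invention, the auxiliary information generating means may generate, as the information identifying the low-frequency spectral data, information that identifies, for each group of the high-frequency part, the low-frequency spectrum that most closely approximates the spectrum in that group.
[0159]
Therefore, according to the encoding device of the present invention, when the low-frequency part contains a spectrum closely resembling the high-frequency spectrum, it suffices to identify that low-frequency spectrum and copy it to the high-frequency part, so the high-frequency spectrum can be represented more faithfully with a very small amount of data.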
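[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of the encoding device and the decoding device according to the embodiment of the present invention.
[FIG. 2] A block diagram showing the configuration of an encoding device and a decoding device that are another configuration example of the present embodiment.
[FIG. 3] A diagram showing the changes in state of the acoustic signal processed in the encoding device shown in FIG. 1.
[FIG. 4] A diagram showing the positions in the bitstream at which the stream output unit shown in FIG. 1 stores the auxiliary information.
[FIG. 5] A diagram showing another example of how the stream output unit shown in FIG. 1 stores the auxiliary information.
[FIG. 6] A flowchart showing the operation of the first quantization unit shown in FIG. 1 in the scale factor determination process.
[FIG. 7] A flowchart showing the operation of the first quantization unit shown in FIG. 1 in another scale factor determination process.
[FIG. 8] A spectrum waveform diagram showing a specific example of the auxiliary information (scale factor) generated by the second quantization unit shown in FIG. 1.
[FIG. 9] A flowchart showing the operation of the second quantization unit shown in FIG. 1 in the auxiliary information (scale factor) calculation process.
[FIG. 10] A spectrum waveform diagram showing a specific example of the auxiliary information (quantized value) generated by the second quantization unit shown in FIG. 1.
[FIG. 11] A flowchart showing the operation of the second quantization unit shown in FIG. 1 in the auxiliary information (quantized value) calculation process.
[FIG. 12] A spectrum waveform diagram showing a specific example of the auxiliary information (position information) generated by the second quantization unit shown in FIG. 1.
[FIG. 13] A flowchart showing the operation of the second quantization unit shown in FIG. 1 in the auxiliary information (position information) calculation process.
[FIG. 14] A spectrum waveform diagram showing a specific example of the auxiliary information (sign information) generated by the second quantization unit shown in FIG. 1.
[FIG. 15] A flowchart showing the operation of the second quantization unit shown in FIG. 1 in the auxiliary information (sign information) calculation process.
[FIG. 16] A spectrum waveform diagram showing an example of how the auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1 is created.
[FIG. 17] A flowchart showing the operation of the second quantization unit shown in FIG. 1 in the auxiliary information (copy information) calculation process.
[FIG. 18] A spectrum waveform diagram showing a second example of how the auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1 is created.
[FIG. 19] A flowchart showing the operation of the second quantization unit shown in FIG. 1 in the second calculation process for the auxiliary information (copy information).
[FIG. 20] A flowchart showing the procedure by which the second inverse quantization unit shown in FIG. 1 copies the 512 low-frequency spectra to the high-frequency part in the forward direction.
[FIG. 21] A flowchart showing the procedure by which the second inverse quantization unit shown in FIG. 1 copies the 512 low-frequency spectra to the high-frequency part in the reverse direction along the frequency axis.
[FIG. 22] A block diagram showing the configuration of a conventional MPEG-2 AAC encoding device and decoding device.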
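[Explanation of symbols]
100 Encoding device
110 Acoustic signal input unit
120 Conversion unit
131 First quantization unit
132 First encoding unit
133 Second quantization unit
134 Second encoding unit
140 Stream output unit
200 Decoding device
210 Stream input unit
221 First decoding unit
222 First inverse quantization unit
223 Second decoding unit
224 Second inverse quantization unit
225 Inverse quantized data synthesis unit
230 Inverse transform unit
240 Acoustic signal output unit
152 Inverse quantization unit
[0001]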
BACKGROUND OF THE INVENTION
The present invention relates to a high-quality sound encoding and decoding technique for digital audio data.
[0002]
[Prior art]
Currently, various audio compression methods for compressing and encoding audio data have been developed. MPEG-2 Advanced Audio Coding (hereinafter abbreviated as AAC) is one of these methods. Details of AAC are described in the standard “ISO/IEC 13818-7 (MPEG-2 Advanced Audio Coding, AAC)”.
[0003]
First, a conventional encoding and decoding procedure will be described with reference to FIG. 22. FIG. 22 is a block diagram showing the configuration of a conventional MPEG-2 AAC encoding apparatus 300 and decoding apparatus 400. The encoding apparatus 300 is an apparatus that compresses and encodes an input audio signal based on the MPEG-2 AAC encoding method, and includes an audio signal input unit 310, a conversion unit 320, a quantization unit 331, an encoding unit 332, and a stream output unit 340.
[0004]
The acoustic signal input unit 310 cuts out digital acoustic data that is an input signal every continuous 1024 samples at a sampling frequency of 44.1 kHz, for example. This 1024-sample coding unit is referred to as a “frame”.
[0005]
The conversion unit 320 converts the sample data on the time axis cut out by the acoustic signal input unit 310 into spectrum data on the frequency axis by MDCT. Note that the spectrum data of 1024 samples converted at this time is classified into a plurality of groups. Each of the groups is set so that each of the plurality of groups includes spectral data of one sample or more. In addition, each group simulates a critical band in human hearing. Each group is called a “scale factor band”.
[0006]
The quantization unit 331 quantizes the spectrum data obtained from the conversion unit 320 with a predetermined number of bits. In MPEG-2 AAC, spectral data in a scale factor band is quantized using one normalization coefficient for each scale factor band. This normalization coefficient is called “scale factor”. The result of quantizing each spectrum data with each scale factor is referred to as a “quantized value”. The encoding unit 332 Huffman-encodes the data quantized by the quantization unit 331, that is, each scale factor and spectrum data quantized using the scale factor into a stream format. At this time, the encoding unit 332 obtains the difference between the scale factors of the scale factor bands adjacent to each other in one frame and encodes the difference and the scale factor of the first scale factor band.
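The differential coding of scale factors described above (the first band's scale factor plus the differences between adjacent bands) can be sketched as follows. This is only an illustration of the differencing step; the function names are assumptions, and the subsequent Huffman coding of the values is omitted.

```python
def diff_encode_scale_factors(scale_factors):
    """Return the first scale factor and the differences between adjacent bands.

    Assumes a non-empty list with one scale factor per scale factor band.
    """
    first = scale_factors[0]
    diffs = [b - a for a, b in zip(scale_factors, scale_factors[1:])]
    return first, diffs


def diff_decode_scale_factors(first, diffs):
    """Rebuild the scale factors by accumulating the differences."""
    out = [first]
    for d in diffs:
        out.append(out[-1] + d)
    return out
```

Because neighbouring bands tend to have similar scale factors, the differences are usually small numbers, which is what makes them cheap to entropy-code.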
[0007]
The stream output unit 340 converts the encoded signal obtained from the encoding unit 332 into an MPEG-2 AAC bitstream and outputs it. The bit stream output from the encoding device 300 is transmitted to the decoding device 400 via a transmission medium, or recorded on a recording medium such as an optical disk such as a CD or a DVD, a semiconductor, or a hard disk.
[0008]
The decoding device 400 is a device that decodes the bitstream encoded by the encoding device 300, and includes a stream input unit 410, a decoding unit 421, an inverse quantization unit 422, an inverse transform unit 430, and an acoustic signal output unit 440.
[0009]
The stream input unit 410 inputs the bit stream encoded by the encoding device 300 via a transmission medium or playback from a recording medium, and extracts an encoded signal from the input bit stream. The decoding unit 421 decodes the extracted encoded signal from the stream format into quantized data.
[0010]
The inverse quantization unit 422 performs inverse quantization on the quantized data decoded by the decoding unit 421. In MPEG-2 AAC, Huffman encoded data is decoded. The inverse transform unit 430 converts the spectrum data on the frequency axis obtained by the inverse quantization unit 422 into sample data on the time axis. In MPEG-2 AAC, conversion is performed using IMDCT (Inverse Modified Discrete Cosine Transform). The acoustic signal output unit 440 sequentially combines the sample data on the time axis obtained by the inverse transform unit 430 and outputs the combined data as digital acoustic data.
[0011]
[Problems to be solved by the invention]
In the above method, the reproduction band after encoding is one measure of how much of the sound quality of the input acoustic data is retained when the data is encoded. For example, when the sampling frequency of the input signal is 44.1 kHz, the reproduction band is 22.05 kHz. If the data corresponding to this 22.05 kHz band can be encoded efficiently without degradation, and all of the encoded data can be transferred to the decoding device within the limits of the transfer rate, the decoding device can obtain a high-quality acoustic signal. That is, high-quality encoding can be achieved on the encoding device side. However, the width of the reproduction band determines the number of spectrum data, and the number of spectrum data determines the amount of information. For example, when the sampling frequency of the input signal is 44.1 kHz, the spectrum data of 1024 samples corresponds to 22.05 kHz, so securing the 22.05 kHz reproduction band requires transmitting all 1024 samples of spectrum data.
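The relation between the number of transmitted spectrum data and the reproduction band can be checked with a short calculation. The sketch below assumes that the 1024 spectral lines of a frame are spaced uniformly between 0 Hz and half the sampling frequency; the function name is hypothetical.

```python
def reproduction_band_hz(sampling_rate_hz, lines_sent, lines_per_frame=1024):
    """Upper edge of the band covered when only the lowest lines are sent."""
    nyquist = sampling_rate_hz / 2          # 22.05 kHz for 44.1 kHz input
    return nyquist * lines_sent / lines_per_frame
```

Under this assumption, sending all 1024 lines at 44.1 kHz covers 22.05 kHz, and sending only the lowest 512 lines covers 11.025 kHz, with each line spanning roughly 21.5 Hz.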
[0012]
However, considering a low transfer rate transmission line such as a cellular phone, it is not practical to actually transmit all 1024 samples of spectrum data because the amount of data is too large. In other words, when trying to transfer the entire spectrum data of this reproduction band with the data amount according to the transfer rate, the amount of information that can be allocated to each frequency band becomes small, and as a result, the influence of quantization noise increases, This results in sound quality degradation due to encoding.
[0013]
For this reason, not only MPEG-2 AAC but also many other audio signal encoding schemes achieve efficient audio signal transmission by applying auditory weighting to the spectral data and not transmitting low-priority data. Under this approach, a sufficient amount of data is allocated to the encoded information of the low-frequency part, which has a high auditory priority, in order to improve its encoding accuracy, so the high-frequency part has a high probability of being excluded from transmission.
[0014]
However, in spite of such a contrivance in the MPEG-2 AAC system, further improvement in quality and improvement in compression efficiency are required for encoding of an audio signal. That is, there is a growing demand for transmitting high-frequency acoustic signals even at low transfer rates.
[0015]
An object of the present invention is to provide an encoding device and a decoding device capable of realizing high-quality sound signal encoding and decoding without significantly increasing the amount of information after encoding.
[0016]
[Means for Solving the Problems]
In order to achieve the above object, the encoding device of the present invention is an encoding device that encodes an input acoustic signal, comprising: first encoding means for encoding low-frequency data which, from spectral data divided into a plurality of groups obtained by converting the input acoustic signal for a certain period of time, is represented by four types of information, namely a normalization coefficient for normalizing the spectral data in each group, a quantized value obtained by quantizing each spectral datum in each group using the normalization coefficient, a positive or negative sign representing the polarity of each spectral datum, and the position of each spectral datum on the frequency axis; auxiliary information generating means for generating auxiliary information that includes information identifying low-frequency spectral data that approximates the spectral data in each group of the high-frequency part and, as information for shaping the identified low-frequency spectral data, shaping information that represents a feature of the high-frequency spectral data using at least one and at most three of the four types of information; second encoding means for encoding the generated auxiliary information; and output means for outputting the data encoded by the first encoding means and the data encoded by the second encoding means. In this encoding device, the auxiliary information generating means generates auxiliary information that represents the features of the high-frequency part of the spectral data, divided into a plurality of groups and obtained by converting the input acoustic signal for a certain period of time, with less information than the low-frequency part, and the second encoding means encodes the generated auxiliary information.
[0017]
In order to achieve the above object, the decoding device of the present invention is a decoding device that receives and decodes encoded data comprising first encoded data and second encoded data. The first encoded data is obtained by encoding low-frequency data which, from spectral data divided into a plurality of groups obtained by converting an input acoustic signal for a certain period of time, is represented by four types of information: a normalization coefficient for normalizing the spectral data in each group, a quantized value obtained by quantizing each spectral datum in each group using the normalization coefficient, a positive or negative sign representing the polarity of each spectral datum, and the position of each spectral datum on the frequency axis. The second encoded data is obtained by encoding auxiliary information that includes information identifying low-frequency spectral data that approximates the spectral data in each group of the high-frequency part and, as information for shaping the identified low-frequency spectral data, shaping information that represents a feature of the high-frequency spectral data using at least one and at most three of the four types of information. The decoding device comprises: encoded data separating means for separating the second encoded data from the input encoded data; first decoding means for decoding the first encoded data in the input encoded data and outputting spectral data representing the low-frequency part; second decoding means for decoding the separated second encoded data, copying, on the basis of the information identifying the low-frequency spectral data in the auxiliary information, the identified low-frequency spectral data among the spectral data output by the first decoding means into each group of the high-frequency part, and shaping the copied spectral data on the basis of the shaping information in the auxiliary information, thereby generating and outputting spectral data representing the high-frequency part; and acoustic signal output means for combining the spectral data output by the first decoding means with the spectral data output by the second decoding means, inversely transforming the result, and outputting it as an acoustic signal on the time axis. In this decoding device, the encoded data separating means separates the second encoded data from the input encoded data, and the second decoding means decodes the separated second encoded data to obtain the auxiliary information, which includes the information identifying the low-frequency spectral data and the shaping information, and generates and outputs spectral data representing the high-frequency part based on that auxiliary information.
[0018]
Note that the present invention can be realized as a broadcasting system including a transmission device including the encoding device of the present invention and a reception device including the decoding device of the present invention, or a characteristic configuration of the encoding device and the decoding device. It can be realized as an encoding method and a decoding method using elements as processing steps, or as a program for causing a computer to execute these steps. Needless to say, the program can be distributed via a computer-readable recording medium such as a CD-ROM or a transmission medium such as a communication path.
[0019]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an encoding device 100 and a decoding device 200 according to an embodiment of the present invention will be described in detail with reference to the drawings. In the embodiment of the present invention, MPEG-2 AAC will be described as an example of the conventional system. FIG. 1 is a block diagram showing a configuration of encoding apparatus 100 and decoding apparatus 200 in the embodiment of the present invention.
<Encoding device 100>
[0020]
The encoding device 100 is a device that compresses and encodes the low-frequency part of the input acoustic signal based on the MPEG-2 AAC encoding method, generates auxiliary information representing the characteristics of the high-frequency part of the acoustic signal, compresses and encodes that auxiliary information, and outputs it together with the low-frequency encoded bitstream. It includes an acoustic signal input unit 110, a conversion unit 120, a first quantization unit 131, a first encoding unit 132, a second quantization unit 133, a second encoding unit 134, and a stream output unit 140.
[0021]
The acoustic signal input unit 110 cuts out digital acoustic data, an input signal sampled at a sampling frequency of 44.1 kHz as in MPEG-2 AAC, in a cycle of about 22.7 msec (every 1024 samples), overlapping each block with the 512 samples before and after it.
[0022]
The conversion unit 120 converts the sample data on the time axis cut out by the acoustic signal input unit 110 into spectrum data on the frequency axis as in the conventional case. In MPEG-2 AAC, 2048 samples of time-axis data, formed by overlapping each block of 1024 samples with the 512 samples before and after it, are converted into spectral data of 2048 samples by using the MDCT (Modified Discrete Cosine Transform). Since the MDCT produces symmetrical spectral data, only one half, that is, 1024 samples, needs to be encoded.
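The block cutting described above, in which each group of 1024 new samples is extended by 512 samples on either side to give a 2048-sample transform block, can be sketched as follows. The function name and the zero padding at the start and end of the signal are assumptions; the MDCT itself is not shown.

```python
def cut_frames(samples, frame_len=1024, overlap=512):
    """Yield blocks of frame_len + 2 * overlap samples, advancing frame_len each time.

    Each block holds one frame plus `overlap` samples before and after it.
    Assumption: the signal is zero-padded at both ends and up to a whole frame.
    """
    n_frames = (len(samples) + frame_len - 1) // frame_len
    tail = n_frames * frame_len - len(samples) + overlap
    padded = [0] * overlap + list(samples) + [0] * tail
    for k in range(n_frames):
        start = k * frame_len
        yield padded[start:start + frame_len + 2 * overlap]
```

With the default arguments every yielded block has 2048 samples and consecutive blocks share 1024 of them.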
[0023]
Furthermore, the conversion unit 120 classifies the converted spectrum data of 1024 samples into a plurality of scale factor bands, each including spectrum data of one sample or more (in practice, a multiple of 4). In MPEG-2 AAC, the number of samples (spectral data) included in each scale factor band is determined according to the frequency: the low-frequency part is divided finely into bands of a small number of samples, and the high-frequency part is divided coarsely into bands of a large number of samples. In MPEG-2 AAC, the number of scale factor bands corresponding to one frame of spectrum data is also determined according to the sampling frequency. For example, when the sampling frequency is 44.1 kHz, the number of scale factor bands included in one frame is 49, and the spectrum data of 1024 samples is contained in these 49 scale factor bands. On the other hand, which of the scale factor bands determined in this way are transmitted is not specified, and the most suitable scale factor bands may be selected according to the transfer rate of the transmission line. For example, when the transfer rate of the transmission path is 96 kbps, only the 40 scale factor bands (640 samples) on the low-frequency side of one frame may be selected and transmitted.
[0024]
In the present embodiment, a case will be described in which the conversion unit 120 classifies the converted spectral data into scale factor bands whose boundaries and number are defined independently.
[0025]
The first quantization unit 131 receives the spectral data output from the conversion unit 120, determines a scale factor for each scale factor band in the low-frequency part of the input spectral data, quantizes the spectra in each scale factor band using the determined scale factor, and outputs the resulting quantized values to the first encoding unit 132. For example, since the sampling frequency of the input signal here is 44.1 kHz, the reproduction band is 22.05 kHz; for the low-frequency part of this, for example the band at or below 11.025 kHz, the first quantization unit 131 calculates the scale factor for each scale factor band from the spectral data so that, for example, the resulting quantized values can be represented in 4 bits or fewer, normalizes each spectrum in the scale factor band using that scale factor, and then quantizes it.
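One way to picture the scale factor determination is a search for the smallest scale factor at which every quantized value in the band fits the bit limit. The sketch below is an illustration under stated assumptions, not the procedure of the embodiment: the quantizer is a simplified AAC-style power-law quantizer with an assumed step of 2^(scale_factor/4), "4 bits or fewer" is taken to mean a magnitude of at most 15, and all names are hypothetical.

```python
def quantize(x, scale_factor):
    """Simplified AAC-style quantizer (assumption): power-law with rounding offset."""
    step = 2.0 ** (scale_factor / 4.0)
    q = int((abs(x) / step) ** 0.75 + 0.4054)
    return q if x >= 0 else -q


def choose_scale_factor(band, max_bits=4, sf_max=255):
    """Smallest scale factor whose quantized magnitudes all fit in max_bits bits."""
    limit = (1 << max_bits) - 1           # 15 for 4 bits (sign handled separately)
    for sf in range(sf_max + 1):
        if max(abs(quantize(x, sf)) for x in band) <= limit:
            return sf
    return sf_max                          # fall back to the coarsest step
```

A larger scale factor means a coarser step, so the search trades precision for a bounded code size in each band.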
[0026]
The first encoding unit 132 Huffman-encodes, as a first encoded signal, the data quantized by the first quantization unit 131, that is, the quantized values in each scale factor band corresponding to the 512 samples on the low-frequency side of all the spectrum data, the scale factors used for the quantization, and so on, and converts them into a predetermined stream format.
[0027]
The second quantization unit 133 receives the spectrum data output from the conversion unit 120, and calculates and outputs only auxiliary information for the band that is not quantized by the first quantization unit 131, that is, the high-frequency part above 11.025 kHz.
[0028]
Auxiliary information refers to simplified information that is calculated from the spectral data in the high-frequency part and represents the high-frequency acoustic signal, which is not transmitted in the conventional method. In other words, it is information that represents the characteristics of the high-frequency part of the spectrum data obtained by converting the input acoustic signal for a certain period of time. Specifically, it is, for example: the scale factor for each scale factor band of the high-frequency part, chosen so that the quantized value of the absolute-maximum spectrum data (the spectrum data with the largest absolute value) in that band becomes 1, or the quantized value of that scale factor; the quantized value of the absolute-maximum spectrum data in each scale factor band when a scale factor common to the scale factor bands of the high-frequency part is set; the position of the absolute-maximum spectrum in each scale factor band; the sign of the spectrum at a predetermined position in the high-frequency part; or information indicating the copy method used when a low-frequency spectrum similar to the high-frequency spectrum is copied to represent the high-frequency spectrum. Furthermore, noise information indicating the amplitude of white noise or the like that is mixed into not only the high-frequency part but the range from the low-frequency part to the high-frequency part may be added to the auxiliary information described above.
[0029]
The second encoding unit 134 Huffman-encodes the auxiliary information output from the second quantization unit 133 into a predetermined stream format, and outputs it as second encoded information.
[0030]
The stream output unit 140 adds header information and other sub information as necessary to the first encoded signal output from the first encoding unit 132 and converts the first encoded signal into an MPEG-2 AAC encoded bit stream. The second encoded signal output from the second encoding unit 134 is stored in an area in the bitstream that is ignored by the conventional decoding device or whose operation is not defined.
[0031]
Specifically, the stream output unit 140 stores the encoded signal output from the second encoding unit 134 in a Fill Element or a Data Stream Element in an MPEG-2 AAC encoded bit stream. The bit stream output from the encoding apparatus 100 is transmitted to the decoding apparatus 200 via a transmission medium, or recorded on a recording medium such as an optical disk such as a CD or a DVD, a semiconductor, or a hard disk.
[0032]
In MPEG-2 AAC, the conversion length of MDCT can be changed according to the input acoustic signal. A conversion length of 2048 samples is called a LONG block, a conversion length of 256 samples is called a SHORT block, and these are collectively called a block size. This description will be given for the LONG block unless otherwise noted, but the same processing can be performed for the SHORT block.
[0033]
In actual MPEG-2 AAC encoding processing, tools such as Gain Control, TNS (Temporal Noise Shaping), a psychoacoustic model, M/S Stereo, Intensity Stereo, Prediction, block size switching, and a bit reservoir may also be used.
<Decoding device 200>
[0034]
The decoding device 200 is a device that restores, from an input encoded bit stream, wideband acoustic data to which a high-frequency part has been added based on the auxiliary information, and includes a stream input unit 210, a first decoding unit 221, a first inverse quantization unit 222, a second decoding unit 223, a second inverse quantization unit 224, an inverse quantized data synthesis unit 225, an inverse transform unit 230, and an acoustic signal output unit 240.
[0035]
The stream input unit 210 receives the bit stream generated by the encoding apparatus 100, via a transmission medium or by reproduction from a recording medium, extracts from it the first encoded signal, which is stored in the area that a conventional decoding apparatus decodes, and the second encoded signal, which is stored in an area that a conventional decoding apparatus ignores or for which its operation is not defined, and outputs them to the first decoding unit 221 and the second decoding unit 223, respectively.
[0036]
The first decoding unit 221 receives the first encoded signal output from the stream input unit 210, and decodes the Huffman encoded data from the stream format into quantized data. The first inverse quantization unit 222 inversely quantizes the quantized data decoded by the first decoding unit 221 and outputs spectrum data of the low-frequency part. Here, the spectrum data output from the first inverse quantization unit 222 consists of 512 samples (out of a maximum of 1024 samples) and represents a reproduction band of 11.025 kHz (out of a maximum reproduction band of 22.05 kHz).
[0037]
The second decoding unit 223 receives the second encoded signal output from the stream input unit 210, decodes it, and outputs the auxiliary information. The second inverse quantization unit 224 generates noise, for example by copying part or all of the low-frequency spectrum data in a predetermined procedure based on the spectrum data output from the first inverse quantization unit 222, or by generating white noise, pink noise, or the like. It then shapes this noise based on the auxiliary information output from the second decoding unit 223 and outputs high-frequency spectrum data.
[0038]
Specifically, the second inverse quantization unit 224 copies the low-frequency spectrum data output from the first inverse quantization unit 222 to the high-frequency part. For each scale factor band of the high-frequency part, it takes as a coefficient the ratio between the absolute maximum value of the spectrum data copied into that band and the value obtained by dequantizing the quantized value “1” using the scale factor for that band given in the auxiliary information, and multiplies each spectrum datum in the band by this coefficient, thereby restoring the spectrum of the high-frequency part. Further, the second inverse quantization unit 224 generates white noise having a predetermined amplitude in advance, adjusts its amplitude according to the noise information in the auxiliary information, adds the white noise to the restored spectrum, and outputs the spectral data of the high-frequency part.
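The amplitude adjustment of a copied band can be sketched as follows. This is an interpretation under stated assumptions: the dequantizer is a simplified AAC-style power law with an assumed step of 2^(scale_factor/4), the coefficient is taken as the dequantized value of “1” divided by the peak of the copied data (so that the peak of the shaped band equals that target), and the function names are hypothetical. The white-noise addition is omitted.

```python
def dequantize(q, scale_factor):
    """Simplified AAC-style dequantizer (assumption): |q|^(4/3) times the step."""
    step = 2.0 ** (scale_factor / 4.0)
    magnitude = abs(q) ** (4.0 / 3.0) * step
    return magnitude if q >= 0 else -magnitude


def shape_copied_band(copied, scale_factor):
    """Scale copied low-band data so its peak matches the dequantized value of 1."""
    peak = max(abs(x) for x in copied)
    if peak == 0.0:
        return list(copied)               # nothing to scale in a silent band
    gain = dequantize(1, scale_factor) / peak
    return [x * gain for x in copied]
```

The shape of the copied spectrum is kept and only its level is set by the one scale factor carried in the auxiliary information for that band.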
[0039]
The inverse quantized data synthesizer 225 synthesizes the spectrum data output from the first inverse quantizer 222 and the spectrum data output from the second inverse quantizer 224. The inverse conversion unit 230 converts the spectrum data on the frequency axis output from the inverse quantized data synthesis unit 225 into sample data of 1024 samples on the time axis using IMDCT according to MPEG-2 AAC. The acoustic signal output unit 240 sequentially combines the sample data on the time axis obtained by the inverse conversion unit 230 and outputs the combined data as digital acoustic data.
[0040]
As described above, according to the present embodiment, the low-frequency part is encoded in the conventional manner and the high-frequency part is encoded with an extremely small amount of information, so that a high-quality acoustic signal can be encoded without significantly increasing the total amount of information compared with the conventional method.
[0041]
The encoding apparatus 100 according to the present embodiment merely adds the second quantization unit 133 and the second encoding unit 134 to the conventional encoding apparatus 300, and the decoding apparatus 200 merely adds the second decoding unit 223 and the second inverse quantization unit 224 to the conventional decoding apparatus 400. The embodiment therefore has the effect that it can be realized without significantly changing the configuration of the existing encoding apparatus 300 and decoding apparatus 400.
[0042]
In addition, there is an effect that the bit stream generated by the encoding apparatus 100 according to the present embodiment can also be decoded by the conventional decoding apparatus 400.
In the present embodiment, MPEG-2 AAC has been described as an example, but it is apparent that the present invention can also be applied to other acoustic coding schemes, including new acoustic coding schemes that do not yet exist.
[0043]
In the present embodiment, the only input to the second quantization unit 133 is the spectrum data output from the conversion unit 120. However, the present invention is not limited to this, and a value obtained by dequantizing the output of the first quantization unit 131 may be input separately.
FIG. 2 is a block diagram showing the configurations of encoding apparatus 101 and decoding apparatus 200, which are another configuration example of the present embodiment. Components identical to those in FIG. 1 have already been described, so they are given the same reference numerals as in FIG. 1 and their description is omitted.
[0044]
The encoding apparatus 101 differs from the encoding apparatus 100 in that an inverse quantization unit 152 is newly provided. In this encoding apparatus 101, the first quantization unit 151 quantizes all the 1024-point spectrum data output by the conversion unit 120, outputs the quantization result to the inverse quantization unit 152, and outputs the quantization result for the 512 low-frequency points to the first encoding unit 132.
[0045]
The inverse quantization unit 152 inversely quantizes the quantized value once quantized by the first quantization unit 151, and outputs the spectrum data as the inverse quantization result to the second quantization unit 153.
The second quantization unit 153 receives not the spectrum data from the conversion unit 120 but the spectrum data that is the inverse quantization result of the inverse quantization unit 152, and generates the auxiliary information based on that input spectrum data.
[0046]
Here, the second quantization unit 153 does not receive the spectrum data from the conversion unit 120 and generates the auxiliary information for the high-frequency part based on the spectrum data from the inverse quantization unit 152. However, the present invention is not limited to this example, and the second quantization unit 153 may receive spectrum data from the conversion unit 120 for one part and from the inverse quantization unit 152 for another part.
[0047]
FIG. 3 is a diagram showing the change in state of an acoustic signal processed in the encoding apparatus 100 shown in FIG. 1. FIG. 3A is a waveform diagram showing 1024 samples of sample data on the time axis cut out by the acoustic signal input unit 110 shown in FIG. 1. FIG. 3B is a waveform diagram showing spectrum data on the frequency axis after the sample data on the time axis is converted by the MDCT of the conversion unit 120 shown in FIG. 1. In FIGS. 3A and 3B, the sample data and spectrum data are shown as analog waveforms, but both are actually digital signals. The same applies to the following waveform diagrams.
[0048]
The acoustic signal input unit 110 receives a digital acoustic signal sampled at 44.1 kHz. For every 1024 samples, it cuts out those samples from the input signal together with an overlap of 512 samples before and after them, and outputs the result to the conversion unit 120. The conversion unit 120 performs MDCT on the total of 2048 samples, but because the spectrum obtained by MDCT is symmetrical, it generates only half of it, namely the 1024-point spectrum data shown in FIG. 3B.
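A minimal sketch of this cut-out and transform is given below, assuming zero padding at the ends of the signal and the textbook MDCT definition. No window function is applied because this passage does not mention one, and the function names `cut_blocks` and `mdct` are hypothetical.

```python
import math

def cut_blocks(signal, frame=1024):
    """Cut blocks of 2*frame samples: each frame plus frame/2 samples on either side."""
    half = frame // 2
    # zero padding at both ends is an assumption made for this sketch
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [padded[start:start + 2 * frame]
            for start in range(0, len(signal) - frame + 1, frame)]

def mdct(block):
    """Textbook MDCT: 2N input samples give N spectrum values (direct O(N^2) form)."""
    n_out = len(block) // 2
    return [sum(x * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
                for n, x in enumerate(block))
            for k in range(n_out)]
```

With `frame=1024`, each block holds 2048 samples and `mdct` returns 1024 spectrum values, some of which may be negative. The direct form is slow in pure Python at that size, so a small `frame` is more practical for experiments.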
[0049]
In the spectrum data shown in FIG. 3(b), the vertical axis indicates the frequency spectrum value, that is, the amount (magnitude) of each frequency component of the acoustic signal that FIG. 3(a) represents as the voltage values of 1024 samples. The spectrum data consists of 1024 points, corresponding to the number of samples. Further, since the sampling frequency of the digital acoustic signal input to the encoding apparatus 100 is 44.1 kHz, the reproduction band of the spectrum data is 22.05 kHz. Furthermore, since the spectrum obtained by MDCT may take negative values as shown in FIG. 3B, the positive or negative sign of the spectrum must also be encoded when the spectrum is encoded. Hereinafter, to avoid confusion with the word “code” as used in encoding, information representing the positive and negative signs of the spectrum data is referred to as “sign information”.
[0050]
FIG. 4 is a diagram illustrating the positions in the bit stream at which auxiliary information is stored by the stream output unit 140 illustrated in FIG. 1. In the figure, auxiliary information representing the spectrum of the high-frequency part is encoded and then stored as the second encoded signal in a region of the bit stream that is not recognized as an acoustic encoded signal.
[0051]
In FIG. 4A, the hatched portion is, for example, a region (Fill Element) that is filled with “0” in order to adjust the data length of the bit stream. Even if auxiliary information, that is, the second encoded signal, is stored in this region, the conventional decoding apparatus 400 does not recognize it as an encoded signal to be decoded and ignores it.
[0052]
Also, the hatched portion in FIG. 4B is, for example, an area called a Data Stream Element (DSE). For this area, the MPEG-2 AAC standard defines only the physical structure, such as the bit length, to allow for future extension. As with the Fill Element, even if auxiliary information representing the spectrum of the high-frequency part is stored here, the conventional decoding apparatus 400 either ignores it or, if it reads the data, has no defined operation for the data it has read.
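The backward compatibility described for the Fill Element and the DSE can be illustrated with a toy container. This is a hedged sketch only: the `(tag, payload)` tuples and the tag strings `"AUDIO"`, `"FILL"`, and `"DSE"` are hypothetical stand-ins and do not reproduce the MPEG-2 AAC bit stream syntax.

```python
def legacy_decode(elements):
    """Conventional decoder: decodes audio elements and skips every other element."""
    return [payload for tag, payload in elements if tag == "AUDIO"]

def extended_decode(elements):
    """Extended decoder: also collects auxiliary data carried in FILL or DSE elements."""
    audio = [payload for tag, payload in elements if tag == "AUDIO"]
    aux = [payload for tag, payload in elements if tag in ("FILL", "DSE")]
    return audio, aux

# One toy stream: the first encoded signal plus auxiliary data in ignorable elements
stream = [("AUDIO", b"\x01"), ("FILL", b"aux0"), ("AUDIO", b"\x02"), ("DSE", b"aux1")]
```

Both decoders recover the same audio payloads from `stream`; only the extended one also sees the auxiliary payloads.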
[0053]
In the above description, the second encoded signal is stored in an area of the bit stream that is ignored by the conventional decoding apparatus 400 according to the MPEG-2 AAC standard. However, the second encoded signal may instead be incorporated at a predetermined position in the header or in the first encoded signal, or may be incorporated across both. In addition, it is not necessary to secure a continuous area in the header and the first encoded signal in order to store the second encoded signal in the bit stream. That is, as shown in FIG. 4C, the second encoded signal may be incorporated discontinuously in the header information and the first encoded information.
[0054]
FIG. 5 is a diagram illustrating another example when the stream output unit 140 illustrated in FIG. 1 stores auxiliary information. FIG. 5A shows a stream 1 in which only the first encoded signal is stored continuously for each frame. FIG. 5B shows the stream 2 in which only the second encoded signal in which the auxiliary information is encoded is continuously stored for each frame corresponding to the stream 1.
[0055]
The stream output unit 140 may store the second encoded signal in a stream 2 that is completely different from the stream 1 that is a bit stream in which the first encoded signal is stored. For example, stream 1 and stream 2 are bit streams transmitted on different channels.
[0056]
In this way, by transmitting the first encoded signal and the second encoded signal in completely different bit streams, there is an effect that the low-frequency part, which represents the basic information of the input acoustic signal, can be transmitted or accumulated in advance, and the high-frequency information can be added later if necessary.
[0057]
The operations of the encoding apparatus 100 and the decoding apparatus 200 configured as described above will be described below with reference to the flowcharts of FIGS. 6, 7, 9, 11, 13, 15, 17, and 19 to 21.
[0058]
FIG. 6 is a flowchart showing the operation of the first quantization unit 131 shown in FIG. 1 in the scale factor determination process. The first quantization unit 131 first sets a scale factor common to the scale factor bands as the initial value of the scale factor (S91). Using that scale factor, it quantizes all of the low-frequency spectrum data to be transmitted as acoustic data for one frame, obtains the differences between adjacent scale factors, and Huffman-encodes the differences, the head scale factor, and each quantized value (S92). Because this quantization and encoding are performed only to count the number of bits, they are applied to the data alone, and information such as a header is not added, in order to simplify processing. Next, the first quantization unit 131 determines whether the number of bits of the Huffman-encoded data exceeds a predetermined number of bits (S93). If it does, the unit lowers the initial value of the scale factor (S101), quantizes and Huffman-encodes the same low-frequency spectrum data again using that scale factor value (S92), and again determines whether the number of bits of the low-frequency encoded data for one frame exceeds the predetermined number of bits (S93). This process is repeated until the number of bits is equal to or less than the predetermined number of bits.
[0059]
If the number of bits of the low-frequency encoded data does not exceed the predetermined number of bits, the first quantization unit 131 repeats the following processing for each scale factor band to determine the scale factor of each band (S94). First, it inversely quantizes each quantized value in the scale factor band (S95), and obtains and sums the absolute differences between each inversely quantized value and the corresponding original spectrum data (S96). It then determines whether the sum of the differences is within the allowable range (S97); if so, it repeats the above processing for the next scale factor band (S94 to S98). If the sum exceeds the allowable range, it increases the scale factor value and quantizes the spectrum data of the scale factor band again (S100), inversely quantizes the quantized values (S95), and sums the absolute differences between the inversely quantized values and the corresponding original spectrum data (S96). It again determines whether the sum of the differences is within the allowable range (S97) and, while the sum exceeds that range, keeps increasing the scale factor (S100) and repeating the above processing (S95 to S97, S100) until the sum falls within the range.
[0060]
When the first quantization unit 131 has determined, for all scale factor bands, scale factors for which the sum of the absolute differences between the dequantized values in the band and the original spectrum data falls within the allowable range (S98), it quantizes the low-frequency spectrum data for one frame again using the determined scale factors, Huffman-encodes the differences between scale factors, the head scale factor, and each quantized value, and determines whether the number of bits of the low-frequency encoded data exceeds the predetermined number of bits (S99). If it does, the unit lowers the initial value of the scale factor until the number of bits is equal to or less than the predetermined number of bits (S101), and then repeats the processing for determining the scale factor within each scale factor band (S94 to S98). If the number of bits of the low-frequency encoded data does not exceed the predetermined number of bits (S99), the value of each scale factor at that time is adopted as the scale factor of the corresponding scale factor band.
[0061]
Whether or not the sum of the absolute differences between the values obtained by dequantizing the quantized values in the scale factor band and the original spectrum data is within the allowable range is determined based on data such as a psychoacoustic model.
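The loop of FIG. 6 can be sketched as below. Everything in it is an assumption made for illustration: a toy uniform quantizer in which a larger scale factor gives finer steps (the direction the steps above imply), a bit count of "sign bit plus magnitude bits" in place of Huffman coding, a single fixed `tolerance` in place of a psychoacoustic model, and the hypothetical guards `sf_min` and `max_boost` that keep the sketch from looping forever.

```python
def quantize(x, sf):
    """Toy quantizer (assumption): larger sf means finer steps."""
    return int(round(x * 2.0 ** (sf / 4.0)))

def dequantize(q, sf):
    return q * 2.0 ** (-sf / 4.0)

def count_bits(bands, sfs):
    """Stand-in for Huffman coding: one sign bit plus the magnitude bits of each value."""
    return sum(1 + abs(quantize(x, sf)).bit_length()
               for band, sf in zip(bands, sfs) for x in band)

def band_error(band, sf):
    """S95-S96: sum of absolute differences after a quantize/dequantize round trip."""
    return sum(abs(x - dequantize(quantize(x, sf), sf)) for x in band)

def determine_scale_factors(bands, max_bits, tolerance,
                            sf_init=40, sf_min=0, max_boost=16):
    while sf_init >= sf_min:
        sfs = [sf_init] * len(bands)                 # S91: common initial value
        if count_bits(bands, sfs) > max_bits:        # S92-S93
            sf_init -= 1                             # S101
            continue
        for b, band in enumerate(bands):             # S94-S98
            while (band_error(band, sfs[b]) > tolerance   # S97
                   and sfs[b] < sf_init + max_boost):
                sfs[b] += 1                          # S100
        if count_bits(bands, sfs) <= max_bits:       # S99
            return sfs
        sf_init -= 1                                 # S101
    raise ValueError("bit budget cannot be met with these settings")
```

A call such as `determine_scale_factors([[10.0, -3.2, 7.7], [0.5, 0.25, -0.1]], max_bits=30, tolerance=0.5)` returns one scale factor per band whose total bit count under this toy measure stays within the budget.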
[0062]
Also, here, the initial value of the scale factor is set to a relatively large value, and the scale factor is determined by successively lowering the initial value when the number of bits of the Huffman-encoded low-frequency encoded data exceeds the predetermined number of bits. However, this is not always necessary. For example, the initial value of the scale factor may be set to a low value in advance and gradually increased, and when the total number of bits of the low-frequency encoded data first exceeds the predetermined number of bits, the scale factor of each scale factor band may be determined using the initial value of the scale factor set immediately before.
[0063]
Furthermore, although the scale factor of each scale factor band is determined here so that the number of bits of the entire low-frequency encoded data for one frame does not exceed a predetermined number of bits, this is not always necessary. For example, the scale factor of each scale factor band may be determined so that each quantized value in that band does not exceed a predetermined number of bits. The operation of the first quantization unit 131 in this process is described below with reference to FIG. 7.
[0064]
FIG. 7 is a flowchart showing the operation of the first quantization unit 131 shown in FIG. 1 in another scale factor determination process. The first quantization unit 131 calculates the scale factor according to the following procedure for all the scale factor bands in the low-frequency part to be encoded (S1). Within each scale factor band, it applies the following procedure to all the spectrum data in the band (S2).
[0065]
First, the first quantization unit 131 quantizes the spectrum data using a formula with a predetermined scale factor value (S3), and determines whether the quantized value exceeds the predetermined number of bits given to represent it, for example 4 bits (S4).
[0066]
As a result of the determination, if the quantized value exceeds 4 bits, the scale factor value is adjusted (S8), and the same spectrum data is quantized with the adjusted scale factor value (S3). The first quantization unit 131 determines whether or not the obtained quantized value exceeds 4 bits (S4), and adjusts the scale factor until the quantized value of the spectrum data becomes a value of 4 bits or less. (S8) and quantization by the scale factor after adjustment (S3) are repeated.
[0067]
As a result of the determination, if the quantized value is 4 bits or less, the next spectrum data is quantized with a value of a predetermined scale factor (S3).
When the quantized values of all the spectrum data in one scale factor band are 4 bits or less (S5), the first quantization unit 131 adopts the scale factor value at that time as the scale factor of that scale factor band (S6).
[0068]
Furthermore, when the first quantizing unit 131 determines scale factors for all scale factor bands (S7), the process is terminated.
With the above processing, one scale factor is determined for each of the scale factor bands in the low-frequency part to be encoded. The first quantization unit 131 quantizes the low-frequency spectrum data using the scale factors determined in this way, and outputs the resulting 4-bit quantized values and the 8-bit scale factors to the first encoding unit 132.
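The per-band procedure of FIG. 7 can be sketched as follows, assuming an AAC-style quantizer in which a larger scale factor gives coarser steps, so that "adjusting" the scale factor in S8 means raising it by one. The formula, the starting value `sf_start`, and the function names are assumptions rather than details given in this description.

```python
def quantize(x, sf):
    """Assumed AAC-style quantizer: larger sf gives coarser steps."""
    return int((abs(x) * 2.0 ** (-sf / 4.0)) ** 0.75 + 0.4054)

def band_scale_factor(band, sf_start=0, max_bits=4):
    """Return the first scale factor at which every value in the band fits in max_bits."""
    limit = (1 << max_bits) - 1           # 15 for 4 bits
    sf = sf_start
    for x in band:                        # S2: every spectrum value in the band
        while quantize(x, sf) > limit:    # S3-S4: quantize and check the bit width
            sf += 1                       # S8: adjust the scale factor
    return sf                             # S6: scale factor of this band

def all_scale_factors(bands, sf_start=0, max_bits=4):
    """S1 and S7: repeat for all scale factor bands."""
    return [band_scale_factor(band, sf_start, max_bits) for band in bands]
```

Because raising the scale factor can only lower the quantized values under this assumed formula, values checked earlier in the band stay within the limit when a later value forces an adjustment.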
[0069]
FIG. 8 is a spectrum waveform diagram showing a specific example of the auxiliary information (scale factor) generated by the second quantization unit 133 shown in FIG. 1. In FIG. 8, the delimiters shown on the frequency axis in the low-frequency part indicate the boundaries of the scale factor bands defined in the present embodiment. Likewise, the delimiters indicated by broken lines along the frequency direction in the high-frequency part indicate the scale factor band boundaries defined in the present embodiment. The same applies to the following waveform diagrams.
[0070]
Of the spectrum data output from the conversion unit 120, the low-frequency part with a reproduction band of 11.025 kHz or less, shown by the solid-line waveform in FIG. 8, is output to the first quantization unit 131 and quantized as usual. On the other hand, the high-frequency part, which extends above the reproduction band of 11.025 kHz up to the reproduction band of 22.05 kHz and is shown by the broken-line waveform in FIG. 8, is represented by auxiliary information (scale factor) calculated by the second quantization unit 133. The procedure by which the second quantization unit 133 calculates the auxiliary information (scale factor) is described below with reference to the flowchart of FIG. 9, using the specific example of FIG. 8.
[0071]
FIG. 9 is a flowchart showing the operation of the second quantization unit 133 shown in FIG. 1 in the auxiliary information (scale factor) calculation process.
For all scale factor bands in the high-frequency part, which extends above the reproduction band of 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantization unit 133 calculates, according to the following procedure, the scale factor that sets the quantized value of the absolute maximum spectrum data in each scale factor band to “1” (S11).
[0072]
The second quantization unit 133 specifies the absolute maximum spectrum data (peak) in the first scale factor band of the high-frequency part above the reproduction band of 11.025 kHz (S12). In the specific example of FIG. 8, it is assumed that the peak position specified in the first scale factor band is (1) and the peak value at that time is “256”.
[0073]
The second quantization unit 133 applies the peak value “256” and an initial scale factor value to the formula for calculating the quantized value, in the same manner as the flowchart procedure described above, and calculates the value of the scale factor sf for which the resulting quantized value becomes “1” (S13). In this case, for example, sf = 24 is calculated as the scale factor value that sets the quantized value of the peak value “256” to “1”.
[0074]
When the scale factor value sf = 24 for setting the quantized value of the peak to “1” has been obtained for the first scale factor band (S14), the second quantization unit 133 specifies the peak of the spectrum data for the next scale factor band (S12). For example, when the specified peak position is (2) and its value is “312”, the unit calculates the value of the scale factor sf that sets the quantized value of the peak value “312” to “1”, for example sf = 32 (S13).
[0075]
Similarly, for the third scale factor band in the high-frequency part, the second quantization unit 133 calculates the value of the scale factor sf that sets the quantized value of the value “288” of peak (3) to “1”, for example sf = 26, and for the fourth scale factor band calculates the value of the scale factor sf that sets the quantized value of the value “203” of peak (4) to “1”, for example sf = 18.
[0076]
In this way, when the scale factor that sets the quantized value of the peak value to “1” has been calculated for all the scale factor bands in the high-frequency part (S14), the second quantization unit 133 outputs the obtained scale factor of each scale factor band to the second encoding unit 134 as auxiliary information for the high-frequency part, and ends the process.
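The calculation of FIG. 9 can be sketched as a search for the first scale factor at which the band peak quantizes to “1”. The AAC-style quantization formula is an assumption, so the scale factors this sketch produces need not match the example values sf = 24, 32, 26, and 18 quoted above. The function names are hypothetical.

```python
def quantize(x, sf):
    """Assumed AAC-style quantizer: larger sf gives coarser steps."""
    return int((abs(x) * 2.0 ** (-sf / 4.0)) ** 0.75 + 0.4054)

def scale_factor_for_unit_peak(band, sf_max=255):
    """Smallest 8-bit scale factor whose quantized peak is at most 1."""
    peak = max(abs(v) for v in band)       # S12: absolute maximum spectrum data
    for sf in range(sf_max + 1):           # S13: search the scale factor range
        if quantize(peak, sf) <= 1:
            return sf
    return sf_max

def high_band_scale_factors(bands):
    """S11 and S14: one scale factor per high-frequency scale factor band."""
    return [scale_factor_for_unit_peak(band) for band in bands]
```

If a peak is so small that it already quantizes to 0 at scale factor 0, the function returns 0 and the quantized peak is 0 rather than 1. A real encoder would need to decide how to handle that case.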
[0077]
As described above, the auxiliary information (scale factor) is generated by the second quantization unit 133. This auxiliary information represents the high-frequency part, which is represented by 512-point spectrum data, by one scale factor for each scale factor band of the high-frequency part. If the scale factor takes a value from 0 to 255, it can be represented by 8 bits for each scale factor band (four in this case) in the high-frequency part. Further, if the differences between the scale factors are Huffman-encoded, the amount of data may be reduced further. On the other hand, if the 512 points of spectrum data in the high-frequency part were quantized and Huffman-encoded by the conventional method, as in the low-frequency part, the data amount is predicted to be at least about 150 bits. Thus, although this auxiliary information gives only one scale factor for each scale factor band in the high-frequency part, the data amount is greatly reduced compared with quantizing the high-frequency part by the conventional method.
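Written out, the size comparison uses only the figures given above: four scale factor bands, 8 bits per scale factor, and the predicted lower bound of about 150 bits for conventional coding. It counts the scale factors before any Huffman coding of their differences.

```latex
\underbrace{4}_{\text{scale factor bands}} \times \underbrace{8\ \text{bits}}_{\text{per scale factor}}
  = 32\ \text{bits} \;<\; \text{about } 150\ \text{bits}
```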
[0078]
The scale factor is a value that is substantially proportional to the peak value (absolute value) in each scale factor band. Spectrum data obtained by multiplying the scale factor by either spectrum data that takes a constant value at the 512 points of the high-frequency part or a copy of part or all of the low-frequency spectrum data can therefore be said to roughly restore the spectrum data obtained from the input acoustic signal. Also, the spectrum data can be restored with higher accuracy by taking as a coefficient, for each scale factor band, the ratio between the absolute maximum value of the spectrum data copied into the band and the value obtained by dequantizing the quantized value “1” using the scale factor value for the band, and multiplying each spectrum data in the band by that coefficient. Furthermore, since differences in the waveform of the high-frequency part are not as clearly audible as those in the low-frequency part, the auxiliary information obtained in this way can be said to be sufficient as information representing the waveform of the high-frequency part.
[0079]
Here, the scale factor is calculated so that the quantized value of the absolute maximum spectrum data in each scale factor band of the high-frequency part becomes “1”. However, the quantized value does not necessarily have to be “1” and may be set to another value.
[0080]
In this example, only the scale factor is encoded as auxiliary information. However, the present invention is not limited to this, and the quantized value, characteristic spectrum position information, sign information indicating the positive and negative signs of the spectrum, a noise generation method, and the like may also be encoded. Two or more of these may be combined and encoded. In this case, it is particularly effective to encode, in combination with the scale factor, information such as a coefficient representing the amplitude ratio and the position of the absolute maximum spectrum data.
[0081]
FIG. 10 is a spectrum waveform diagram showing a specific example of the auxiliary information (quantized value) generated by the second quantization unit 133 shown in FIG. 1. FIG. 11 is a flowchart showing the operation of the second quantization unit 133 shown in FIG. 1 in the auxiliary information (quantized value) calculation process.
[0082]
The second quantization unit 133 determines in advance a common scale factor value, for example “18”, for all scale factor bands in the high-frequency part, which extends above the reproduction band of 11.025 kHz up to the reproduction band of 22.05 kHz. Using the scale factor value “18”, it calculates for each scale factor band the quantized value of the absolute maximum spectrum data (peak) in that band (S21).
[0083]
The second quantization unit 133 specifies the absolute maximum spectrum data (peak) in the first scale factor band of the high-frequency part above the reproduction band of 11.025 kHz (S22). In the specific example of FIG. 10, it is assumed that the peak position specified in the first scale factor band is (1) and the peak value at that time is “256”.
[0084]
The second quantization unit 133 assigns a predetermined common scale factor value “18” and peak value “256” to the formula for calculating the quantization value, and calculates the quantization value (S23). For example, in this case, when the peak value “256” is quantized with the scale factor value “18”, the quantized value “6” is calculated.
[0085]
When the quantized value “6” of the peak value “256” has been obtained for the first scale factor band (S24), the second quantization unit 133 specifies the peak of the spectrum data for the next scale factor band (S22). For example, if the specified peak position is (2) and its value is “312”, the unit calculates the quantized value of the peak value “312” with the scale factor value “18”, for example “10” (S23).
[0086]
Similarly, for the third scale factor band in the high-frequency part, the second quantization unit 133 calculates the quantized value, for example “9”, of the value “288” of peak (3) with the scale factor value “18”, and for the fourth scale factor band calculates the quantized value, for example “5”, of the value “203” of peak (4) with the scale factor value “18”.
[0087]
In this way, when the quantized value of the peak value with the scale factor fixed to “18” has been calculated for all the scale factor bands in the high-frequency part (S24), the second quantization unit 133 outputs the calculated quantized value of each scale factor band to the second encoding unit 134 as auxiliary information for the high-frequency part, and ends the process.
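The procedure of FIG. 11 reduces to quantizing each band peak with one fixed scale factor. The sketch below assumes an AAC-style quantization formula, which this description does not specify here, so its outputs need not equal the example values “6”, “10”, “9”, and “5”. The name `peak_quantized_values` is hypothetical.

```python
def quantize(x, sf):
    """Assumed AAC-style quantizer: larger sf gives coarser steps."""
    return int((abs(x) * 2.0 ** (-sf / 4.0)) ** 0.75 + 0.4054)

def peak_quantized_values(bands, common_sf=18):
    """S21-S24: quantize the absolute maximum of each band with one common scale factor."""
    values = []
    for band in bands:
        peak = max(abs(v) for v in band)          # S22: specify the peak
        values.append(quantize(peak, common_sf))  # S23: quantize it
    return values
```

Only one small integer per scale factor band is produced, which is what allows the 4-bit representation discussed in this embodiment, provided the common scale factor keeps every peak at 15 or below.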
[0088]
As described above, the auxiliary information (quantized value) is generated by the second quantization unit 133. This auxiliary information represents the high-frequency part, which is represented by 512-point spectrum data, by a 4-bit quantized value for each of the four scale factor bands. In the auxiliary information (scale factor) described above, by contrast, the high-frequency part is represented by an 8-bit scale factor for each of the four scale factor bands, so the data amount is reduced further here. This quantized value roughly represents the amplitude of the peak value (absolute value) in each scale factor band. Even spectrum data obtained simply by multiplying the quantized value by spectrum data that takes a constant value at the 512 points of the high-frequency part, or by a copy of part or all of the low-frequency spectrum data, can be said to roughly restore the spectrum data obtained from the input acoustic signal. Also, the spectrum data can be restored with higher accuracy by taking as a coefficient, for each scale factor band, the ratio between the absolute maximum value of the spectrum data copied into the band and the value obtained by dequantizing the quantized value for that band using the predetermined scale factor value, and multiplying each spectrum data in the band by that coefficient.
[0089]
Here, the scale factor value corresponding to the quantized value transmitted as the second encoded information is set in advance, but an optimum scale factor value may instead be calculated, added to the second encoded information, and transmitted. For example, if the scale factor is selected so that the maximum quantized value is 7, only 3 bits are needed to represent each quantized value, so the amount of information necessary for transmitting the quantized values can be reduced.
[0090]
In the above, only the quantized value, or only the quantized value and the scale factor, is encoded as auxiliary information. However, the present invention is not limited to this, and the scale factor, characteristic spectrum position information, sign information of the spectrum data, a noise generation method, and the like may also be encoded. Two or more of these may be combined and encoded.
[0091]
FIG. 12 is a spectrum waveform diagram showing a specific example of the auxiliary information (position information) generated by the second quantization unit 133 shown in FIG. 1. FIG. 13 is a flowchart showing the operation of the second quantization unit 133 shown in FIG. 1 in the auxiliary information (position information) calculation process.
[0092]
For all scale factor bands in the high-frequency part, which extends above the reproduction band of 11.025 kHz up to the reproduction band of 22.05 kHz, the second quantization unit 133 specifies the position of the absolute maximum spectrum data in each scale factor band according to the following procedure (S31).
[0093]
The second quantization unit 133 specifies the absolute maximum spectrum data (peak) in the first scale factor band of the high-frequency part above the reproduction band of 11.025 kHz (S32). In the specific example of FIG. 12, it is assumed that the position of the peak specified in the first scale factor band is (1) and that it is the 22nd spectrum data from the head of this scale factor band. The second quantization unit 133 holds the position of the specified peak, “the 22nd spectrum data from the head of the scale factor band” (S33).
[0094]
When the peak position has been specified and held for the first scale factor band (S34), the second quantization unit 133 specifies the peak of the spectrum data for the next scale factor band (S32). For example, assume that the specified peak position is (2) and that it is the 60th spectrum data from the head of the scale factor band. The second quantization unit 133 holds the specified peak position, “the 60th spectrum data from the head of the scale factor band” (S33).
[0095]
Similarly, for the third scale factor band in the high-frequency part, the second quantization unit 133 specifies and holds the position of peak (3), “the spectrum data at the head of the scale factor band”, and for the fourth scale factor band it specifies and holds the position of peak (4), “the 25th spectrum data from the head of the scale factor band”.
[0096]
In this way, when the peak positions have been specified and held for all the scale factor bands in the high-frequency part (S34), the second quantization unit 133 outputs the held peak position of each scale factor band to the second encoding unit 134 as auxiliary information for the high-frequency part, and ends the process.
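The position search of FIG. 13 can be sketched as below. The offsets are 0-based here, whereas the text counts positions from the head of the band in ordinal terms, and taking the first occurrence when several values tie for the maximum is an assumption. The name `peak_positions` is hypothetical.

```python
def peak_positions(bands):
    """S31-S34: offset of the absolute maximum spectrum data within each band."""
    positions = []
    for band in bands:
        magnitudes = [abs(v) for v in band]
        # S32-S33: specify the peak and hold its offset from the head of the band
        positions.append(magnitudes.index(max(magnitudes)))
    return positions
```

For example, `peak_positions([[1.0, -9.0, 3.0], [5.0, 2.0]])` gives `[1, 0]`: the peak of the first band is its second value, and the peak of the second band sits at the head of the band.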
[0097]
As described above, the auxiliary information (position information) is generated by the second quantization unit 133. This auxiliary information represents the high-frequency part, which is represented by 512 points of spectrum data, by 6-bit position information for each of the four scale factor bands.
[0098]
In this case, in the decoding apparatus 200, the second inverse quantization unit 224 copies part or all of the 512 samples of low-frequency spectrum data as the 512 samples of high-frequency data, in accordance with the auxiliary information (position information) input from the second decoding unit 223.
The copying is achieved by extracting, based on the peak information of the spectrum data in one or more scale factor bands, similar data from the spectrum data output from the first inverse quantization unit 222, and copying part or all of it.
[0099]
The second inverse quantization unit 224 adjusts the amplitude of the copied spectral data as necessary. Amplitude adjustment is achieved by multiplying each spectral data by a predetermined coefficient, for example, “0.5”. This coefficient may be a fixed value, may be changed for each band or scale factor band, and may be changed according to the spectrum data output from the first inverse quantization unit 222.
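As a small illustration of the amplitude adjustment described above, the following sketch multiplies the copied spectrum data by one coefficient per scale factor band. The function name `adjust_amplitude` and the `band_offsets` layout (start index of each band plus a final end index) are assumptions made for this example; a single fixed coefficient such as 0.5 corresponds to passing the same value for every band.

```python
def adjust_amplitude(copied, band_offsets, coefficients):
    """Scale copied spectrum data with one coefficient per scale factor band."""
    adjusted = list(copied)
    bands = zip(band_offsets[:-1], band_offsets[1:])
    for (start, end), coefficient in zip(bands, coefficients):
        for k in range(start, end):
            adjusted[k] = coefficient * copied[k]
    return adjusted
```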
[0100]
In the above description, a predetermined coefficient is used, but the value of this coefficient may be added to the second encoded information as auxiliary information. Alternatively, the scale factor value may be added to the second encoded information as a coefficient, or the quantized value of the peak in the scale factor band may be added to the second encoded information as the coefficient. The amplitude adjustment method is not limited to this, and other methods may be used.
[0101]
Here, only position information, or only position information and coefficient information, is encoded as auxiliary information. However, the present invention is not limited to this, and scale factors, quantized values, spectrum sign information, noise generation methods, and the like may also be encoded. Two or more of these may be combined and encoded.
Moreover, although the spectrum data on the low frequency side is copied as the spectrum data on the high frequency side, the present invention is not limited to this, and the spectrum data on the high frequency side may be generated from the second encoded information alone.
[0102]
FIG. 14 is a spectrum waveform diagram showing a specific example of the auxiliary information (sign information) generated by the second quantization unit 133 shown in FIG. FIG. 15 is a flowchart showing an operation in the auxiliary information (sign information) calculation process of the second quantization unit 133 shown in FIG.
[0103]
For every scale factor band in the high frequency part, which extends above the 11.025 kHz reproduction band up to the 22.05 kHz reproduction band, the second quantization unit 133 specifies the sign information of the spectrum data at a predetermined position of the scale factor band, for example its center, according to the following procedure (S41).
[0104]
The second quantization unit 133 examines the sign of the spectrum data at the center position of the first scale factor band in the high frequency part above the 11.025 kHz reproduction band (S42), and holds the value. For example, suppose the sign of the spectrum data at the center position of the first scale factor band is “+”. The second quantization unit 133 represents this sign “+” as the 1-bit value “1” and holds it. If the sign is “−”, it is represented by “0” and held.
[0105]
When the sign information of the spectrum data at the center position has been held for the first scale factor band (S43), the second quantization unit 133 checks the sign of the spectrum data at the center position of the next scale factor band (S42). For example, if the checked sign is “+”, the second quantization unit 133 holds “1” as the sign information of the spectrum data at the center position of the second scale factor band.
[0106]
Similarly, the second quantization unit 133 checks the sign “+” of the spectrum data at the center position of the third scale factor band in the high frequency part and holds the sign information “1”, and then checks the sign “+” of the spectrum data at the center position of the fourth scale factor band and holds the sign information “1”.
[0107]
In this way, when the sign information of the spectrum data at the center position has been held for all the scale factor bands in the high frequency part (S43), the second quantization unit 133 outputs the held sign information of each scale factor band to the second encoding unit 134 as auxiliary information of the high frequency part, and ends the process.
[0108]
As described above, auxiliary information (sign information) is generated by the second quantization unit 133. This auxiliary information (sign information) represents the high frequency part, originally 512 points of spectrum data, by 1 bit of sign information for each of the four scale factor bands, so the spectrum of the high frequency part can be represented with a very short data length.
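Steps S42 and S43 can be sketched as follows. The names `center_sign_bits` and `band_offsets` are hypothetical, the center is taken as the integer midpoint of the band, and a value of exactly zero is treated as “+”; the text specifies none of these details, so they are assumptions of this example.

```python
def center_sign_bits(spectrum, band_offsets):
    """Return one sign bit per scale factor band: 1 for "+", 0 for "-".

    band_offsets lists the start index of every band plus one final end index.
    """
    bits = []
    for start, end in zip(band_offsets[:-1], band_offsets[1:]):
        center = start + (end - start) // 2  # assumed definition of "center"
        bits.append(1 if spectrum[center] >= 0 else 0)
    return bits
```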
[0109]
In this case, in the decoding device 200, the second inverse quantization unit 224 copies part or all of the 512 samples of spectrum data in the low frequency part as the spectrum on the high frequency side, and determines the sign of the spectrum data at the predetermined position according to the sign information input from the second decoding unit 223.
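On the decoding side, the sign determination can be sketched as below. The function name `apply_center_signs` is hypothetical, and the example assumes that only the spectrum value at the band center is affected and that its magnitude is kept, which the text does not state explicitly.

```python
def apply_center_signs(copied, band_offsets, sign_bits):
    """Force the sign of the center value of each band to match the sign bit."""
    result = list(copied)
    bands = zip(band_offsets[:-1], band_offsets[1:])
    for (start, end), bit in zip(bands, sign_bits):
        center = start + (end - start) // 2  # same assumed center as the encoder
        magnitude = abs(result[center])
        result[center] = magnitude if bit == 1 else -magnitude
    return result
```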
[0110]
Here, the sign information representing the sign at the center position of each scale factor band in the high frequency part is used as auxiliary information (sign information), but the position is not limited to the center of the scale factor band. The sign information may be taken at the head of the scale factor band or at any other predetermined position.
[0111]
Here, the position of the spectrum data whose sign is transmitted (sign information) is predetermined, but it may be changed according to the output of the first inverse quantization unit 222. Position information indicating the position to which the sign information of each scale factor band applies may also be added to the second encoded information and transmitted.
[0112]
The second inverse quantization unit 224 adjusts the amplitude of the copied spectral data as necessary. The amplitude is adjusted by multiplying each spectrum data by a predetermined coefficient, for example “0.5”.
This coefficient may be a fixed value, may be changed for each band or for each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 222. The amplitude adjustment method is not limited to this, and other methods may be used.
[0113]
Although a predetermined coefficient is used here, the value of this coefficient may be added to the second encoded information as auxiliary information. Further, a scale factor value may be added to the second encoded information as the coefficient, or a quantized value may be added to the second encoded information as a coefficient.
[0114]
Furthermore, here, only sign information, only sign information and coefficient information, only sign information and position information, or only sign information, position information, and coefficient information are encoded as auxiliary information. However, the present invention is not limited to this, and the quantized value, scale factor, position information of a characteristic spectrum, noise generation method, and the like may also be encoded. Two or more of these may be combined and encoded.
[0115]
In this embodiment, the spectrum data on the low frequency side is copied as the spectrum data on the high frequency side. However, the present invention is not limited to this, and the spectrum data on the high frequency side may be generated from the second encoded information alone.
In the above description, the sign “+” is represented by the 1-bit value “1” and the sign “−” by “0”. The representation is not limited to this, and other values may be used.
[0116]
FIG. 16 is a spectrum waveform diagram showing an example of a method for creating auxiliary information (copy information) generated by the second quantization unit 133 shown in FIG. FIG. 16A is a waveform diagram showing a spectrum in the first scale factor band in the high frequency region. FIG. 16B is a waveform diagram showing an example of the spectrum waveform of the low frequency region specified by the auxiliary information (copy information). FIG. 17 is a flowchart showing an operation in the auxiliary information (copy information) calculation process of the second quantization unit 133 shown in FIG.
[0117]
For every scale factor band in the high frequency part, which extends above the 11.025 kHz reproduction band up to the 22.05 kHz reproduction band, the second quantization unit 133 takes the peak position n measured from the head of the scale factor band (the n-th spectrum data from the head) and specifies, according to the following procedure, the number N of the scale factor band in the low frequency part whose peak position measured from the band head is closest to n (S51).
[0118]
The second quantization unit 133 specifies the position n of the spectrum data with the maximum absolute value (the peak) in the first scale factor band of the high frequency part above the 11.025 kHz reproduction band (S52). Assume that, as shown in FIG. 16A, the specified peak is peak (1) and that it is the spectrum data at n = 22 in this scale factor band.
[0119]
The second quantization unit 133 specifies the positions of all the peaks (both positive and negative) in the low frequency part, where the frequency of the spectrum is at or below the 11.025 kHz reproduction band (S53).
Next, for all the peaks specified in the low frequency part, the second quantization unit 133 searches for a scale factor band whose head lies at a distance from the peak that is closest to n, and specifies the number N of that scale factor band, the search direction, and the sign information of the peak (S54).
[0120]
Specifically, for each of the specified peaks (both positive and negative), taken in order from the low frequency side, the second quantization unit 133 searches for the head of the scale factor band whose distance from the peak is closest to n. There are two search directions: (1) searching from the peak toward lower frequencies, and (2) searching from the peak toward higher frequencies. In addition, for a high-frequency peak and a low-frequency peak whose signs are opposite, there are likewise two cases: (3) searching from the peak toward lower frequencies, and (4) searching from the peak toward higher frequencies.
[0121]
Of these, in search directions (2) and (4), copying the low-frequency spectrum waveform on the basis of this peak information copies a waveform in which the peak position of the high frequency part and the peak position of the low frequency part are mirrored left to right (along the frequency axis) within the scale factor band, as shown in the figure. It is therefore necessary to attach information indicating whether the search direction is forward or reverse, for example by treating search directions (1) and (3) as forward and (2) and (4) as reverse. In search directions (3) and (4), a waveform in which the high-frequency peak and the low-frequency peak are inverted top to bottom (along the vertical axis) is copied, as shown in the figure, so it is necessary to attach information indicating whether the signs of the high-frequency peak value and the low-frequency peak value are inverted.
[0122]
If the peak specified in the low frequency part has a positive value, the second quantization unit 133 searches in directions (1) and (2); if the peak has a negative value, it searches in directions (3) and (4). It thus searches in four directions and, among the search results, specifies the number of the scale factor band whose distance from the peak is closest to n. Here, an allowable error with respect to n is set in advance to a predetermined value, for example “5”, and from the four search results the scale factor band whose distance from the peak is closest to n is selected and its number N is specified. At the same time, sign information indicating whether the signs of the high-frequency peak value and the low-frequency peak value are inverted, and information indicating whether the search direction is forward or reverse, are specified.
[0123]
For example, suppose that in search direction (1) the scale factor band number N = 3 is specified for the low-frequency spectrum shown in (1) of the figure, with a position error of “1” from the peak. Suppose likewise that N = 18 is specified in search direction (2) with an error of “5”, as shown in (2) of the figure; that N = 12 is specified in search direction (3) with an error of “4”, as shown in (3) of the figure; and that N = 10 is specified in search direction (4) with an error of “2”, as shown in (4) of the figure. From the four specified scale factor bands, the second quantization unit 133 selects the scale factor band number N = 3, whose position error of “1” makes its distance from the peak closest to n. At the same time, it generates sign information “1”, indicating the sign “+” of the low-frequency peak, and search direction information “1”, indicating that the search proceeded from the peak toward lower frequencies. If the sign of the peak were “−”, the sign information would be “0”, and if the search proceeded from the peak toward higher frequencies, the search direction information would be “0”.
[0124]
When the scale factor band number N = 3, the sign information “1”, and the search direction information “1” are specified for the first scale factor band in the high frequency range (S55), the second quantization unit 133 In the same manner as described above, the scale factor band number N, its sign information, and its search direction information are specified for the next scale factor band.
[0125]
In this way, when, for every scale factor band in the high frequency part, the number N of the low-frequency scale factor band whose peak position from the band head is closest to the peak position n from the head of the high-frequency scale factor band has been specified, together with its sign information and its search direction information (S55), the second quantization unit 133 outputs the low-frequency scale factor band number N, the sign information, and the search direction information specified for each high-frequency scale factor band to the second encoding unit 134 as auxiliary information (copy information) of the high frequency part, and ends the process.
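One possible reading of the search in steps S53 and S54 is sketched below. It is an illustration under several assumptions rather than the procedure of the embodiment: a peak is taken to be a strict local maximum of the absolute value, band numbers start at 1, the allowable error defaults to the example value 5, and the names `find_copy_info` and `low_band_heads` are hypothetical. The returned bits follow the convention stated above (sign bit 1 for a “+” peak, direction bit 1 for a search toward lower frequencies).

```python
def find_copy_info(low, low_band_heads, n, max_error=5):
    """Return (band_number, sign_bit, direction_bit) or None if nothing fits.

    low            -- low-frequency spectrum data
    low_band_heads -- index of the first spectrum value of each low band
    n              -- peak position within the high-frequency band
    """
    best = None  # (error, band_number, sign_bit, direction_bit)
    for p in range(1, len(low) - 1):
        # Assumed peak definition: strict local maximum of |value|.
        if not (abs(low[p]) > abs(low[p - 1]) and abs(low[p]) > abs(low[p + 1])):
            continue
        sign_bit = 1 if low[p] >= 0 else 0
        for number, head in enumerate(low_band_heads, start=1):
            error = abs(abs(p - head) - n)
            if error > max_error:
                continue
            # Band head at or below the peak: search toward lower frequencies.
            direction_bit = 1 if head <= p else 0
            if best is None or error < best[0]:
                best = (error, number, sign_bit, direction_bit)
    return None if best is None else best[1:]
```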
[0126]
In this case, when the decoding apparatus 200 decodes the first encoded signal according to the conventional procedure, 512 samples of spectrum data on the low frequency side are obtained. The second inverse quantization unit 224 copies part or all of the spectrum data corresponding to the scale factor band number output from the second decoding unit 223 as the spectrum on the high frequency side. The second inverse quantization unit 224 adjusts the amplitude of the copied spectral data as necessary. The amplitude is adjusted by multiplying each spectrum by a predetermined coefficient, for example “0.5”.
[0127]
This coefficient may be a fixed value, may be changed for each band, for each scale factor band, or may be changed according to the spectrum data output from the first inverse quantization unit 222.
[0128]
Here, a predetermined coefficient is used for adjusting the amplitude, but the value of this coefficient may be added to the second encoded information as auxiliary information. Further, a scale factor value may be added to the second encoded information as a coefficient, or a quantized value may be added to the second encoded information as a coefficient. The amplitude adjustment method is not limited to this, and other methods may be used.
[0129]
Here, the sign information and the search direction information are extracted in addition to the scale factor band number N as auxiliary information (copy information) of the high frequency part, but the sign information and the search direction information may be omitted depending on the amount of information that can be transmitted for the high frequency part. Also, the sign information is expressed as “1” when the sign of the low-frequency peak is “+” and as “0” when it is “−”, and the search direction information is expressed as “1” for a search from the peak toward lower frequencies and as “0” for a search from the peak toward higher frequencies. However, the representation of the peak sign in the sign information and of the search direction in the search direction information is not limited to these, and other values may be used.
[0130]
Further, although the head of the scale factor band whose distance from each peak specified in the low frequency part is closest to n is searched for here, the present invention is not limited to this example, and a peak whose distance from the head of each scale factor band in the low frequency part is closest to n may be searched for instead.
[0131]
FIG. 18 is a spectrum waveform diagram showing a second example of a method for creating auxiliary information (copy information) generated by the second quantization unit 133 shown in FIG. FIG. 19 is a flowchart showing an operation in the second calculation process of auxiliary information (copy information) of the second quantization unit 133 shown in FIG.
[0132]
For every scale factor band in the high frequency part, which extends above the 11.025 kHz reproduction band up to the 22.05 kHz reproduction band, the second quantization unit 133 specifies, according to the following procedure, the number N of the low-frequency scale factor band that minimizes the spectral difference (energy difference) taken over all the spectra within the high-frequency scale factor band (S61). Here, the number of low-frequency spectra used in taking the difference is equal to the number of spectra in the high-frequency scale factor band, and the specified number N is the number of the scale factor band at which that run of spectra begins.
[0133]
For each scale factor band of the low frequency part, the second quantization unit 133 takes, from the head of that scale factor band, a frequency width consisting of the same number of spectrum data as the high-frequency scale factor band (S62), and obtains the difference between the spectrum of the high frequency part and the spectrum of the low frequency part over that width (S63). For example, in the waveform diagram shown in FIG. 18, if the first scale factor band of the high frequency part contains 48 spectrum data, the second quantization unit 133 sequentially obtains the spectral difference between the high frequency part and the low frequency part for the 48 spectrum data starting at the head of the low-frequency scale factor band of number N = 1.
[0134]
When the second quantization unit 133 has obtained the spectral difference between the high frequency part and the low frequency part for the same number of spectra as the high-frequency scale factor band (S65), it holds the value and then obtains the difference between the high-frequency spectrum and the low-frequency spectrum starting from the head of the next low-frequency scale factor band, over the same width as the spectrum data of the high-frequency scale factor band (S64). For example, when the spectral difference has been obtained over a width of 48 spectrum data from the head of the low-frequency scale factor band of number N = 1, the obtained value is held, and the spectral difference is then obtained over a width of 48 spectrum data from the head of the scale factor band of number N = 2. Similarly, for the scale factor band of number N = 3, the scale factor band of number N = 4, and so on up to the scale factor band of number N = 28 (the last in the low frequency part), that is, for every scale factor band of the low frequency part, the differences between the 48 spectrum data of the high frequency part and of the low frequency part are accumulated to obtain the spectral difference.
[0135]
When the difference between the high-frequency spectrum and the low-frequency spectrum has been obtained for every scale factor band of the low frequency part, over a width starting at the head of the band and equal to that of the spectrum data in the high-frequency scale factor band (S64), the second quantization unit 133 specifies the number N of the scale factor band that minimizes the obtained difference (S65). For example, the spectrum waveform diagram shows that the spectrum of the hatched portion of the low frequency part has the smallest difference from the spectrum of the hatched portion of the high frequency part, that is, the energy difference between the spectra is the smallest. In other words, when the 48 spectrum data from the head of the scale factor band of number N = 8 are copied to the first scale factor band of the high frequency part, which starts at 11.025 kHz, the waveform shown by the alternate long and short dash line in the high frequency part of FIG. 19 is obtained, and the energy within the high-frequency scale factor band is expressed approximately with respect to the original spectrum.
[0136]
When the second quantization unit 133 specifies the number N of the low-frequency scale factor band that minimizes the difference in spectrum for the high-frequency scale factor band, the second quantization unit 133 holds the specified scale factor band number N. In the same manner as described above, the number N of the corresponding scale factor band is specified for the scale factor band of the next high frequency part (S66). Hereinafter, this process is sequentially repeated for each scale factor band of the high frequency part, and when the number N of the low frequency scale factor band having the smallest spectrum difference is specified in all the scale factor bands of the high frequency part, The stored low band scale factor band number N is output to the second encoder 134 as high band auxiliary information (copy information), and the process ends.
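The search of steps S62 to S65 for a single high-frequency scale factor band can be sketched as follows. The text speaks only of a spectral difference (energy difference), so the sum of squared differences used here is an assumption, as are the names `best_low_band` and `low_band_heads` and the 1-based band numbering.

```python
def best_low_band(high_band, low, low_band_heads):
    """Return the number N of the low band whose leading run best matches.

    high_band      -- spectrum data of one high-frequency scale factor band
    low            -- low-frequency spectrum data
    low_band_heads -- index of the first spectrum value of each low band
    """
    width = len(high_band)
    best_number, best_diff = None, None
    for number, head in enumerate(low_band_heads, start=1):
        segment = low[head:head + width]
        if len(segment) < width:
            continue  # not enough low-frequency data left for a full comparison
        # Assumed measure: energy of the difference between the two runs.
        diff = sum((h - l) ** 2 for h, l in zip(high_band, segment))
        if best_diff is None or diff < best_diff:
            best_number, best_diff = number, diff
    return best_number
```

Calling this once per high-frequency scale factor band yields the list of numbers N that is passed on as copy information.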
[0137]
In this case, the low-frequency side spectrum copy method and amplitude adjustment method in decoding apparatus 200 are the same as those in the case of auxiliary information (copy information) described with reference to FIGS.
[0138]
Further, in the flowchart of FIG. 19, the energy difference between the high frequency part and the low frequency part is calculated with the same sign and in the same direction on the frequency axis, but the encoding device of the present invention is not limited to this. As described with reference to FIGS. 16 and 17, the energy difference between the high frequency part and the low frequency part may instead be calculated by any of the following three methods. (1) With the signs of the spectrum data values of the high frequency part and the low frequency part left the same, the high-frequency spectrum data are selected from the low frequency side toward the high frequency side, while the low-frequency spectrum data, the same number as in the high frequency part counted from the head of the low-frequency scale factor band, are selected sequentially from the high frequency side toward the low frequency side (that is, in the reverse direction on the frequency axis), and the difference is calculated. (2) The sign of the low-frequency spectrum is inverted (negated) and the calculation is performed in the same direction on the frequency axis. (3) The sign of the low-frequency spectrum is inverted (negated) and the calculation is performed in the reverse direction on the frequency axis. Alternatively, the calculation may be performed by all four methods, that is, these three and the method of FIG. 19, and the number N of the scale factor band of the low-frequency spectrum that minimizes the energy difference among them may be used as auxiliary information. In this case, so that the low-frequency spectrum with the smallest energy difference is copied correctly to the high frequency part, information indicating the sign relationship between the low-frequency spectrum and the high-frequency spectrum, and information indicating the direction on the frequency axis in which the low-frequency spectrum is copied, are included in the auxiliary information for each scale factor band.
The information indicating the sign relationship between the low-frequency spectrum and the high-frequency spectrum is expressed, for example, as “1” when the difference was obtained with the same sign and as “0” when it was obtained with the opposite sign. The information indicating the direction on the frequency axis in which the low-frequency spectrum is copied to the high frequency part is expressed in 1 bit, for example as “1” when copying in the forward direction, that is, when the spectrum data of the high frequency part and of the low frequency part are selected in the same direction, and as “0” when copying in the reverse direction, that is, when they are selected in opposite directions.
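The comparison of the four calculation methods for one candidate low-frequency run can be sketched as follows. The squared-difference measure and the name `best_variant` are assumptions of this example; the two returned flags use the bit convention given above (1 for same sign or forward direction, 0 for inverted sign or reverse direction).

```python
def best_variant(high_band, segment):
    """Return (difference, sign_bit, direction_bit) of the best of four variants.

    segment is the low-frequency run, given as a list with the same length
    as high_band.
    """
    candidates = []
    for direction_bit in (1, 0):
        # direction_bit 0: read the low-frequency run in reverse order.
        ordered = segment if direction_bit == 1 else segment[::-1]
        for sign_bit in (1, 0):
            # sign_bit 0: negate the low-frequency run before comparing.
            factor = 1 if sign_bit == 1 else -1
            diff = sum((h - factor * l) ** 2 for h, l in zip(high_band, ordered))
            candidates.append((diff, sign_bit, direction_bit))
    return min(candidates, key=lambda c: c[0])
```

For instance, a high-frequency band `[-3, -2, -1]` compared against the low-frequency run `[1, 2, 3]` is matched exactly by the reversed and negated variant, so both flags come out as 0.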
[0139]
FIG. 20 is a flowchart illustrating a procedure in which the 512 spectra of the low frequency part are copied to the high frequency part in the forward direction by the second inverse quantization unit 224 illustrated in FIG. In FIG. 20, inv_spec1[i] denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 222, and inv_spec2[j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 224.
[0140]
First, since the 0th to 511th spectra are input in the same direction, the second inverse quantization unit 224 sets the initial values of the counters i and j, which count the spectra, to “0” (S71). Next, the second inverse quantization unit 224 checks whether the value of the counter i is less than “512” (S72), and if it is, inputs the value of the i-th (here, 0th) spectrum of the first inverse quantization unit 222 as the value of the j-th (here, 0th) spectrum of the second inverse quantization unit 224 (S73). Thereafter, the second inverse quantization unit 224 increments the values of the counters i and j by “1” (S74), and again checks whether the value of the counter i is less than “512” (S72).
[0141]
The second inverse quantization unit 224 repeats the above process while the value of the counter i is less than “512”, and ends the process when the value of the counter i reaches “512” or more. As a result, all of the 0th to 511th spectra of the low frequency part, which are the inverse quantization result of the first inverse quantization unit 222, are copied as they are as the spectra of the high frequency part of the second inverse quantization unit 224.
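The loop of steps S71 to S74 can be written out as the following sketch. The array names inv_spec1 and inv_spec2 and the counters i and j come from the description of FIG. 20; the function name `copy_forward` and the `count` parameter, which defaults to 512, are assumptions added so that the example can be run on short inputs.

```python
def copy_forward(inv_spec1, count=512):
    """Copy `count` low-frequency spectrum values in the same order."""
    inv_spec2 = [0.0] * count
    i = j = 0                        # S71: both counters start at 0
    while i < count:                 # S72: continue while i is below the limit
        inv_spec2[j] = inv_spec1[i]  # S73: i-th input becomes j-th output
        i += 1                       # S74: advance both counters
        j += 1
    return inv_spec2
```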
[0142]
FIG. 21 is a flowchart illustrating a procedure in which the 512 spectra of the low frequency part are copied to the high frequency part in the reverse direction on the frequency axis by the second inverse quantization unit 224 illustrated in FIG. As in FIG. 20, inv_spec1[i] denotes the value of the i-th spectrum in the output data of the first inverse quantization unit 222, and inv_spec2[j] denotes the value of the j-th spectrum in the input data of the second inverse quantization unit 224.
[0143]
First, since the 0th to 511th spectra are input in the reverse direction, the second inverse quantization unit 224 sets the initial value of the counter i, which counts the spectra, to “0” and the initial value of j to “511” (S81). Next, the second inverse quantization unit 224 checks whether the value of the counter i is less than “512” (S82), and if it is, inputs the value of the i-th (here, 0th) spectrum of the first inverse quantization unit 222 as the value of the j-th (here, 511th) spectrum of the second inverse quantization unit 224 (S83). Thereafter, the second inverse quantization unit 224 increments the value of the counter i by “1” and decrements the value of j by “1” (S84), and again checks whether the value of the counter i is less than “512” (S82).
[0144]
The second inverse quantization unit 224 repeats the above process while the value of the counter i is less than “512”, and ends the process when the value of the counter i reaches “512” or more. As a result, all of the 0th to 511th spectra of the low frequency part, which are the inverse quantization result of the first inverse quantization unit 222, are copied in the reverse direction as the 511th to 0th spectra of the high frequency part of the second inverse quantization unit 224.
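The reverse-direction loop of steps S81 to S84 differs from the forward copy only in how j is initialized and updated, as the following sketch shows. As before, inv_spec1, inv_spec2, i, and j follow the description of FIG. 21, while the function name `copy_reverse` and the `count` parameter are assumptions of this example.

```python
def copy_reverse(inv_spec1, count=512):
    """Copy `count` low-frequency spectrum values in reversed order."""
    inv_spec2 = [0.0] * count
    i, j = 0, count - 1              # S81: i starts at 0, j at the last index
    while i < count:                 # S82: continue while i is below the limit
        inv_spec2[j] = inv_spec1[i]  # S83: i-th input becomes j-th output
        i += 1                       # S84: i counts up while j counts down
        j -= 1
    return inv_spec2
```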
[0145]
Here, the second inverse quantization unit 224 copies all the spectrum data of the low frequency part to the high frequency part, but it may copy only part of it. Further, FIGS. 20 and 21 are given as examples of procedures for copying the entire low frequency part to the high frequency part at once; however, part of the data may instead be copied as shown in FIG. Furthermore, part or all of the data may be copied with the positive and negative signs inverted.
[0146]
These copying procedures may be determined in advance, may be changed according to the data in the low frequency region, or may be transmitted as auxiliary information.
Here, the spectrum data on the low frequency side is copied as the spectrum data on the high frequency side, but the present invention is not limited to this, and the spectrum data on the high frequency side may be generated from the second encoded information alone.
[0147]
In the present embodiment, the 512 samples on the low frequency side of the entire spectrum data are encoded as the first encoded signal and the remainder is encoded as the second encoded signal, but the allocation is not limited to this.
In the present embodiment, the case where the second inverse quantization unit 224 generates the high frequency part mainly by copying the spectrum data obtained from the first inverse quantization unit 222 has been described as the noise generation method. However, the present invention is not limited to this. Spectrum data having a constant value within each high-frequency scale factor band, white noise, pink noise, and the like may be generated independently by the second inverse quantization unit 224, or may be generated according to the auxiliary information.
[0148]
In the present embodiment, one piece of auxiliary information is encoded for each scale factor band as the second encoded signal. However, one piece of auxiliary information may be encoded for every two or more scale factor bands, or two or more pieces of auxiliary information may be encoded for one scale factor band.
In addition, the auxiliary information in the present embodiment may be encoded for each channel, or one piece of auxiliary information may be encoded for two or more channels.
[0149]
In the present embodiment, the encoding apparatus 100 has two quantization units and two encoding units, but the present invention is not limited to this, and three or more quantization units and encoding units may be provided.
In the present embodiment, the decoding apparatus 200 has two decoding units and two inverse quantization units, but the present invention is not limited to this, and three or more decoding units and inverse quantization units may be provided.
[0150]
In the present embodiment, the case where the conversion unit 120 classifies the converted spectrum data into scale factor bands whose division and number are uniquely defined has been described. However, the encoding device of the present invention is not limited to this, and the conversion unit may classify the converted spectrum data into scale factor bands conforming to the MPEG-2 AAC standard. By classifying into scale factor bands conforming to the standard in this way, the conventional decoding apparatus 400 can also decode the bitstream encoded by the encoding apparatus 100 of the present invention without difficulty and obtain the same digital audio output data as before.
[0151]
The above processing can be realized not only by hardware but also by software, or by a configuration in which one part is realized by hardware and the rest by software.
In the present embodiment, the sampling frequency is 44.1 kHz and one frame consists of 1024 samples of digital audio data. However, the encoding device and decoding device of the present invention are not limited to this, and the sampling frequency may be any value.
[0152]
The encoding device of the present invention is an encoding device that encodes an input acoustic signal, comprising: first encoding means for encoding, from spectral data divided into a plurality of groups obtained by converting the input acoustic signal for a predetermined time, low-frequency data represented by four types of information, namely a normalization coefficient for normalizing the spectral data in each group, a quantized value obtained by quantizing each spectral data in each group using the normalization coefficient, a sign representing whether each spectral data is positive or negative, and the position of each spectral data on the frequency axis; auxiliary information generating means for generating auxiliary information that includes, as the characteristics of the high-frequency spectral data, information for identifying low-frequency spectral data that approximates the spectral data in each group of the high-frequency part and information for shaping the identified low-frequency spectral data, the information for shaping being expressed by one or more types of information among the four types of information; second encoding means for encoding the generated auxiliary information; and output means for outputting the data encoded by the first encoding means and the data encoded by the second encoding means. In the above encoding device of the present invention, the auxiliary information generating means generates auxiliary information that expresses, with less information than the low-frequency part, the characteristics of the high-frequency part of the spectral data divided into a plurality of groups obtained by converting the input acoustic signal for a predetermined time, and the second encoding means encodes the generated auxiliary information.
[0153]
Therefore, according to the encoding device of the present invention, the high-frequency spectral data is not directly quantized and encoded; instead, auxiliary information that expresses the characteristics of the high-frequency part with fewer parameters than the low-frequency part is encoded. This has the effect that the spectrum in the high-frequency part can be encoded with a very small amount of data compared with the low-frequency part. Also, in conventional MPEG-2 AAC, the acoustic signal of the entire band was encoded in the same manner in the low and high frequencies, so it was difficult to transmit the high-frequency band at a low transfer rate. According to the encoding device of the present invention, however, the high-frequency band can be encoded without significantly increasing the amount of information after encoding. Therefore, a decoding device that decodes this information can decode a high-quality acoustic signal that is richer in the high-frequency region than a conventional decoding device can.
[0154]
Further, in the encoding device according to the present invention, the auxiliary information generating means may generate, as the information for shaping, a normalization coefficient calculated so that, for the spectral data divided into a plurality of groups, the value obtained when the peak spectral data in each group of the high-frequency part is quantized becomes a constant value.
In addition, the auxiliary information generating means may quantize, for the spectral data divided into a plurality of groups, the peak spectral data in each group using a normalization coefficient common to the groups, and generate the quantized value as the information for shaping.
[0155]
Therefore, according to the encoding device of the present invention, one normalization coefficient or one quantized value of the peak spectral data is generated as auxiliary information for each group (scale factor band) of the high-frequency part. Even if a certain number of bits, for example 8 bits, is assigned to represent one normalization coefficient or quantized value, the amount of auxiliary information is small. The approximate maximum amplitude of the spectral data in each high-frequency group can therefore be represented with a small amount of data. As a result, even over a transmission path with a low transfer rate, the encoding device of the present invention can transmit information for generating a high-frequency acoustic signal that has the characteristics of the original sound, with only a slight increase in transmission amount compared with the conventional method, so that a decoding device that decodes the data can restore an acoustic signal that is more faithful to the original sound.
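The idea of choosing a normalization coefficient so that the band peak quantizes to a fixed value can be sketched as follows. This is only an illustration under stated assumptions: the linear quantizer with step size 2^(sf/4), the target value 3, and the names `quantize`, `dequantize`, and `scale_factor_for_peak` are hypothetical and are not the quantization rule of the embodiment or of the MPEG-2 AAC standard. The search range 0 to 255 reflects the 8-bit example mentioned above.

```python
def quantize(value, sf):
    """Hypothetical linear quantizer with step size 2 ** (sf / 4)."""
    return int(abs(value) / 2 ** (sf / 4) + 0.5)


def dequantize(q, sf):
    """Inverse of the hypothetical quantizer (magnitude only)."""
    return q * 2 ** (sf / 4)


def scale_factor_for_peak(band, target=3, max_sf=255):
    """Smallest scale factor (8-bit range) at which the band peak
    quantizes to at most the target value."""
    peak = max(abs(v) for v in band)
    for sf in range(max_sf + 1):
        if quantize(peak, sf) <= target:
            return sf
    return max_sf


band = [10.0, -200.0, 35.0, 4.0]   # illustrative high-band spectral data
sf = scale_factor_for_peak(band)   # the only value sent for this band
approx_peak = dequantize(3, sf)    # decoder side: known target times step size
```

Because the decoder in this sketch already knows the target value, the scale factor alone lets it recover an approximate peak amplitude for the band.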
[0156]
Further, in the encoding device of the present invention, the auxiliary information generating means may generate, as the information for shaping, the frequency position of the peak spectral data in each group belonging to the high-frequency part, for the spectral data divided into a plurality of groups.
Also, the spectral data may be MDCT coefficients, and the auxiliary information generating means may generate, as the information for shaping, a sign indicating whether the spectral data at a predetermined frequency position in the high-frequency part is positive or negative, for the spectral data divided into a plurality of groups.
[0157]
Therefore, according to the encoding device of the present invention, the rough spectral shape in each high-frequency group (scale factor band) can be represented with a small amount of data by the frequency position of the peak spectral data or by the sign of the spectral data at a predetermined frequency position in the high-frequency part. This has the effect that the copied spectral data can be shaped so that it approximates the high-frequency spectrum more accurately.
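A minimal sketch of position and sign information follows. The encoder-side helpers `peak_position` and `sign_bit` follow the description directly. The decoder-side helper `shape_copied_band`, which rotates a copied band so that its peak lands on the transmitted position and flips it when the sign disagrees, is purely an assumption made for illustration; the description does not specify how the shaping is carried out.

```python
def peak_position(band):
    """Index, within the band, of the line with the largest magnitude."""
    return max(range(len(band)), key=lambda i: abs(band[i]))


def sign_bit(band, offset):
    """1 if the line at the given offset is negative, else 0."""
    return 1 if band[offset] < 0 else 0


def shape_copied_band(copied, position, negative):
    """Hypothetical decoder-side shaping: rotate the copied band so its peak
    lands on the transmitted position, then flip it if the sign disagrees."""
    shift = position - peak_position(copied)
    shaped = [copied[(i - shift) % len(copied)] for i in range(len(copied))]
    if (shaped[position] < 0) != bool(negative):
        shaped = [-v for v in shaped]
    return shaped


high = [0.1, 0.3, -0.9, 0.2]            # illustrative high-band data
pos = peak_position(high)               # position information
neg = sign_bit(high, pos)               # sign information at that position
shaped = shape_copied_band([0.2, 0.8, -0.1, 0.0], pos, neg)
```

Both values are small integers per band, which is consistent with the point above that the rough spectral shape can be conveyed with little data.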
[0158]
Further, in the encoding device of the present invention, the auxiliary information generating means may generate, for the spectral data divided into a plurality of groups, information identifying the low-frequency spectrum that most closely approximates the spectrum in each group of the high-frequency part, as the information identifying the low-frequency spectral data.
[0159]
Therefore, according to the encoding device of the present invention, when a spectrum whose shape closely resembles the high-frequency spectrum is found in the low-frequency part, it suffices to specify that low-frequency spectrum and copy it to the high-frequency part. This has the effect that the high-frequency spectrum can be expressed more faithfully with a very small amount of data.
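The selection of a low-frequency group to copy can be sketched as a nearest-match search. The similarity measure used here, the summed absolute difference of per-line energies over the same width, is one possible reading of an energy-difference criterion and is an assumption, as are the name `best_matching_low_band` and the sample data.

```python
def best_matching_low_band(low_bands, high_band):
    """Return the index of the low band whose per-line energy is closest
    to that of the high band (hypothetical similarity measure)."""
    def distance(low):
        # zip() compares the two bands over their common width
        return sum(abs(l * l - h * h) for l, h in zip(low, high_band))
    return min(range(len(low_bands)), key=lambda i: distance(low_bands[i]))


low_bands = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
]
high_band = [0.0, 1.8, 0.1, 0.0]

copy_index = best_matching_low_band(low_bands, high_band)  # copy information
reconstructed = list(low_bands[copy_index])                # decoder-side copy
```

Only `copy_index` would need to be transmitted for the band in this sketch, since the decoder already holds the decoded low-frequency spectrum.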
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of an encoding device and a decoding device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of an encoding device and a decoding device, which are other configuration examples of the present embodiment.
FIG. 3 is a diagram showing a change in state of an acoustic signal processed in the encoding device shown in FIG. 1;
FIG. 4 is a diagram illustrating a position in a bit stream in which auxiliary information is stored by the stream output unit illustrated in FIG. 1.
FIG. 5 is a diagram illustrating another example when the stream output unit illustrated in FIG. 1 stores auxiliary information.
FIG. 6 is a flowchart showing an operation in the scale factor determination process of the first quantization unit shown in FIG. 1.
FIG. 7 is a flowchart showing an operation in another scale factor determination process of the first quantization unit shown in FIG. 1.
FIG. 8 is a spectrum waveform diagram showing a specific example of auxiliary information (scale factor) generated by the second quantization unit shown in FIG. 1;
FIG. 9 is a flowchart showing an operation in auxiliary information (scale factor) calculation processing of the second quantization unit shown in FIG. 1;
FIG. 10 is a spectrum waveform diagram showing a specific example of auxiliary information (quantized value) generated by the second quantization unit shown in FIG. 1.
FIG. 11 is a flowchart illustrating an operation in auxiliary information (quantized value) calculation processing of the second quantization unit illustrated in FIG. 1;
FIG. 12 is a spectrum waveform diagram showing a specific example of auxiliary information (position information) generated by the second quantization unit shown in FIG. 1.
FIG. 13 is a flowchart showing an operation in auxiliary information (position information) calculation processing of the second quantization unit shown in FIG. 1;
FIG. 14 is a spectrum waveform diagram showing a specific example of auxiliary information (sign information) generated by the second quantization unit shown in FIG. 1.
FIG. 15 is a flowchart showing an operation in auxiliary information (sign information) calculation processing of the second quantization unit shown in FIG. 1;
FIG. 16 is a spectrum waveform diagram showing an example of a method for creating auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1.
FIG. 17 is a flowchart showing an operation in auxiliary information (copy information) calculation processing of the second quantization unit shown in FIG. 1;
FIG. 18 is a spectrum waveform diagram showing a second example of a method for creating auxiliary information (copy information) generated by the second quantization unit shown in FIG. 1.
FIG. 19 is a flowchart showing an operation in a second calculation process of auxiliary information (copy information) of the second quantization unit shown in FIG. 1;
FIG. 20 is a flowchart showing a procedure in which the 512 low-frequency spectral data are copied to the high-frequency part in the forward direction of the frequency axis by the second inverse quantization unit shown in FIG. 1.
FIG. 21 is a flowchart showing a procedure in which the 512 low-frequency spectral data are copied to the high-frequency part in the reverse direction of the frequency axis by the second inverse quantization unit shown in FIG. 1.
FIG. 22 is a block diagram showing a configuration of a conventional MPEG-2 AAC encoding apparatus and decoding apparatus.
[Explanation of symbols]
100 Encoding device
110 Acoustic signal input unit
120 Conversion unit
131 First quantization unit
132 First encoding unit
133 Second quantization unit
134 Second encoding unit
140 Stream output unit
200 Decoding device
210 Stream input unit
221 First decoding unit
222 First inverse quantization unit
223 Second decoding unit
224 Second inverse quantization unit
225 Inverse quantization data composition unit
230 Inverse conversion unit
240 Acoustic signal output unit
152 Inverse quantization unit

Claims

An encoding device that encodes an input acoustic signal, comprising:
conversion means for converting an input acoustic signal for a predetermined time into spectral data on a frequency axis and dividing the spectral data into a plurality of groups;
first encoding means for encoding, for the low-frequency part of the spectral data divided into the plurality of groups, low-frequency data comprising four types of information: a normalization coefficient for normalizing the spectral data in each group in the low-frequency part, a quantized value obtained by quantizing each spectral data in each group using the normalization coefficient, information indicating the sign of each spectral data, and the position of each spectral data on the frequency axis;
auxiliary information generating means for generating, for the high-frequency part of the spectral data divided into the plurality of groups, auxiliary information that includes, as the characteristics of the high-frequency data, information for specifying a low-frequency group that approximates the spectral data in each group included in the high-frequency part, and information for shaping the spectral data of the specified low-frequency group, the information for shaping being expressed by one or more and three or fewer types of information among the four types of information;
second encoding means for encoding the generated auxiliary information; and
output means for outputting the data encoded by the first encoding means and the data encoded by the second encoding means,
wherein the auxiliary information generating means generates, as the information for shaping, the normalization coefficient calculated so that, when the peak spectral data in each group included in the high-frequency part of the spectral data divided into the plurality of groups is quantized, the quantized value becomes a constant value.

An encoding device that encodes an input acoustic signal,
  A conversion means for converting an input acoustic signal for a predetermined time into spectrum data on a frequency axis, and dividing the spectrum data into a plurality of groups;
  A first encoding means for encoding a low frequency part of the spectrum data divided into the plurality of groups;
  Auxiliary information generating means for generating auxiliary information representing the characteristics of the high-frequency part of the spectral data obtained by converting the input acoustic signal for a certain period of time;
  Second encoding means for encoding the generated auxiliary information;
  Output means for outputting the data encoded by the first encoding means and the data encoded by the second encoding means;
  wherein the auxiliary information generating means generates, as the auxiliary information, for the spectral data divided into the plurality of groups, information specifying the low-frequency spectrum that most closely approximates the spectrum in each group of the high-frequency part.

The encoding device according to claim 1 or 2, wherein the auxiliary information generating means quantizes, for the spectral data divided into a plurality of groups, the peak spectral data in each group of the high-frequency part using a normalization coefficient common to the groups, and generates the quantized value as the information for shaping.

The encoding device according to claim 1 or 2, wherein the auxiliary information generating means generates, as the information for shaping, the frequency position of the peak spectral data in each group belonging to the high-frequency part, for the spectral data divided into a plurality of groups.

The encoding device according to claim 1 or 2, wherein the spectral data is an MDCT coefficient, and the auxiliary information generating means generates, as the information for shaping, a sign indicating whether the spectral data at a predetermined frequency position in the high-frequency part is positive or negative, for the spectral data divided into a plurality of groups.

The encoding device according to claim 1, wherein the auxiliary information generating means generates, for the spectral data divided into a plurality of groups, information specifying the low-frequency spectrum that most closely approximates the spectrum in each group of the high-frequency part, as the information for specifying the low-frequency spectral data.

The encoding device according to claim 1 or 2, wherein the auxiliary information generating means generates information specifying the low-frequency spectrum for which the difference is smallest between the distance on the frequency axis from the boundary of a group belonging to the high-frequency part to the spectral peak in that high-frequency group and the distance on the frequency axis from the boundary of a group belonging to the low-frequency part to the spectral peak in that low-frequency group.

The encoding device according to claim 1 or 2, wherein the auxiliary information generating means generates information specifying the low-frequency spectrum for which the energy difference, taken over the same frequency width as the spectrum in the group belonging to the high-frequency part, is smallest.

The encoding device according to claim 8, wherein the information specifying the low-frequency spectrum data is represented by a number specifying the group of the specified low-frequency spectrum.

The encoding device according to claim 1 or 2, wherein the output means further includes:
a stream output unit that converts the data encoded by the first encoding means into an encoded audio stream defined in a predetermined format, stores the data encoded by the second encoding means in an area within the encoded audio stream whose use is not restricted by the encoding protocol, and outputs the stream.

The encoding device according to claim 1 or 2, wherein the output means further includes:
a second stream output unit that converts the data encoded by the first encoding means into an encoded audio stream defined in a predetermined format, stores the data encoded by the second encoding means in a stream different from the encoded audio stream, and outputs both.

A program for an encoding device that encodes an input acoustic signal, the program causing a computer to function as:
  conversion means for converting an input acoustic signal for a predetermined time into spectral data on a frequency axis and dividing the spectral data into a plurality of groups;
  first encoding means for encoding, for the low-frequency part of the spectral data divided into the plurality of groups, low-frequency data comprising four types of information: a normalization coefficient for normalizing the spectral data in each group included in the low-frequency part, a quantized value obtained by quantizing each spectral data in each group using the normalization coefficient, information indicating the sign of each spectral data, and the position of each spectral data on the frequency axis;
  auxiliary information generating means for generating, for the high-frequency part of the spectral data divided into the plurality of groups, auxiliary information that includes, as the characteristics of the high-frequency data, information for specifying a low-frequency group that approximates the spectral data in each group included in the high-frequency part, and information for shaping the spectral data of the specified low-frequency group, the information for shaping being expressed by one or more and three or fewer types of information among the four types of information;
  second encoding means for encoding the generated auxiliary information; and
  output means for outputting the data encoded by the first encoding means and the data encoded by the second encoding means,
  wherein the auxiliary information generating means generates, as the information for shaping, the normalization coefficient calculated so that, when the peak spectral data in each group of the high-frequency part of the spectral data divided into the plurality of groups is quantized, the quantized value becomes a constant value.

The program according to claim 12, causing the computer to function such that
the auxiliary information generating means quantizes, for the spectral data divided into a plurality of groups, the peak value of the spectral data in each group of the high-frequency part using a normalization coefficient common to the groups, and generates the quantized value as the information for shaping.

The program according to claim 12, causing the computer to function such that
the auxiliary information generating means generates, as the information for shaping, the frequency position of the peak spectral data in each group belonging to the high-frequency part, for the spectral data divided into a plurality of groups.

The program according to claim 12, causing the computer to function such that
the spectral data is an MDCT coefficient, and the auxiliary information generating means generates, as the information for shaping, a sign indicating whether the spectral data at a predetermined frequency position in the high-frequency part is positive or negative, for the spectral data divided into a plurality of groups.

The program according to claim 12, causing the computer to function such that
the auxiliary information generating means generates, for the spectral data divided into a plurality of groups, information specifying the low-frequency spectrum that most closely approximates the spectrum in each group of the high-frequency part, as the information for specifying the low-frequency spectral data.