JP2004198559A - Encoding method and decoding method for time-series signal

Info

Publication number: JP2004198559A
Application number: JP2002364526A
Authority: JP
Inventors: Toshio Motegi; 敏雄茂出木
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2002-12-17
Filing date: 2002-12-17
Publication date: 2004-07-15

Abstract

PROBLEM TO BE SOLVED: To provide an encoding method and a decoding method, based on lossless (reversible) compression, that sufficiently compress a time-series signal in which patterns of similar amplitude appear repeatedly.

SOLUTION: When the time-series signal, which consists of time-series sample sequences, comprises a plurality of channels, differences between channels are calculated, and the results for sections with small differences are separated as inter-channel difference data (S1). A plurality of frames of a prescribed length are then set in the sample sequence of each channel, differences between the frames are calculated, and frames with small differences are separated as inter-frame difference data (S2). Parts of each sample sequence with small signal variation are separated as signal flat part data (S3). After these data have been separated, the higher-order and lower-order bits of each remaining sample are separated (S4), a prediction error is calculated for the higher-order bits (S5), and the higher-order bit data are converted to variable-bit-length data (S6).

COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【産業上の利用分野】
本発明は、音楽制作、音響データの素材保管、ロケ素材の中継など音楽制作分野、特にＣＤよりも品質の高い高精細オーディオ制作を行う分野、ＣＤ、ＤＶＤ等のデジタル記録媒体を用いたオーディオ記録再生分野、遠隔医療における生体信号の伝送等、データの改変が嫌われる分野等において好適なデータの可逆圧縮技術に関する。
【０００２】
【従来の技術】
従来より、音響信号の圧縮には様々な手法が用いられている。音響信号を圧縮して符号化する手法として、ＭＰ３（ＭＰＥＧ−１／Ｌａｙｅｒ３）、ＡＡＣ（ＭＰＥＧ−２／Ｌａｙｅｒ３）などが実用化されている。このような圧縮符号化方式により、音響信号を小さいデータとして扱うことが可能となり、データの記録・伝送の効率化に貢献している。
【０００３】
【発明が解決しようとする課題】
上述のようなＭＰ３、ＡＡＣ等はいずれもロッシー符号化方式といわれるものであり、効率的な圧縮が可能であるが、復号化にあたって、少なからず品質の劣化を伴い、原信号を完全に再現することはできない。そのため、音楽制作、素材保管、ロケ素材の中継など音楽制作分野では、これらの符号化方式を適用できず、非効率ではあるが、非圧縮で保存・伝送する方式がとられている。特に最近は高精細オーディオを扱うプロダクションが増え、素材容量が膨大になり、ワークディスクを管理する上で問題になってきている。
【０００４】
上記のような問題を解決するため、本出願人は、時系列信号のサンプル列に対してチャンネル間、フレーム間の差分演算を行って、各サンプルの値を小さくした後、予測符号化を利用してデータの圧縮を行う手法について提案している。（特許文献１参照）。
【０００５】
【特許文献１】
特願２００２−２３１１５０号
【０００６】
しかしながら、上記出願で提案した手法では、時系列信号がステレオ音響信号のように複数のチャンネルで構成されている場合、時系列の全区間に渡って差分処理を行うと、信号振幅が増大する箇所が発生し、その後に行う予測誤差を用いた符号化により、かえってデータ量が増えてしまうことがある。また、同一のチャンネルの異なる時刻におけるサンプルのまとまりであるフレーム間の相関がある場合には、フレーム同士の差分演算を行うことにより信号振幅が減少するが、予測不可能な雑音成分の割合が増大し、その後に行う予測誤差を用いた符号化により、かえってデータ量が増えてしまうことがある。これは、特に類似した振幅のパターン（いわゆる信号波形）が繰り返し生じるような信号に対して行う場合に生じ易い。また、信号レベルが同一の部分について信号平坦部として分離し、データ量を削減するようにしているが、信号レベルが同一の部分が連続する音響素材は、あまり多くないため、圧縮効果が少ない。
【０００７】
そこで、これらの問題を解決するため、本発明は、類似した振幅のパターンが繰り返し発生する時系列信号に対して十分な圧縮を行うことが可能であると共に、復号時には、元の時系列信号を完全に復号することが可能な可逆圧縮方式の時系列信号の符号化方法および復号方法を提供することを課題とする。
【０００８】
【課題を解決するための手段】
上記課題を解決するため、本発明では、時系列のサンプル列で構成される時系列信号に対して、前記全てのサンプル列を再現できるように情報量を圧縮する符号化方法として、前記サンプル列の中から所定の個数のサンプル列で構成されるフレームを複数個抽出し、抽出したフレーム間で相関演算を施し、フレーム間の相関が高い場合に一方のフレームのサンプル列を、各サンプルがより少ないビット数で表現されたフレーム間差分データとして、前記サンプル列から分離するフレーム間演算段階、前記分離されたフレーム間差分データ、およびフレーム間差分データの分離により残ったサンプル列を所定のフォーマットで記録する段階を実行するようにしたことを特徴とする。
【０００９】
本発明によれば、時系列のサンプル列に対して、サンプル列中の相関が高い箇所を差分データとして分離して、分離した部分を少ないビット数で表現することにより、類似した振幅のパターンが繰り返し発生する時系列信号に対して十分な圧縮を行うことが可能となる。
【００１０】
【発明の実施の形態】
以下、本発明の実施形態について図面を参照して詳細に説明する。
（データの構造）
まず、本発明に係る時系列信号の符号化方法において符号化対象とする時系列信号について説明する。本実施形態では、時系列信号として音響信号を適用した場合を例にとって説明していく。図１（ａ）は、本発明において扱う音響信号を模式化して示した図である。図１において、左右方向は時系列方向であり、右側に行く程、時間が進むことになる。すなわち左端が開始時刻であり、右端が終端時刻となっている。図１に示した音響信号は、Ｃｈ１とＣｈ２の２チャンネルのデータでＣｈ１にＬ（左）信号、Ｃｈ２にＲ（右）信号が記録されたものとなっている。
【００１１】
図１（ａ）に示したようなデジタル音響信号を得るには、まず、時系列信号であるアナログの音響信号をデジタル化する。これは、従来の一般的なＰＣＭの手法を用い、所定のサンプリング周波数でこのアナログ音響信号をサンプリングし、振幅を所定の量子化ビット数を用いてデジタルデータに変換する処理を行えば良い。本実施形態では、サンプリング周波数４４．１ＫＨｚ、量子化ビット数１６ビットで正負の符号を記録した場合を想定して以降説明する。サンプリング周波数４４．１ＫＨｚでサンプリングすると、１秒あたり４４１００個のサンプルにより構成されるサンプル列ができることになる。またここでは、音響信号が複数のチャンネルからなるので、各チャンネルごとにデジタル化が行われる。
【００１２】
（符号化方法）
続いて、本発明に係る時系列信号の符号化方法の概要について説明する。図２は、本発明に係る時系列信号の符号化方法の概要を示すフローチャートである。まず、図１に示したデジタル音響信号であるサンプル列に対して、チャンネル間の差分演算を行う（ステップＳ１）。具体的には、まず、チャンネルＣｈ１の全区間とチャンネルＣｈ２の全区間の、同一時刻におけるサンプルデータの差分演算を行う。その結果、差分が所定の閾値以下となるチャンネルＣｈ２上の区間の差分データを、チャンネルＣｈ２のサンプル列から分離し、チャンネル間差分データとして別途記録する。本実施形態では、所定の閾値として、下位４ビット以内を設定している。下位４ビット以内とは、正負の符号付で表現した場合、１０進数で−８〜＋７の値となる。差分演算の結果、サンプル列の値が−８〜＋７の値をとる区間については、チャンネル間差分データとして記録されることになる。なお、チャンネル間差分データ内において同一の値が複数サンプル連続する場合は、連続する部分の先頭のサンプル番号と、サンプル値、および連続するサンプル数を記録することによりデータ量をさらに削減する。この場合、連続するサンプル数に代えて最後尾のサンプル番号を記録するようにしても良い。
【００１３】
チャンネル間差分データが分離されたチャンネルＣｈ２は、分離された区間以降のサンプル列を前に詰めることにより、全体のサンプル数が減ることになる。例えば、差分値が閾値以下であるサンプルが図１（ｂ）に示す区間連続した場合、この区間が分離されることになり、チャンネルＣｈ２のサンプル列とチャンネル間差分データは、それぞれ図１（ｃ）に示すようになる。チャンネルＣｈ１のサンプル数（図中左右の長さで表現）は変化がないが、チャンネルＣｈ２のサンプル数は、分離された差分データ分だけ減少することになる。なお、図１の例では、チャンネル間差分データの分離された区間が１箇所だけであるが、現実には、多数分離されることになる。チャンネル間においてデータの圧縮を行う手法としては、全区間に渡って差分演算を行って、その差分値を一方のチャンネルのサンプル値として記録する手法もある。しかし、このような手法の場合、元の音響信号がヴォーカル音等左右均等に録音されたものであれば、圧縮効率が高いが、楽器音のように、どちらか一方のチャンネルを中心に録音されたものについては、信号振幅が大きくなり、後に行う予測符号化を用いた結果、かえってデータ量が大きくなってしまう可能性がある。そこで、本実施形態では、差分が小さい区間のみ、差分データとして分離する手法をとっている。
【００１４】
続いて、チャンネル間の差分演算処理が行われた各チャンネルのサンプル列に対して、所定の区間長をもつフレームを設定して、設定されたフレーム間の演算を行う（ステップＳ２）。まず、各フレームを構成するサンプル列の類似度を求め、類似しているフレームを選別する。本実施形態では、フレーム長をサンプル列の開始時刻から終了時刻までの全区間に渡って固定長としている。具体的には、１フレームを２５６サンプルとしている。チャンネルデータ（チャンネルを構成するサンプル列）の先頭から２５６サンプルずつを１フレームとして抽出し、各フレームの類似度を求めていくことになる。フレーム同士の類似度とは、両信号の相関を求めることになるので、相関計算を行うための種々の手法を用いることができるが、本実施形態では、各フレームにおける２５６サンプルのうち、他フレームにおける対応するサンプルとの差分値の絶対値の最大値を抽出し、最大値が所定値以内に収まるフレーム対を１つの類似フレームとして選別することになる。この処理はサンプル列の全区間に渡って行われる。ここで、フレーム間演算処理によるサンプル列の変化の様子を図３（ａ）〜（ｃ）に示す。なお、図３においては、図１と異なり１チャンネルしか示していないが、他のチャンネルについても同様に処理される。まず、図３（ａ）に示したように、固定長にフレーム化されたサンプル列は、フレームＡ１、Ａ２、Ａ３…に区分される。
【００１５】
続いて、各フレームについて、差分を算出する。本実施形態では、２５６個の差分値が各サンプル時刻に対して得られることになる。得られた差分値の絶対値の最大値が、所定値以内であれば、そのフレームの差分処理後のサンプル列を差分データとして、各チャンネルのサンプル列から分離して記録する。例えば、図３（ｂ）に示されるように、フレームＡ１とフレームＡ２に対して処理を行った場合、先行するフレームＡ１はそのままであるが、フレームＡ１とフレームＡ２の差分値の絶対値の最大値が所定値内であるため、図３（ｃ）に示されるように、フレームＡ２はそのチャンネルのサンプル列から分離され、他のフレームが前に詰められることになる。このように、１フレームが分離されると、サンプル列からは２５６サンプル削減されることになる。分離されたフレームＡ２は、そのままの値で記録されるのではなく、フレームＡ１とフレームＡ２の差分データが前記最大値を表現できる最小ビット数で記録される。フレームＡ２の情報は削除されるが、復号時にフレームＡ２の情報を復元するために、フレームＡ１とフレームＡ２の各サンプルの差分値（図３中「Ａ２−Ａ１」と表現する）がフレーム間差分データとして分離される。
【００１６】
一方、フレームＡ１とフレームＡ２の差分値の絶対値の最大値が所定値内に納まらない場合は、フレームＡ２の元のサンプル列をそのまま残すことになる。同様に、フレームＡ１とフレームＡ３、フレームＡ２とフレームＡ３、フレームＡ１とフレームＡ４、フレームＡ２とフレームＡ４、フレームＡ３とフレームＡ４、という具合に、後続するフレーム間に対しても同様の処理が行われる。このとき、フレームＡ１と類似するフレームとして削除されたフレームＡ２も後続するフレーム間差分処理において、元のサンプル列が参照される。また、差分演算処理の負荷を軽減するため、参照するフレーム間の距離は１００フレーム以内などの制限を加える。すなわち、フレームＡ１と差分演算処理を行うフレームはフレームＡ１００までとし、フレームＡ１０１以降は類似フレーム判断の対象から外す。
【００１７】
上記、フレーム間差分データは、差分処理を行った２つのフレーム番号も記録することになる。ステップＳ２において分離されたフレーム間差分データ内において同一の値が複数サンプル連続する場合は、上記チャンネル間差分データの場合と同様に、連続する部分の先頭のサンプル番号と、サンプル値、および連続するサンプル数を記録することによりデータ量をさらに削減する。この場合、連続するサンプル数に代えて最後尾のサンプル番号を記録するようにしても良い。フレームが差分データとして分離されたサンプル列は、分離されたフレーム以降のサンプル列を前に詰めることにより、全体のサンプル数が減ることになる。
【００１８】
次に、信号平坦部の処理を行う（ステップＳ３）。信号平坦部とは、本来同一の信号レベルが連続する部分のことをいう。特に信号レベルが「０」の無音部、および信号レベルの絶対値が最大の飽和部に現れることが多い。無音部は実際に無音であるか、音が非常に小さく記録されなかった場合に生じるが、飽和部は、信号の録音およびＡ／Ｄ変換の過程において生じる。無音部、飽和部またはそれ以外の同一信号レベルが連続する場合のいずれであっても、信号平坦部は、同一の信号レベルが所定の時間（所定のサンプル数）連続して記録される。このため、この部分は圧縮し易いデータになっている。本実施形態では、信号平坦部を、信号レベルが同一の値が連続する部分だけでなく、信号レベルの変化が少ない部分も含むものとする。すなわち、ステップＳ３においては、前のサンプル値との差分が所定値以下であるサンプルが連続する部分を、信号平坦部として抽出し、元のサンプル列から分離することになる。分離された信号平坦部内において同一の値が複数サンプル連続する部分（本来の信号平坦部）については、上記チャンネル間差分データ、フレーム差分データの場合と同様に、先頭のサンプル番号と、サンプル値、および連続するサンプル数を記録することによりデータ量をさらに削減する。この場合、連続するサンプル数に代えて最後尾のサンプル番号を記録するようにしても良い。信号平坦部が分離されたサンプル列は、分離された信号平坦部以降のサンプル列を前に詰めることにより、全体のサンプル数が減ることになる。例えば、図４（ａ）に網掛けで示した箇所が信号平坦部であると判断された場合、図４（ｂ）に示すようにサンプル列からは、信号平坦部に相当する部分のサンプル列が削除されて前に詰められることになる。サンプル列から削除された信号平坦部に関する情報は、信号平坦部データとして分離されることになる。
【００１９】
上記のようにして、チャンネル間の差分算出、および各チャンネルのサンプル列内の各フレームの差分算出、信号平坦部の分離によりサンプル列の削減が行われたら、残ったサンプル列を構成する各サンプルデータの上位ビットと下位ビットの分離を行う（ステップＳ４）。例えば、音響信号をＰＣＭによりデジタル化する際に、量子化ビット数１６でサンプリングした場合、各サンプルは１６ビットで表現されている。この場合、本実施形態では、上位ビット１２ビットと、下位ビット４ビットに分離する。この分離は、基本的に、Ａ／Ｄ変換機等、音響信号をデジタル化する際に用いる回路の熱雑音を分離するために行う。そのため、熱雑音であると考えられる下位ビットを分離するのである。下位ビットとして、どの程度分離するかは、音源や利用した回路の特性によっても変化するが、通常量子化ビット数の１／４程度とすることが望ましい。したがって、ここでは、１６ビットの１／４にあたる４ビットを下位ビットとして分離しているのである。
【００２０】
ここで、上位ビットと下位ビットのデータ分離の様子を図５に模式的に示す。図５において、Ｈは上位ビットもしくは上位サンプルデータを示し、Ｌは下位ビットもしくは下位サンプルデータを示す。図５（ａ）は分離前のサンプルデータである。ステップＳ４における処理により、サンプルデータは、図５（ｂ）に示す上位サンプルデータと図５（ｃ）に示す下位サンプルデータに分離されることになる。なお、上位ビットに含まれる符号ビットは、そのまま上位サンプルデータに含まれて分離される。このようにして分離されたサンプルデータは、以降別々に処理されることになる。
【００２１】
（上位サンプルの符号化）
上位サンプルデータに対しては、まず、直前の２つのサンプルを基に各サンプルの予測値と予測誤差の算出を行う（ステップＳ５）。ここで、予測誤差の算出手法について、図６を用いて説明する。例えば、サンプル値が図６（ａ）に示すような状態である場合を考えてみる。図６（ａ）において、横軸は時刻（サンプル番号）、縦軸は上位サンプル値ｘ（ｔ）である。また、各時刻における線分は、各時刻における上位サンプルｘ（ｔ）の値を示している。このような状態で、時刻ｔのサンプルにおける予測誤差ｅ（ｔ）を算出する場合、直前の時刻ｔ−１における上位サンプル値ｘ（ｔ−１）および２つ前の時刻ｔ−２における上位サンプル値ｘ（ｔ−２）を利用して以下の〔数式１〕により算出する。
【００２２】
〔数式１〕
ｅ（ｔ）＝ｘ（ｔ）−２×ｘ（ｔ−１）＋ｘ（ｔ−２）−ｅ（ｔ−１）／２
【００２３】
上記〔数式１〕において、「２×ｘ（ｔ−１）−ｘ（ｔ−２）」は過去の２つのサンプルに基づく線形予測成分である。すなわち、算出された線形予測成分、および、直前のサンプルにおいて算出された予測誤差「ｅ（ｔ−１）／２」（誤差フィードバック成分）を用いて時刻ｔにおける予測誤差ｅ（ｔ）を算出することになる。全サンプルについて、予測誤差の算出を行い、サンプル値の代わりに予測誤差が記録される。
【００２４】
これを図６（ａ）に示した上位サンプルを基に説明する。まず、誤差フィードバック成分を加えない状態で各予測誤差ｅｏ（ｔ）を算出する。図６（ｂ）に示すように、時刻ｔの予測誤差ｅｏ（ｔ）を算出する場合、直前の時刻ｔ−１における上位サンプル値ｘ（ｔ−１）および２つ前の時刻ｔ−２における上位サンプル値ｘ（ｔ−２）を結ぶ予測線が時刻ｔでとる値と、時刻ｔにおける上位サンプル値ｘ（ｔ）の差分（図中太点線で示す）に基づいて予測誤差ｅｏ（ｔ）が算出される。時刻ｔ＋１以降も同様に行って予測誤差ｅｏ（ｔ＋１）を算出する。算出された予測誤差ｅｏ（ｔ）は、図６（ｃ）に示すようになる。図６（ａ）と図６（ｃ）を比較するとわかるように値が変動する範囲が大きく狭まり、データ圧縮に都合が良くなる。
【００２５】
続いて、〔数式１〕に基づいて予測誤差ｅｏ（ｔ）に対して直前の時刻ｔ−１における補正が加わった予測誤差ｅ（ｔ−１）の５０％を減算させて、誤差フィードバック処理を加えた結果が図６（ｄ）である。図６（ｃ）と比べると、時刻ｔ＋１およびｔ＋２における予測誤差の低減が顕著である。逆に時刻ｔ＋３およびｔ＋４では予測誤差が増大しているが、平均的には予測誤差が低減し、図６（ａ）と比較すると値が変動する範囲が更に狭まり、データ圧縮効果が向上する。
【００２６】
上記ステップＳ５の処理により、各上位サンプルの値が元の値から予測誤差値に置き換えられることになるが、各ビット構成は固定長１２ビットのままである。次に、この固定長の上位サンプル列を可変長のビット構成に変換する（ステップＳ６）。そのために、まず、符号反転データの挿入を行う。具体的には、サンプル値が正の値から負の値に変化する部分に符号反転データを挿入し、負の値のサンプル値をその絶対値に置きかえる。符号反転データとしては、適当なビット配列を割り当てておく。符号反転データは後の処理で異なるビット配列に変換されるため、この時点では、他のサンプル列と区別ができるビット配列であれば良い。ただし、他のサンプル列のビット数に合わせて１２ビットで構成されるようにしておく。
【００２７】
次に、予測誤差値で記録された上位サンプルデータをより少ないデータ量で表現するために、ビット構成の変換を行う（ステップＳ７）。まず、ビット構成の変換を行うために利用するルックアップテーブルの作成を行う。具体的には、まず全時刻に渡って、各サンプル値のヒストグラムを算出する。各サンプル値は上記ステップＳ６の処理において、全て絶対値化されているので、正負の区別なくヒストグラムを算出する。その結果、サンプル絶対値の種類が６４０以上となった場合、セパレータビットを２ビット固定値「００」とし、サンプル絶対値の種類が６３９以下となった場合、セパレータビットを１ビット固定値「０」とする。さらに、出現頻度の高いサンプル絶対値から順に、少ないビット数のビットパターンを割り当てていく。この際、割り当てるビットパターンには規則が有り、最上位ビットは必ず「１」とすると共に、セパレータビットが２ビット「００」の場合は「００１」のビットパターンを含むビットパターンは禁止し、セパレータビットが１ビット「０」の場合は「０１」のビットパターンを含むビットパターンは禁止する。セパレータビットが１ビット「０」、２ビット「００」の場合のルックアップテーブルの一例を図７に示す。
【００２８】
上記のようにして作成されたルックアップテーブルを用いて、１２ビット固定長の連続する上位サンプルデータを、可変長のビットパターンに変換していく。可変長になるため、変換後の各データの区切りを区別する必要が生じる。そのため、本実施形態では、各データ間に上述のような１ビットもしくは２ビットのセパレータビットを挿入する。セパレータビットが１ビット「０」の場合、各順位のデータを表現するのにビット配列、およびビット数は、図７（ａ）に示すようになる。図７（ａ）において、順位０位は、最もビット数が少ない１ビット「１」で表現される。図７（ａ）においては、変換前ビット列は省略してあるが、実際には、最も頻繁に現れる符号反転データが「１」で表現されることになる。また、各可変長ビットには、セパレータが必ず付加されるので、順位０位のデータを表現するためには、２ビットが必要となることになる。図７の例では、セパレータビットが１ビット「０」であるため、「０１」のビットパターンは割り当てられないことになる。しかし、順位６位として示す「１０００」のビットパターンは、可変長ビットへの変換時に、直前のビットが「０」（セパレータビット）の場合に、例外的に「１０１」のビットパターンに変更することができる。このとき、直前のセパレータビットとビットパターンで「０１０１」の配列が出現する。このビット配列「０１０１」は、セパレータビットを挟んで順位０位のビット配列「１」が２つ連続した場合と考えることもできる。しかし、順位０位のビット配列「１」は符号反転データが割り当てられており、符号反転データが２つ連続することは有り得ないため、復号するためのシステムは、「１０１」ビット配列のデータであると判断することができる。これにより、順位６位のビットパターンは、セパレータビットを合わせて、５ビットから４ビットに減らすことができる。
【００２９】
また、セパレータビットが２ビット「００」の場合、各順位のデータを表現するのにビット配列、およびビット数は、図７（ｂ）に示すようになる。図７（ｂ）において、順位０位は、最もビット数が少ない１ビット「１」で表現される。上述のように、最も頻繁に現れる符号反転データが「１」で表現されることになる。また、各可変長ビットには、セパレータが必ず付加されるので、順位０位のデータを表現するためには、３ビットが必要となることになる。図７（ｂ）の例では、セパレータビットが１ビット「００」であるため、「００１」のビットパターンは割り当てられないことになる。しかし、順位１４位として示す「１００００」のビットパターンは、可変長ビットへの変換時に、直前のビットが「００」の場合に、例外的に「１００１」のビットパターンに変更することができる。このとき、直前のセパレータビットとビットパターンで「００１００１」の配列が出現する。このビット配列「００１００１」は、セパレータビットを挟んで順位０位のビット配列「１」が２つ連続した場合と考えることもできる。しかし、順位０位のビット配列「１」は符号反転データが割り当てられており、符号反転データが２つ連続することは有り得ないため、復号するためのシステムは、「１００１」ビット配列のデータであると判断することができる。これにより、順位１４位のビットパターンは、セパレータビットを合わせて、７ビットから６ビットに減らすことができる。図８（ａ）（ｂ）に、ステップＳ６によるデータ変換の様子を模式的に示す。図８（ａ）（ｂ）はいずれもサンプル列の上位部分に対応しており、図８（ａ）は固定長の上位サンプルが連続して記録されている様子を示している。図８（ａ）に示したような上位サンプル列は、図７（ａ）（ｂ）に示したルックアップテーブルを用いて図８（ｂ）に示すように変換されることになる。
【００３０】
（下位サンプルの符号化）
一方、下位サンプルデータは、そのまま連続に配置される。具体的には、上記ステップＳ４において分離された下位４ビットのデータが連続に配置されていくことになる。
【００３１】
（符号データの記録）
以上のようにして得られた符号データは、図９に示すようになる。すなわち、上位可変長サンプル列、下位固定長サンプル列、ルックアップテーブル、信号平坦部データ、フレーム間差分データ、チャンネル間差分データとなる。このデータを記録すべき記録媒体に合わせたフォーマットで記録する。
【００３２】
（復号方法）
次に、上記符号化方法により符号化された符号データを復号する方法について説明する。復号は、コンピュータおよびコンピュータに搭載される専用のソフトウェアプログラムにより実行される。復号方法の概要を図１０のフローチャートに示す。
【００３３】
まず、図９に示したような符号データを記録した記録媒体を、復号するための装置（コンピュータ）に読み込む。続いて、読み込んだデータのうち、ルックアップテーブルを参照することにより、上位可変長サンプル列から、固定長の上位固定長サンプル列すなわち線形予測誤差ｅ（ｔ）を復元してゆく（ステップＳ１１）。これにより、図８（ａ）に示したような固定長サンプル列が復元される。次に、上記〔数式１〕の左辺の項と右辺第１項を交換した式に基づいて、１２ビット固定長の上位サンプルデータｘ（ｔ）を順次復元していく（ステップＳ１２）。ステップＳ１２においては、各サンプル列は１２ビット固定長のままであるが、その値が変化することになる。続いて、復元した上位固定長サンプル列と下位固定長サンプル列を統合する（ステップＳ１３）。具体的には、上位固定長サンプル列から１２ビットを抽出し、下位固定長サンプル列から４ビットを抽出して順次統合する処理を行う。これにより、各サンプルが１６ビットのサンプル列が復元される。
【００３４】
次に、このような１６ビット固定長のサンプル列に対して、平坦部データを挿入していく（ステップＳ１４）。平坦部データの挿入は、平坦部データが有している先頭のサンプル番号を元に、サンプル列に挿入していく。これにより、図４（ａ）に示したようなサンプル列が復元される。
【００３５】
さらに、フレーム間差分データを利用して元のフレームデータを復元し、サンプル列に対して挿入していく（ステップＳ１５）。フレーム間差分データも、先頭のサンプル番号、および差分演算を行う対象としたフレームの情報を有しているので、これを利用して元のフレームを復元する。さらに復元したフレームを元のサンプル列の所定の位置に挿入する。例えば、図３の例では、フレーム間差分データ「Ａ２−Ａ１」は、自身がフレームＡ１との差分であるという情報を有しているので、フレームＡ１のサンプル列を利用してフレームＡ２を復元する。続いてフレーム間差分データ「Ａ２−Ａ１」が保有している先頭のサンプル番号を利用してサンプル列に挿入することになる。これにより、図３（ｂ）に示したようなサンプル列が復元される。
【００３６】
続いて、サンプル列に対して、チャンネル間差分データを挿入していく（ステップＳ１６）。チャンネル間差分データは、先頭のサンプル番号、最後尾のサンプル番号、元のサンプル列のチャンネル番号（上記の例ではＣｈ２）、参照したチャンネルのチャンネル番号（上記の例ではＣｈ１）を有しているので、参照チャンネルのサンプル値と差分のサンプル値とを用いて、元のチャンネルのサンプル値を復元した後、元のチャンネルのサンプル列に挿入する。これにより、図１（ａ）（ｂ）に示したようなサンプル列が復元される。この結果、アナログ信号をＰＣＭ化した状態のデジタル音響信号がデータの欠落無く復元されることになる。
【００３７】
（実現のための具体的構成）
以上、本発明による符号化方法および復号方法について説明したが、上記符号化方法は、現実には、コンピュータ等の演算処理装置で実行される。具体的には、図２のフローチャートに示したようなステップを上記手順で実行するためのプログラムをコンピュータに搭載しておく。そして、音響信号等の時系列信号をＰＣＭ方式等でデジタル化した後、コンピュータに取り込み、ステップＳ１〜ステップＳ６の処理を行った後、符号データをデジタルデータとしてコンピュータより出力して記録媒体に記録する。出力された符号データは、復号方法にしたがって復号される。具体的には、図１０のフローチャートに示したようなステップを上記手順で実行するためのプログラムをコンピュータに搭載しておく。そして、記録媒体に記録された符号データを、コンピュータに取り込み、ステップＳ１１〜ステップＳ１６の処理を行った後、デジタル音響信号等の時系列信号を復元して出力する。
【００３８】
以上、本発明の好適な実施形態について説明したが、本発明は、上記実施形態に限定されず、種々の変形が可能である。例えば、上記実施形態では、フレーム間の演算を行うにあたり、フレーム長を固定長に設定して先頭から順次決定していったが、時系列信号の特徴からフレーム長を可変にして設定するようにしても良い。
【００３９】
【発明の効果】
以上、説明したように本発明によれば、時系列のサンプル列で構成される時系列信号に対して、全てのサンプル列を再現できるように情報量を圧縮するにあたり、サンプル列の中から所定の個数のサンプル列で構成されるフレームを複数個抽出し、抽出したフレーム間で相関演算を施し、フレーム間の相関が高い場合に一方のフレームのサンプル列を、各サンプルがより少ないビット数で表現されたフレーム間差分データとしてサンプル列から分離し、分離されたフレーム間差分データ、およびフレーム間差分データの分離により残ったサンプル列を所定のフォーマットで記録するようにしたので、分離した部分を少ないビット数で表現することにより、類似した振幅のパターンが繰り返し発生する時系列信号に対して十分な圧縮を行うことが可能となるという効果を奏する。
【図面の簡単な説明】
【図１】チャンネル間の演算による差分データ分離の様子を示す図である。
【図２】本発明に係る時系列信号の符号化方法の概要を示すフローチャートである。
【図３】フレーム間の演算による差分データ分離の様子を示す図である。
【図４】平坦部データ分離の様子を示す図である。
【図５】サンプルデータの上位ビットと下位ビットの分離の様子を示す図である。
【図６】ステップＳ５における予測誤差算出処理の様子を示す図である。
【図７】ビット長の変換に用いるルックアップテーブルを示す図である。
【図８】上位サンプルのビット長に変換を模式的に示した図である。
【図９】本発明に係る時系列信号の符号化装置により得られる符号データを示す図である。
【図１０】本発明に係る時系列信号の復号方法の概要を示すフローチャートである。

[0001]
[Industrial applications]
The present invention relates to a lossless (reversible) data compression technique suited to the music production field (music production, archiving of audio source material, relay of location recordings), in particular to the production of high-definition audio of higher quality than CD; to audio recording and playback using digital recording media such as CDs and DVDs; and to fields in which modification of the data is unacceptable, such as the transmission of biological signals in telemedicine.
[0002]
[Prior art]
Various methods have conventionally been used to compress audio signals. MP3 (MPEG-1 / Layer3), AAC (MPEG-2 / Layer3) and the like have been put to practical use as techniques for compressing and encoding an audio signal. Such compression encoding schemes make it possible to handle an audio signal as a small amount of data, which contributes to more efficient recording and transmission of data.
[0003]
[Problems to be solved by the invention]
MP3, AAC, and the like described above are all so-called lossy coding schemes. They allow efficient compression, but decoding entails a considerable loss of quality, and the original signal cannot be reproduced exactly. For this reason, these encoding schemes cannot be applied in the music production field (music production, material archiving, relay of location recordings), and material is instead stored and transmitted uncompressed, inefficient though that is. In particular, the number of productions that handle high-definition audio has recently increased, the volume of material has become enormous, and managing work disks has become a problem.
[0004]
To solve the above problems, the present applicant has proposed a method that performs difference operations between channels and between frames on the sample sequence of a time-series signal to reduce the value of each sample, and then compresses the data using predictive coding (see Patent Document 1).
[0005]
[Patent Document 1]
Japanese Patent Application No. 2002-231150
[0006]
However, in the method proposed in the above application, when the time-series signal consists of a plurality of channels, as a stereo audio signal does, performing the difference processing over the entire time series produces places where the signal amplitude increases, and the subsequent encoding based on prediction errors may actually increase the amount of data. Likewise, when there is correlation between frames (groups of samples taken at different times on the same channel), taking the difference between frames reduces the signal amplitude but raises the proportion of unpredictable noise components, so that the subsequent encoding based on prediction errors may again increase the amount of data. This is particularly likely for signals in which patterns of similar amplitude (so-called signal waveforms) recur. In addition, portions with an identical signal level are separated as signal flat portions to reduce the amount of data, but because few audio materials contain long runs of an identical signal level, the compression gain is small.
[0007]
Therefore, to solve these problems, an object of the present invention is to provide an encoding method and a decoding method for a time-series signal, based on lossless (reversible) compression, that can sufficiently compress a time-series signal in which patterns of similar amplitude recur and that can completely restore the original time-series signal at decoding.
[0008]
[Means for Solving the Problems]
To solve the above problem, the present invention provides an encoding method that compresses the amount of information of a time-series signal composed of a time-series sample sequence in such a way that the entire sample sequence can be reproduced, characterized by executing: an inter-frame operation step of extracting from the sample sequence a plurality of frames each composed of a predetermined number of samples, performing a correlation operation between the extracted frames, and, when the correlation between frames is high, separating the sample sequence of one of the frames from the sample sequence as inter-frame difference data in which each sample is expressed with fewer bits; and a step of recording, in a predetermined format, the separated inter-frame difference data and the sample sequence that remains after the inter-frame difference data have been separated.
[0009]
According to the present invention, portions of a time-series sample sequence that are highly correlated are separated as difference data, and the separated portions are expressed with a small number of bits, so that a time-series signal in which patterns of similar amplitude recur can be compressed sufficiently.
[0010]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
(Data structure)
First, a time-series signal to be encoded in the time-series signal encoding method according to the present invention will be described. In the present embodiment, a case where an acoustic signal is applied as a time-series signal will be described as an example. FIG. 1A is a diagram schematically illustrating an acoustic signal handled in the present invention. In FIG. 1, the left-right direction is a time-series direction, and the time advances toward the right side. That is, the left end is the start time, and the right end is the end time. The acoustic signal shown in FIG. 1 is data of two channels, Ch1 and Ch2, in which an L (left) signal is recorded in Ch1 and an R (right) signal is recorded in Ch2.
[0011]
To obtain a digital audio signal such as that shown in FIG. 1A, an analog audio signal, which is a time-series signal, is first digitized. This can be done with the conventional PCM technique: the analog audio signal is sampled at a predetermined sampling frequency, and its amplitude is converted into digital data using a predetermined number of quantization bits. The following description of the present embodiment assumes a sampling frequency of 44.1 kHz and signed samples quantized to 16 bits. Sampling at 44.1 kHz yields a sample sequence of 44,100 samples per second. Because the audio signal here consists of a plurality of channels, digitization is performed for each channel.
[0012]
(Encoding method)
Next, an outline of a time-series signal encoding method according to the present invention will be described. FIG. 2 is a flowchart showing an outline of a time-series signal encoding method according to the present invention. First, a difference operation between channels is performed on the sample sequence as the digital audio signal shown in FIG. 1 (step S1). Specifically, first, the difference calculation of the sample data at the same time between the entire section of the channel Ch1 and the entire section of the channel Ch2 is performed. As a result, the difference data of the section on the channel Ch2 where the difference is equal to or less than the predetermined threshold is separated from the sample sequence of the channel Ch2 and separately recorded as inter-channel difference data. In the present embodiment, the predetermined threshold value is set within the lower 4 bits. The expression “within the lower 4 bits” is a value from −8 to +7 as a decimal number when expressed with positive and negative signs. As a result of the difference calculation, a section in which the value of the sample sequence takes a value of -8 to +7 is recorded as inter-channel difference data. When the same value continues for a plurality of samples in the inter-channel difference data, the data amount is further reduced by recording the first sample number of the continuous portion, the sample value, and the number of continuous samples. In this case, the last sample number may be recorded instead of the number of consecutive samples.
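The run recording described at the end of this paragraph (first sample number, sample value, number of consecutive samples) can be sketched as follows. This is only an illustrative sketch: the function names `record_runs` and `expand_runs` are hypothetical, and for brevity every value is stored as a run, including runs of length 1, which the text does not require.

```python
from typing import List, Tuple

Run = Tuple[int, int, int]  # (first sample number, sample value, run length)

def record_runs(values: List[int], first_index: int = 0) -> List[Run]:
    """Collapse consecutive equal values into (start, value, length) records."""
    runs: List[Run] = []
    for offset, v in enumerate(values):
        if runs and runs[-1][1] == v:
            start, val, n = runs[-1]
            runs[-1] = (start, val, n + 1)   # extend the current run
        else:
            runs.append((first_index + offset, v, 1))  # start a new run
    return runs

def expand_runs(runs: List[Run]) -> List[int]:
    """Inverse of record_runs: rebuild the original value list."""
    out: List[int] = []
    for _, v, n in runs:
        out.extend([v] * n)
    return out
```

Recording the last sample number instead of the run length, as the text also allows, would only change the third field to `start + n - 1`.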
[0013]
In the channel Ch2 from which the inter-channel difference data have been separated, the samples following each separated section are shifted forward, so the total number of samples decreases. For example, when samples whose difference values are equal to or less than the threshold continue over the section shown in FIG. 1B, this section is separated, and the sample sequence of channel Ch2 and the inter-channel difference data become as shown in FIG. 1C. The number of samples in channel Ch1 (represented by horizontal length in the figure) does not change, but the number of samples in channel Ch2 decreases by the amount of the separated difference data. In the example of FIG. 1 only one section is separated as inter-channel difference data, but in practice many sections are separated. Another way of compressing data between channels is to perform the difference operation over the entire section and record the difference values as the sample values of one channel. With that approach, compression efficiency is high if the original audio signal is recorded equally on the left and right, as vocals typically are; for material recorded mainly on one channel, as instrument sounds often are, the signal amplitude becomes large, and the predictive coding performed later may actually increase the amount of data. The present embodiment therefore separates as difference data only those sections in which the difference is small.
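The selective separation of step S1 can be illustrated with the sketch below. It assumes the difference is taken as Ch2 minus Ch1, uses the −8 to +7 range given in the text, and separates every run of qualifying samples regardless of its length; the function names and the segment layout `(start index, [differences])` are hypothetical.

```python
from typing import List, Tuple

LOW, HIGH = -8, 7  # signed 4-bit range used as the threshold in the embodiment

Segment = Tuple[int, List[int]]  # (first sample number, differences Ch2 - Ch1)

def separate_channel_difference(ch1: List[int], ch2: List[int]):
    """Split Ch2 into the samples that remain and the separated difference segments."""
    remaining: List[int] = []
    segments: List[Segment] = []
    current = None
    for t, (a, b) in enumerate(zip(ch1, ch2)):
        d = b - a
        if LOW <= d <= HIGH:
            if current is None:            # a new small-difference section begins
                current = (t, [])
                segments.append(current)
            current[1].append(d)
        else:
            current = None                 # section ends; sample stays in Ch2
            remaining.append(b)
    return remaining, segments

def restore_channel(ch1: List[int], remaining: List[int], segments: List[Segment]) -> List[int]:
    """Rebuild Ch2 from Ch1, the packed remaining samples, and the segments."""
    total = len(remaining) + sum(len(d) for _, d in segments)
    out = [None] * total
    for start, diffs in segments:
        for k, d in enumerate(diffs):
            out[start + k] = ch1[start + k] + d
    rest = iter(remaining)
    return [next(rest) if v is None else v for v in out]
```

Because Ch1 is kept unchanged, the decoder can always recompute the separated Ch2 samples from Ch1 and the stored differences.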
[0014]
Next, frames of a predetermined length are set in the sample sequence of each channel after the inter-channel difference processing, and operations are performed between the frames (step S2). First, the similarity of the sample sequences making up the frames is determined, and similar frames are selected. In the present embodiment, the frame length is fixed over the entire section from the start time to the end time of the sample sequence; specifically, one frame is 256 samples. Starting from the beginning of the channel data (the sample sequence constituting the channel), every 256 samples are extracted as one frame, and the similarity between frames is determined. Since the similarity between frames amounts to the correlation between the two signals, various correlation calculation methods can be used. In the present embodiment, for the 256 samples of a frame, the maximum absolute value of the differences from the corresponding samples of another frame is extracted, and a frame pair for which this maximum falls within a predetermined value is selected as a pair of similar frames. This processing is performed over the entire section of the sample sequence. FIGS. 3A to 3C show how the sample sequence changes as a result of the inter-frame processing. Unlike FIG. 1, FIG. 3 shows only one channel, but the other channels are processed in the same way. First, as shown in FIG. 3A, the sample sequence is divided into fixed-length frames A1, A2, A3, and so on.
[0015]
Next, the difference is calculated for each frame. In the present embodiment, 256 difference values are obtained, one for each sample time. If the maximum absolute value of the difference values is within the predetermined value, the sample sequence of that frame after the difference processing is separated from the sample sequence of the channel and recorded as difference data. For example, when frames A1 and A2 are processed as shown in FIG. 3B, the preceding frame A1 is left as it is; because the maximum absolute value of the differences between frame A1 and frame A2 is within the predetermined value, frame A2 is separated from the sample sequence of the channel and the remaining frames are shifted forward, as shown in FIG. 3C. Each time one frame is separated, the sample sequence shrinks by 256 samples. The separated frame A2 is not recorded with its original values; instead, the difference data between frame A1 and frame A2 are recorded using the minimum number of bits that can express the maximum value. The samples of frame A2 are removed from the sample sequence, but so that frame A2 can be restored at decoding, the per-sample differences between frame A1 and frame A2 (written "A2-A1" in FIG. 3) are separated as inter-frame difference data.
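The similarity test and the reduced bit width described above can be sketched as follows. The text does not state the "predetermined value" or the bit representation, so `limit` is a hypothetical parameter and a signed two's-complement width is assumed; short frames are used in place of 256-sample frames for readability.

```python
from typing import List, Optional

def frame_difference(ref: List[int], frame: List[int], limit: int) -> Optional[List[int]]:
    """Return the per-sample differences (frame - ref) if the frames are similar.

    Frames count as similar when the largest absolute difference is within
    `limit`; otherwise None is returned and the frame is left in place.
    """
    diffs = [b - a for a, b in zip(ref, frame)]
    if max(abs(d) for d in diffs) <= limit:
        return diffs
    return None

def bits_needed(diffs: List[int]) -> int:
    """Smallest signed (two's-complement) width that can hold every difference."""
    bits = 1
    while not all(-(1 << (bits - 1)) <= d <= (1 << (bits - 1)) - 1 for d in diffs):
        bits += 1
    return bits
```

With differences of at most a few units, each of the 256 samples of a separated frame would need only a handful of bits instead of 16.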
[0016]
On the other hand, if the maximum absolute value of the differences between frame A1 and frame A2 does not fall within the predetermined value, the original sample sequence of frame A2 is left as it is. The same processing is then applied to the subsequent frame pairs: frame A1 and frame A3, frame A2 and frame A3, frame A1 and frame A4, frame A2 and frame A4, frame A3 and frame A4, and so on. In this subsequent inter-frame difference processing, the original sample sequence of frame A2 is still referred to even if frame A2 has been deleted as a frame similar to frame A1. To reduce the load of the difference calculation, the distance between the frames compared is limited, for example to within 100 frames. That is, frame A1 is compared only with frames up to frame A100, and frame A101 and later frames are excluded from the similarity test against frame A1.
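The search over frame pairs, including the distance limit and the use of original frames as references, might look like the sketch below. It is an assumption-laden illustration: it takes the first earlier frame that matches, treats `window=100` as "A1 is compared with A2 through A100", leaves any partial frame at the end untouched, and uses hypothetical names and a hypothetical `limit`.

```python
from typing import List, Tuple

Separated = Tuple[int, int, List[int]]  # (reference frame no., frame no., differences)

def separate_similar_frames(samples: List[int], frame_len: int = 256,
                            limit: int = 7, window: int = 100):
    """Separate frames that are within `limit` of an earlier original frame."""
    n_frames = len(samples) // frame_len
    frames = [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]
    tail = samples[n_frames * frame_len:]          # incomplete last frame, kept as is
    kept: List[int] = []
    separated: List[Separated] = []
    for j, frame in enumerate(frames):
        match = None
        # References are the ORIGINAL frames, even ones that were separated.
        for i in range(max(0, j - window + 1), j):
            diffs = [b - a for a, b in zip(frames[i], frame)]
            if max(abs(d) for d in diffs) <= limit:
                match = (i, j, diffs)
                break
        if match:
            separated.append(match)
        else:
            kept.extend(frame)
    return kept + tail, separated

def restore_frames(kept: List[int], separated: List[Separated], frame_len: int = 256) -> List[int]:
    """Rebuild the original sample sequence in frame order."""
    by_no = {j: (i, d) for i, j, d in separated}
    total = len(kept) // frame_len + len(separated)
    frames: List[List[int]] = []
    pos = 0
    for j in range(total):
        if j in by_no:
            i, d = by_no[j]
            frames.append([a + x for a, x in zip(frames[i], d)])
        else:
            frames.append(kept[pos:pos + frame_len])
            pos += frame_len
    return [s for f in frames for s in f] + kept[pos:]
```

Since a reference frame always precedes the frame that depends on it, restoring frames in ascending order guarantees that the reference is available when needed.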
[0017]
The inter-frame difference data also record the numbers of the two frames on which the difference processing was performed. When the same value continues for a plurality of samples within the inter-frame difference data separated in step S2, the data amount is further reduced, as with the inter-channel difference data, by recording the first sample number of the run, the sample value, and the number of consecutive samples. The last sample number may be recorded instead of the number of consecutive samples. In a sample sequence from which a frame has been separated as difference data, the samples following the separated frame are shifted forward, so the total number of samples decreases.
[0018]
Next, the signal flat portions are processed (step S3). A signal flat portion is, strictly speaking, a portion in which the same signal level continues. Such portions occur most often in silent parts, where the signal level is "0", and in saturated parts, where the absolute value of the signal level is at its maximum. A silent part arises when there is actually no sound or when the sound was too quiet to be recorded, whereas a saturated part arises during recording and A/D conversion of the signal. Whether it is a silent part, a saturated part, or any other run of an identical signal level, a signal flat portion has the same signal level recorded continuously for a certain time (a certain number of samples), which makes it easy to compress. In the present embodiment, a signal flat portion is taken to include not only portions in which the signal level is identical but also portions in which the signal level changes little. That is, in step S3, a portion in which samples whose difference from the preceding sample value is equal to or less than a predetermined value continue is extracted as a signal flat portion and separated from the original sample sequence. For any part of a separated signal flat portion in which the same value continues for a plurality of samples (a signal flat portion in the strict sense), the data amount is further reduced, as with the inter-channel difference data and the inter-frame difference data, by recording the first sample number, the sample value, and the number of consecutive samples. The last sample number may be recorded instead of the number of consecutive samples. In the sample sequence from which a signal flat portion has been separated, the samples following the separated portion are shifted forward, so the total number of samples decreases. For example, if the shaded portions in FIG. 4A are judged to be signal flat portions, the samples corresponding to them are deleted from the sample sequence and the remainder is shifted forward, as shown in FIG. 4B. The information on the signal flat portions deleted from the sample sequence is separated as signal flat portion data.
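A minimal sketch of the flat-portion separation in step S3 follows. The text gives neither the allowed sample-to-sample change nor a minimum run length, so `max_step` and `min_len` are hypothetical parameters, as are the function names; the separated samples are stored as they are, without the further run recording described above.

```python
from typing import List, Tuple

Flat = Tuple[int, List[int]]  # (first sample number in the original sequence, samples)

def separate_flat_parts(samples: List[int], max_step: int = 1, min_len: int = 4):
    """Split samples into the remaining sequence and the separated flat portions."""
    remaining: List[int] = []
    flats: List[Flat] = []
    i, n = 0, len(samples)
    while i < n:
        j = i + 1
        # Extend the run while each sample stays within max_step of its predecessor.
        while j < n and abs(samples[j] - samples[j - 1]) <= max_step:
            j += 1
        if j - i >= min_len:
            flats.append((i, samples[i:j]))     # long enough: separate it
        else:
            remaining.extend(samples[i:j])      # too short: leave it in place
        i = j
    return remaining, flats

def restore_flat_parts(remaining: List[int], flats: List[Flat]) -> List[int]:
    """Re-insert the flat portions at their original sample numbers."""
    out = list(remaining)
    for start, part in flats:   # flats are in ascending order of start
        out[start:start] = part
    return out
```

Inserting the portions in ascending order of their first sample number means every position before the insertion point has already been restored, so the stored sample numbers can be used directly.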
[0019]
Once the sample sequence has been reduced as described above, by calculating the differences between channels, calculating the differences between frames within the sample sequence of each channel, and separating the signal flat portions, the upper bits and lower bits of each sample in the remaining sample sequence are separated (step S4). For example, when an audio signal is digitized by PCM with 16 quantization bits, each sample is represented by 16 bits. In that case, the present embodiment separates each sample into 12 upper bits and 4 lower bits. The main purpose of this separation is to isolate the thermal noise of the circuits used to digitize the audio signal, such as the A/D converter; the lower bits, which are considered to be thermal noise, are therefore split off. How many bits to separate as lower bits depends on the sound source and on the characteristics of the circuits used, but about 1/4 of the number of quantization bits is usually desirable. Accordingly, 4 bits, which is 1/4 of 16 bits, are separated here as the lower bits.
[0020]
Here, the separation of the data into upper bits and lower bits is shown schematically in FIG. 5. In FIG. 5, H indicates upper bits or upper sample data, and L indicates lower bits or lower sample data. FIG. 5A shows the sample data before separation. By the processing in step S4, the sample data is separated into the upper sample data shown in FIG. 5B and the lower sample data shown in FIG. 5C. Note that the sign bit, which belongs to the upper bits, is carried over unchanged into the upper sample data. The sample data separated in this way are processed separately from this point on.
[0021]
(Encoding of upper sample)
For the upper sample data, first, a prediction value and a prediction error are calculated for each sample based on the two immediately preceding samples (step S5). The calculation of the prediction error is described here with reference to FIG. 6. For example, consider a case where the sample values are as shown in FIG. 6A. In FIG. 6A, the horizontal axis is the time (sample number), and the vertical axis is the upper sample value x(t). The line segment at each time indicates the value of the upper sample x(t) at that time. In this state, the prediction error e(t) of the sample at time t is calculated by the following [Equation 1], using the upper sample value x(t−1) at the immediately preceding time t−1 and the upper sample value x(t−2) at the time t−2 before that.
[0022]
[Equation 1]
e(t) = x(t) − 2 × x(t−1) + x(t−2) − e(t−1) / 2
[0023]
In the above [Equation 1], “2 × x(t−1) − x(t−2)” is a linear prediction component based on the two past samples. That is, the prediction error e(t) at time t is calculated using this linear prediction component and the error feedback component “e(t−1) / 2”, which is derived from the prediction error calculated for the immediately preceding sample. The prediction error is calculated for all the samples, and the prediction error is recorded instead of the sample value.
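A minimal Python sketch of [Equation 1] follows. The text does not state how the first two samples are handled or how e(t−1) / 2 is rounded, so the sketch assumes that x and e are 0 before the first sample and that the halving uses floor division; the function name is hypothetical.

```python
def prediction_errors(x):
    """Apply e(t) = x(t) - 2*x(t-1) + x(t-2) - e(t-1)/2 to a list of ints.

    Assumptions: x(t) and e(t) are 0 for t < 0, and e(t-1)/2 is rounded
    toward negative infinity (Python's // operator).
    """
    errors = []
    x1 = x2 = 0  # x(t-1) and x(t-2)
    e1 = 0       # e(t-1)
    for xt in x:
        e = xt - 2 * x1 + x2 - e1 // 2
        errors.append(e)
        x2, x1, e1 = x1, xt, e
    return errors
```

Any rounding rule can be used for the halving as long as the decoder applies the same rule, since the decoder knows e(t−1) exactly when it reconstructs x(t).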
[0024]
This will be described using the upper samples shown in FIG. 6A. First, each prediction error eo(t) is calculated without adding the error feedback component. As shown in FIG. 6B, the prediction error eo(t) at time t is calculated as the difference (indicated by the thick dotted line in the figure) between the value that the prediction line connecting the upper sample value x(t−2) at time t−2 and the upper sample value x(t−1) at time t−1 takes at time t, and the upper sample value x(t) at time t. The same operation is performed from time t+1 onward to calculate the prediction error eo(t+1) and so on. The calculated prediction errors eo(t) are as shown in FIG. 6C. As can be seen by comparing FIG. 6A with FIG. 6C, the range over which the values fluctuate is greatly narrowed, and the data becomes easier to compress.
[0025]
Subsequently, based on [Equation 1], error feedback is applied by subtracting from the prediction error eo(t) 50% of the corrected prediction error e(t−1) obtained at the immediately preceding time t−1. FIG. 6D shows the result. The prediction errors at times t+1 and t+2 are remarkably reduced compared with FIG. 6C. Conversely, the prediction errors increase at times t+3 and t+4, but the prediction error decreases on average, and the range over which the values fluctuate is narrowed further compared with FIG. 6A, which improves the data compression effect.
[0026]
By the process in step S5, the value of each upper sample is replaced by its prediction error, but each sample still has a fixed length of 12 bits. Next, the fixed-length upper sample sequence is converted into a variable-length bit configuration (step S6). For this purpose, sign-inverted data is inserted first. Specifically, sign-inverted data is inserted at each point where the sample value changes from a positive value to a negative value, and the negative sample values are replaced with their absolute values. An appropriate bit array is assigned as the sign-inverted data. Since the sign-inverted data is converted into a different bit array in the subsequent processing, any bit array that can be distinguished from the other samples may be used at this point. However, it is given 12 bits to match the number of bits of the other samples.
[0027]
Next, the bit configuration is converted so that the upper sample data, now recorded as prediction error values, can be represented with a smaller data amount (step S7). First, a look-up table used for converting the bit configuration is created. Specifically, a histogram of the sample values is calculated over the entire time range. Since all the sample values have been converted into absolute values in the process of step S6, the histogram is calculated without distinguishing between positive and negative. When the number of distinct sample absolute values is 640 or more, the separator bits are set to the fixed 2-bit value “00”. When the number of distinct sample absolute values is 639 or less, the separator bit is set to the fixed 1-bit value “0”. Bit patterns are then assigned in order of appearance frequency, with the shortest patterns going to the most frequent sample absolute values. The bit patterns to be assigned follow certain rules. The most significant bit is always “1”. When the separator bits are the 2 bits “00”, any bit pattern containing “001” is prohibited, and when the separator bit is the 1 bit “0”, any bit pattern containing “01” is prohibited. FIG. 7 shows examples of the look-up table for the 1-bit separator “0” and for the 2-bit separator “00”.
[0028]
Using the look-up table created as described above, the consecutive upper sample data, each having a fixed length of 12 bits, are converted into variable-length bit patterns. Since the length is variable, the boundaries between the converted data must be distinguishable. Therefore, in the present embodiment, the above-described 1-bit or 2-bit separator is inserted between the data. When the separator bit is the 1 bit “0”, the bit arrangement and the number of bits for expressing the data of each rank are as shown in FIG. 7A. In FIG. 7A, rank 0 is represented by the 1 bit “1”, which has the smallest number of bits. The bit strings before conversion are omitted in FIG. 7A, but in practice the sign-inverted data, which appears most frequently, is represented by “1”. Since a separator is always added to each variable-length bit pattern, two bits are required to represent the data of rank 0. In the example of FIG. 7A, since the separator bit is the 1 bit “0”, no bit pattern containing “01” is assigned. However, the bit pattern “1000” shown as rank 6 can exceptionally be changed to the bit pattern “101” when the immediately preceding bit is “0” (a separator bit) during conversion to variable-length bits. In that case, the array “0101” appears, formed by the immediately preceding separator bit and the bit pattern. This bit array “0101” could also be read as two rank-0 bit arrays “1” in succession with a separator bit between them. However, the rank-0 bit array “1” is assigned to the sign-inverted data, and two sign-inverted data can never occur in succession. The decoding system can therefore determine that the data is the bit array “101”. In this way, the rank-6 bit pattern can be reduced from 5 bits to 4 bits, counting the separator bit.
[0029]
When the separator bits are the 2 bits “00”, the bit arrangement and the number of bits for expressing the data of each rank are as shown in FIG. 7B. In FIG. 7B, rank 0 is represented by the 1 bit “1”, which has the smallest number of bits. As described above, the sign-inverted data, which appears most frequently, is represented by “1”. Since a separator is always added to each variable-length bit pattern, three bits are required to represent the data of rank 0. In the example of FIG. 7B, since the separator bits are the 2 bits “00”, no bit pattern containing “001” is assigned. However, the bit pattern “10000” shown as rank 14 can exceptionally be changed to the bit pattern “1001” when the immediately preceding bits are “00” during conversion to variable-length bits. In that case, the array “001001” appears, formed by the immediately preceding separator bits and the bit pattern. This bit array “001001” could also be read as two rank-0 bit arrays “1” in succession with the separator bits between them. However, the rank-0 bit array “1” is assigned to the sign-inverted data, and two sign-inverted data can never occur in succession. The decoding system can therefore determine that the data is the bit array “1001”. In this way, the rank-14 bit pattern can be reduced from 7 bits to 6 bits, counting the separator bits. FIGS. 8A and 8B schematically show how the data is converted in step S6. FIGS. 8A and 8B correspond to the upper part of the sample sequence, and FIG. 8A shows a state in which fixed-length upper samples are recorded consecutively. The upper sample sequence shown in FIG. 8A is converted as shown in FIG. 8B using the look-up tables shown in FIGS. 7A and 7B.
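The bit pattern rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the look-up table of FIG. 7: it assumes that patterns of equal length are ranked in ascending numeric order, it omits the sign-inverted data and the exceptional shortening of “1000” and “10000”, and all function names are hypothetical.

```python
from collections import Counter
from itertools import count, product


def codewords(separator):
    """Yield bit patterns by increasing length: each starts with '1' and
    never contains separator + '1' ('01' or '001')."""
    forbidden = separator + "1"
    for length in count(1):
        for tail in product("01", repeat=length - 1):
            word = "1" + "".join(tail)
            if forbidden not in word:
                yield word


def build_table(values):
    """Choose the separator from the number of distinct values and give the
    shortest patterns to the most frequent values."""
    separator = "00" if len(set(values)) >= 640 else "0"
    ranked = [v for v, _ in Counter(values).most_common()]
    return dict(zip(ranked, codewords(separator))), separator


def encode(values, table, separator):
    """Concatenate the patterns, each followed by the separator."""
    return "".join(table[v] + separator for v in values)


def decode(bits, table, separator):
    """Split at each separator + '1' boundary and map patterns back."""
    inverse = {word: value for value, word in table.items()}
    boundary = separator + "1"
    out, pos = [], 0
    while pos < len(bits):
        idx = bits.find(boundary, pos)
        end = idx if idx != -1 else len(bits) - len(separator)
        out.append(inverse[bits[pos:end]])
        pos = end + len(separator)
    return out
```

With these rules the generator yields “1000” as rank 6 for the separator “0” and “10000” as rank 14 for the separator “00”, which agrees with the ranks quoted in the text. Decoding is unambiguous because no pattern contains the separator followed by “1”, so that sequence can only occur at a boundary between two patterns.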
[0030]
(Encoding of lower sample)
On the other hand, the lower sample data is arranged continuously as it is. Specifically, the lower 4 bits of data separated in step S4 are arranged continuously.
[0031]
(Recording of code data)
The code data obtained as described above is as shown in FIG. 9. That is, it consists of an upper variable-length sample sequence, a lower fixed-length sample sequence, a look-up table, signal flat portion data, inter-frame difference data, and inter-channel difference data. This data is recorded in a format suited to the recording medium used.
[0032]
(Decoding method)
Next, a method of decoding the code data produced by the above encoding method will be described. The decoding is executed by a computer and a dedicated software program installed on the computer. An outline of the decoding method is shown in the flowchart of FIG. 10.
[0033]
First, a recording medium on which code data as shown in FIG. 9 is recorded is read into a decoding apparatus (computer). Subsequently, the upper fixed-length sample sequence, that is, the linear prediction errors e(t), is restored from the upper variable-length sample sequence by referring to the look-up table in the read data (step S11). As a result, a fixed-length sample sequence as shown in FIG. 8A is restored. Next, the 12-bit fixed-length upper sample data x(t) are restored one after another using the equation obtained by rearranging [Equation 1] so that x(t) is on the left-hand side (step S12). In step S12, each sample keeps its fixed length of 12 bits, but its value changes. Subsequently, the restored upper fixed-length sample sequence and the lower fixed-length sample sequence are integrated (step S13). Specifically, 12 bits are taken from the upper fixed-length sample sequence and 4 bits from the lower fixed-length sample sequence, and they are joined sample by sample. As a result, a sample sequence in which each sample has 16 bits is restored.
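Steps S12 and S13 can be sketched in Python as below. The sketch assumes an encoder that starts from x = 0 and e = 0 before the first sample and halves e(t−1) with floor division; these conventions are not stated in the text, and the function names are hypothetical.

```python
def restore_upper_samples(errors):
    """Invert [Equation 1]: x(t) = e(t) + 2*x(t-1) - x(t-2) + e(t-1)/2.

    Assumes the encoder used x = 0 and e = 0 before the first sample and
    floor division for the halving.
    """
    x = []
    x1 = x2 = 0  # x(t-1) and x(t-2)
    e1 = 0       # e(t-1)
    for e in errors:
        xt = e + 2 * x1 - x2 + e1 // 2
        x.append(xt)
        x2, x1, e1 = x1, xt, e
    return x


def integrate(upper, lower, lower_bits=4):
    """Join each 12-bit upper sample with its 4-bit lower sample."""
    return [(u << lower_bits) | l for u, l in zip(upper, lower)]
```

For example, the prediction errors 10, −13, 7, −3 restore the upper samples 10, 12, 14, 16 under these assumptions.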
[0034]
Next, flat portion data is inserted into such a 16-bit fixed-length sample sequence (step S14). The flat part data is inserted into the sample sequence based on the first sample number of the flat part data. Thus, the sample sequence as shown in FIG. 4A is restored.
[0035]
Further, the original frame data is restored using the inter-frame difference data and inserted into the sample sequence (step S15). Since the inter-frame difference data also holds the head sample number and information identifying the frame against which the difference was calculated, the original frame is restored using this information. The restored frame is then inserted at its predetermined position in the original sample sequence. For example, in the example of FIG. 3, the inter-frame difference data “A2-A1” holds information indicating that it is the difference from frame A1, so frame A2 is restored using the sample sequence of frame A1. Subsequently, the restored frame is inserted into the sample sequence using the head sample number held by the inter-frame difference data “A2-A1”. Thus, the sample sequence as shown in FIG. 3B is restored.
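The restoration in step S15 can be illustrated with the short Python sketch below. It assumes that the difference data is a sample-by-sample subtraction (A2 minus A1) and that the reference frame lies before the insertion point, so its position in the sequence is unaffected by the insertion; the function and parameter names are hypothetical.

```python
def restore_frame(samples, ref_start, diff, head):
    """Rebuild a frame as reference + difference and insert it at head.

    ref_start: index of the first sample of the reference frame (e.g. A1).
    diff:      sample-wise difference data (e.g. A2 - A1).
    head:      head sample number at which the restored frame is inserted.
    """
    ref = samples[ref_start:ref_start + len(diff)]
    frame = [r + d for r, d in zip(ref, diff)]
    return samples[:head] + frame + samples[head:]
```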
[0036]
Subsequently, the inter-channel difference data is inserted into the sample sequence (step S16). The inter-channel difference data includes the first sample number, the last sample number, the channel number of the original sample sequence (Ch2 in the above example), and the channel number of the referenced channel (Ch1 in the above example). Therefore, the sample values of the original channel are restored using the sample values of the referenced channel and the difference values, and are then inserted into the sample sequence of the original channel. Thus, the sample sequences as shown in FIGS. 1A and 1B are restored. As a result, the digital audio signal obtained by PCM conversion of the analog signal is restored without any loss of data.
[0037]
(Specific configuration for realization)
Although the encoding method and the decoding method according to the present invention have been described above, the encoding method is in practice executed by an arithmetic processing device such as a computer. Specifically, a program for executing the steps shown in the flowchart of FIG. 2 in the above procedure is installed in a computer. After a time-series signal such as an acoustic signal has been digitized by the PCM method or the like, it is read into the computer and the processes of steps S1 to S6 are performed. The code data is then output from the computer as digital data and recorded on a recording medium. The output code data is decoded according to the decoding method. Specifically, a program for executing the steps shown in the flowchart of FIG. 10 in the above procedure is installed in a computer. The code data recorded on the recording medium is then read into the computer, and after the processes of steps S11 to S16 have been performed, a time-series signal such as a digital acoustic signal is restored and output.
[0038]
The preferred embodiments of the present invention have been described above, but the present invention is not limited to these embodiments, and various modifications are possible. For example, in the above-described embodiment, the frames used for the inter-frame operation have a fixed length and are set sequentially from the beginning of the sample sequence. However, the frame length may instead be made variable according to the characteristics of the time-series signal.
[0039]
[Effects of the invention]
As described above, according to the present invention, in order to compress the information amount of a time-series signal composed of a time-series sample sequence such that all the sample sequences can be reproduced, a plurality of frames each composed of a predetermined number of samples are extracted from the sample sequence, a correlation operation is performed between the extracted frames, and when the correlation between frames is high, the sample sequence of one frame is separated from the sample sequence as inter-frame difference data in which each sample is expressed with a smaller number of bits. The separated inter-frame difference data and the sample sequence remaining after the separation of the inter-frame difference data are recorded in a predetermined format. Because the separated frames are expressed with a small number of bits, sufficient compression can be achieved for time-series signals in which patterns of similar amplitude appear repeatedly.
[Brief description of the drawings]
FIG. 1 is a diagram showing how differential data is separated by calculation between channels.
FIG. 2 is a flowchart showing an outline of a time-series signal encoding method according to the present invention.
FIG. 3 is a diagram showing how differential data is separated by calculation between frames.
FIG. 4 is a diagram illustrating a state of flat part data separation.
FIG. 5 is a diagram showing how upper bits and lower bits of sample data are separated.
FIG. 6 is a diagram illustrating a state of a prediction error calculation process in step S5.
FIG. 7 is a diagram showing a lookup table used for bit length conversion.
FIG. 8 is a diagram schematically showing conversion to a bit length of an upper sample.
FIG. 9 is a diagram showing code data obtained by the time-series signal coding apparatus according to the present invention.
FIG. 10 is a flowchart showing an outline of a time-series signal decoding method according to the present invention.

Claims

An encoding method for compressing the amount of information so that all the sample sequences can be reproduced for a time-series signal composed of a time-series sample sequence,
A plurality of frames composed of a predetermined number of sample sequences are extracted from the sample sequence, a correlation operation is performed between the extracted frames, and when the correlation between the frames is high, a sample sequence of one frame is extracted. An inter-frame calculation step of separating the sample from the sample sequence as inter-frame difference data expressed by a smaller number of bits;
Recording the separated inter-frame difference data, and a sample sequence remaining after separation of the inter-frame difference data in a predetermined format;
A coding method for a time-series signal, comprising:

In claim 1,
After the inter-frame operation step,
A method for encoding a time-series signal, comprising a signal flat part separation step of separating, from the sample sequence remaining after the separation of the inter-frame difference data, a section in which the sample values stay continuously within a predetermined range, as signal flat portion data in which each sample is represented by a smaller number of bits.

In claim 1 or claim 2,
After the inter-frame calculation step or the signal flat part separation step,
A method for encoding a time-series signal, comprising a sample encoding step of encoding each sample of the remaining sample sequence using a prediction error from temporally past samples.

In claim 3,
Before the sample encoding step,
The method has an upper/lower separation step of dividing the bit data constituting each sample of the sample sequence at a predetermined bit position and separating it into upper sample data composed of a sample sequence of upper bits and lower sample data composed of a sample sequence of lower bits,
A method for encoding a time-series signal, wherein the sample encoding step encodes the upper sample data using a prediction error from temporally past samples.

In claim 3 or claim 4,
The sample encoding step includes:
A prediction error calculating step of updating each sample value as a new value using the prediction error calculated using the immediately preceding two samples of each sample value;
A bit length conversion step of converting each fixed-length upper-order sample data recorded as a prediction error value into a variable-length sample data;
A time-series signal encoding method characterized by having:

In claim 5,
The bit length conversion step includes:
A look-up table creating step of creating a look-up table described with a minimum bit length such that the most significant bit of the converted sample data becomes 1 based on the histogram of the target bit data;
A bit data conversion step of converting the target bit data using the look-up table;
A bit data arranging step of arranging the bit data so as to insert a predetermined number of bits of divisional bit data between the converted bit data;
A coding method for a time-series signal, comprising:

In any one of claims 1 to 6,
The time-series signal is configured by a plurality of channels having a sample sequence,
Before the inter-frame operation step,
A method for encoding a time-series signal, further comprising a step of performing a predetermined operation on samples between channels and separating, from the sample sequence of one channel, the sample sequence of a portion having a high correlation between channels as inter-channel difference data.

A recording medium which records a data group obtained by the method for encoding a time-series signal according to any one of claims 1 to 7 for a given time-series signal.

A decoding method for decoding code data obtained by compression-encoding a time-series signal and reproducing all sample sequences of the time-series signal,
Obtaining, from the sample values recorded as prediction errors, a sample sequence restored to independent sample values for each time;
Integrating bit data constituting each sample of the restored sample sequence and lower bit data;
Signal flat portion insertion step of inserting signal flat portion data into each integrated sample sequence,
Restoring a sample sequence of the original frame based on the inter-frame difference data, and a frame data restoring step of inserting the sample sequence into the sample sequence;
A method for decoding a time-series signal, comprising:

In claim 9,
The time-series signal is configured by a plurality of channels having a sample sequence,
After the frame data restoration step,
A method for decoding a time-series signal, further comprising restoring a sample sequence of an original channel based on the inter-channel difference data and inserting the sample sequence into the sample sequence.

An encoding program for compressing the amount of information so that all of the sample sequences can be reproduced for a time-series signal composed of a time-series sample sequence,
The program causing a computer to execute:
An inter-frame operation step of extracting, from the sample sequence, a plurality of frames each composed of a predetermined number of samples, performing a correlation operation between the frames, and, when the correlation between the frames is high, separating the sample sequence of one frame from the sample sequence as inter-frame difference data;
A signal flat portion separation step of separating, from the sample sequence, a section in which the sample values stay continuously within a predetermined range, as signal flat portion data;
An upper/lower separation step of dividing the bit data constituting each sample of the sample sequence at a predetermined bit position and separating it into upper sample data composed of a sample sequence of upper bits and lower sample data composed of a sample sequence of lower bits; and
An upper sample encoding step of encoding the upper sample data using a prediction error from temporally past samples.

A program that decodes encoded data obtained by compression-encoding a time-series signal and reproduces all sample sequences of the time-series signal,
The program causing a computer to execute:
A step of obtaining, from the sample values recorded as prediction errors, a sample sequence restored to independent sample values for each time;
A step of integrating the bit data constituting each of the restored samples with the lower bit data;
A signal flat portion insertion step of inserting signal flat portion data into each integrated sample sequence; and
A frame data restoration step of restoring the sample sequence of the original frame based on the inter-frame difference data and inserting it into the sample sequence.