JP4357852B2 - Time series signal compression analyzer and converter

Info

Publication number: JP4357852B2
Application number: JP2003045371A
Authority: JP
Inventors: 敏雄茂出木
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2003-02-24
Filing date: 2003-02-24
Publication date: 2009-11-04
Anticipated expiration: 2023-02-24
Also published as: JP2004258059A

Abstract

PROBLEM TO BE SOLVED: To provide a compression analysis device and a conversion device for a time-series signal that can efficiently and reversibly (losslessly) compress the time-series signal and analyze the compression precision of the compressed data.

SOLUTION: The time-series signal, which consists of a time-series sample sequence, is linearly predicted so that the value of each sample is converted into a prediction error value. The high-order bits and low-order bits of each sample are then separated, and compression is performed by encoding the high-order bit components with a variable-length code and the low-order bit components with a fixed-length code. The data amount of the low-order bit components is calculated as an estimate of the quantization noise component, and the data amounts of the other data generated in the compression process are also calculated. Their ratios to the data amount of the original time-series signal are computed, and the data ratios before and after compression are displayed for comparison.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

【０００１】
【産業上の利用分野】
本発明は、音楽制作、音響データの素材保管、ロケ素材の中継など音楽制作分野、特に音響信号の分析・分類や音響の加工による特殊効果分野、遠隔医療における生体信号の解析・診断等の分野において好適なデータ圧縮の解析技術に関する。
【０００２】
【従来の技術】
従来より、音響信号の圧縮には様々な手法が用いられている。音響信号を圧縮して符号化する手法として、ＭＰ３（ＭＰＥＧ−１／Ｌａｙｅｒ３）、ＡＡＣ（ＭＰＥＧ−２／Ｌａｙｅｒ３）などが実用化されている。このような圧縮符号化方式により、音響信号を小さいデータとして扱うことが可能となり、データの記録・伝送の効率化に貢献している。
【０００３】
上述のようなＭＰ３、ＡＡＣ等はいずれもロッシー符号化方式といわれるものであり、効率的な圧縮が可能であるが、復号化にあたって、少なからず品質の劣化を伴い、原信号を完全に再現することはできない。そのため、音楽制作、素材保管、ロケ素材の中継など音楽制作分野では、これらの符号化方式を適用できず、非効率ではあるが、非圧縮で保存・伝送する方式がとられている。特に最近は高精細オーディオを扱うプロダクションが増え、素材容量が膨大になり、ワークディスクを管理する上で問題になってきている。
【０００４】
最近では、上記問題を解決するため、音響信号を可逆圧縮符号化する方法についても、様々なものが提案されている（例えば、特許文献１参照）。
【０００５】
【特許文献１】
特開２００２−２７８６００号公報
【０００６】
【発明が解決しようとする課題】
しかしながら、現状では、元のデータがどの程度まで圧縮されたかという圧縮率を測定することはできるが、圧縮した場合に元の信号のどの成分がどの程度圧縮されたかということを知ることはできない。
【０００７】
そこで、このような問題を解決するため、本発明は、時系列信号を効率的に可逆圧縮すると共に、圧縮されたデータの圧縮精度を解析することが可能な時系列信号の圧縮解析装置および変換装置を提供することを課題とする。
【０００８】
【課題を解決するための手段】
上記課題を解決するため、本発明では、時系列のサンプル列で構成される時系列信号に対して、前記全てのサンプル列を再現できるように情報量を圧縮すると共に圧縮した情報を解析する装置を、前記サンプル列の各サンプルの値を、時間的に過去の複数のサンプルからの予測誤差値に変換する予測誤差変換手段、前記予測誤差値に変換された各サンプル値を表現するビットデータを分断する位置を設定し、設定されたビット位置で分断し、上位ビットのサンプル列で構成される上位サンプル列と、下位ビットのサンプル列で構成される下位サンプル列とに分離するデータ分離手段、前記上位サンプル列に対しては、可変長符号で符号化を行うようにした上位サンプル符号化手段、前記下位サンプル列に対しては、固定長符号で符号化を行うようにした下位サンプル符号化手段、前記時系列信号の中で前記上位サンプル列に対応するデータの割合と、前記下位サンプル列に対応するデータの割合と、前記上位サンプル符号化手段で符号化されたデータの割合と、前記下位サンプル符号化手段で符号化されたデータの割合と、を表示するデータ表示手段を備えた構成としたことを特徴とする。
【０００９】
本発明によれば、時系列信号を、予測誤差変換、上下位ビットを分離して圧縮を行うと共に、圧縮後の符号データに含まれる各データの割合を表示すると共に、圧縮前の各データの割合を分析して表示するようにしたので、時系列信号を効率的に可逆圧縮すると共に、圧縮されたデータの圧縮精度を解析することが可能となる。
【００１０】
【発明の実施の形態】
以下、本発明の実施形態について図面を参照して詳細に説明する。
（装置構成）
図１は、本発明に係る時系列信号の符号化装置の一実施形態を示す構成図である。図１において、１は時系列信号入力手段、２は記憶手段、３は分析手段、４は表示手段、５は分離位置設定手段、６は音響信号変換手段、７は音声出力手段、１０は信号平坦部処理手段、２０は予測誤差変換手段、３０はチャンネル間演算手段、４０は相関フレーム検出手段、５０はデータ分離手段、６０は上位サンプル符号化手段、７０は下位サンプル符号化手段である。
【００１１】
図１において、時系列信号入力手段１はデジタル音響信号等のデジタル化された音響信号を入力する機能を有している。記憶手段２は、本装置により作成される各種データを記憶する機能を有している。分析手段３は、作成された各データを分析して本装置による圧縮効率を示すデータを作成する機能を有している。表示手段４は、分析手段３により分析されたデータを表示する機能を有している。分離位置設定手段５は、データ分離手段５０に対して、その分離位置を設定する機能を有している。音響信号変換手段６は、記憶手段に記憶された各種データを各サンプルを固定ビット数に変換した後、Ｄ／Ａ変換、増幅して音響信号として出力可能な状態に変換する機能を有している。音声出力手段７は、アナログ音響信号に変換されたデータを音として出力する機能を有する。音声出力手段７は具体的には、スピーカーで実現される。
【００１２】
信号平坦部処理手段１０は、各チャンネルごとのサンプル列に対して、信号の値が一定である平坦部を検出し、効率的に符号化する機能を有する。予測誤差変換手段２０は、線形予測誤差の手法を用いて、各サンプルの値を予測誤差値に変換する機能を有する。チャンネル間演算手段３０は、複数のチャンネルからなるサンプル列の各チャンネル間の差分演算を行う機能を有する。相関フレーム検出手段４０は、チャンネル間演算が行われた各サンプル列に対して、所定の区間をフレームとして設定した後、フレーム間で対応する全てのサンプル値が同一になっている相関フレームを検出し、時間的に後方に位置する相関フレームを削除する機能を有する。
【００１３】
データ分離手段５０は、必要に応じて各サンプルの正負の極性処理を行うと共に、予測誤差値で記録された誤差サンプル列を構成する各サンプルを、所定の位置で上位ビットである上位サンプルデータと下位ビットである下位サンプルデータに分離する機能を有する。上位サンプル符号化手段６０は、データ分離手段５０により分離された上位サンプル列を効率良く符号化する機能を有する。下位サンプル符号化手段７０は、データ分離手段５０により分離された下位サンプル列を効率良く符号化する機能を有する。図１に示した各構成要素は、実際には、コンピュータおよびコンピュータにより実行される専用のソフトウェアプログラムにより実現される。
【００１４】
（処理動作）
次に、図１に示した時系列信号の符号化装置の処理動作について説明する。ここでは、時系列信号として複数のチャンネルを有する音響信号の場合を例にとって説明する。まず、時系列信号であるアナログの音響信号をデジタル化する。これは、従来の一般的なＰＣＭの手法を用い、所定のサンプリング周波数でこのアナログ音響信号をサンプリングし、振幅を所定の量子化ビット数を用いてデジタルデータに変換する処理を行えば良い。本実施形態では、サンプリング周波数４４．１ＫＨｚ、量子化ビット数１６ビットで正負の符号を記録した場合を想定して以降説明する。サンプリング周波数４４．１ＫＨｚでサンプリングすると、１秒あたり４４１００個のサンプルにより構成されるサンプル列ができることになる。またここでは、音響信号が複数のチャンネルからなるので、各チャンネルごとにデジタル化が行われる。デジタル化された音響信号を模式的に示すと図２（ａ）のようになる。図２（ａ）は、２チャンネルのステレオ音響信号を示しており、Ｃｈ１にＬ（左）信号、Ｃｈ２にＲ（右）信号が記録されている。また、図２（ａ）から（ｄ）においては、左端が開始時刻であり、右端が終端時刻である。高さは各サンプルのビット数を示しており、本実施形態では、１６ビットとしている。なお、本装置の時系列信号入力手段１では、デジタル化後の音響信号を入力する。
【００１５】
（信号平坦部の処理）
このようにしてデジタル化されたデジタル音響信号であるサンプル列に対して、信号平坦部処理手段１０が、信号平坦部の処理を行う。信号平坦部とは、同一の信号レベルが連続する部分のことをいう。特に信号レベルが「０」の無音部、および信号レベルの絶対値が最大の飽和部に現れることが多い。無音部は実際に無音であるか、音が非常に小さく記録されなかった場合に生じるが、飽和部は、信号の録音およびＡ／Ｄ変換の過程において生じる。無音部、飽和部またはそれ以外の同一信号レベルが連続する場合のいずれであっても、信号平坦部は、同一の信号レベルが所定の時間（所定のサンプル数）連続して記録される。このため、この部分は圧縮し易いデータになっている。具体的には、信号平坦部の先頭時刻位置と、同一信号レベルが続くサンプルの個数と、信号レベル（サンプル値）の３つの値を信号平坦部データとして各チャンネルのサンプル列と分離して記録する。各チャンネルのサンプル列からは、信号平坦部が削除される。これを模式的に示すと図２（ｂ）（ｃ）に示すようになる。図２（ｂ）は、信号平坦部処理前のサンプル列である。図２（ｂ）において、網掛けで示した部分は信号平坦部を示す。信号平坦部処理手段１０の処理により、信号平坦部は元のサンプル列からは削除され、図２（ｃ）に示すようになる。ただし、復号時に元通りに復元するために、分離された信号平坦部は、信号平坦部データとして図２（ｅ）に示すような形式で記録しておく。
【００１６】
信号平坦部データは、上述のように、信号平坦部ごとに、その先頭時刻（サンプル番号）、サンプル数、サンプル値の３属性で記録する。ここで、先頭時刻とは、信号の開始位置からの時刻であり、図２（ｅ）の例では、先頭からのサンプル番号で記録している。このサンプル番号をサンプリング周波数で除算すれば、時刻に変換されることになる。サンプル数は、そのサンプル値がどの程度連続して続くかを示す情報である。なお、サンプル数の代わりに信号平坦部の終了時刻を記録するようにしても良い。サンプル値は、デジタル化された信号レベルを示している。本実施形態では、符号付き１６ビットで量子化しているので、最大値は「３２７６７」、最小値は「−３２７６８」となる。すなわち、「０」は無音部、「３２７６７」および「−３２７６８」は飽和部を示している。ただし、信号平坦部処理手段１０は、信号平坦部を無条件には処理しない。本発明は、データの圧縮を目的としているため、サンプル列の削減分よりも信号平坦部データが大きくなると意味がないからである。したがって、信号平坦部となるサンプルが所定数以上連続する場合に限り信号平坦部データを作成して各チャンネルのサンプル列から分離するのである。
【００１７】
（予測誤差への変換）
続いて、信号平坦部の処理が行われたサンプル列の各サンプルの値を、予測誤差変換手段２０が予測誤差値に変換する。あるサンプルにおける予測誤差値の算出は、時間的に過去に位置する直前の１つもしくは複数のサンプルの値を利用して行われる。本実施形態では、利用する直前のサンプル数を動的に変化させる手法を用いている。以下に、このような適応型線形予測符号化について説明する。予測誤差変換手段２０により行われる適応型線形予測符号化の処理概要を図３のフローチャートに示す。まず、あらかじめ準備された複数の予測計算式を用いて、各予測計算式に対応した線形予測誤差を算出する（ステップＳ１）。具体的には、サンプル番号ｔの予測誤差を算出する予測計算式として、以下の〔数式１〕〜〔数式４〕を用意している。
【００１８】
〔数式１〕
ｅ１（ｔ）＝ｘ（ｔ）−ｘ（ｔ−１）−ｅ１（ｔ−１）／２
【００１９】
〔数式２〕
ｅ２（ｔ）＝ｘ（ｔ）−２×ｘ（ｔ−１）＋ｘ（ｔ−２）−ｅ２（ｔ−１）／２
【００２０】
〔数式３〕
ｅ３（ｔ）＝ｘ（ｔ）−３×ｘ（ｔ−１）＋３×ｘ（ｔ−２）−ｘ（ｔ−３）−ｅ３（ｔ−１）／２
【００２１】
〔数式４〕
ｅ４（ｔ）＝ｘ（ｔ）−４×ｘ（ｔ−１）＋６×ｘ（ｔ−２）−４×ｘ（ｔ−３）＋ｘ（ｔ−４）−ｅ４（ｔ−１）／２
【００２２】
上記〔数式１〕〜〔数式４〕において、ｅ１（ｔ）〜ｅ４（ｔ）は各予測計算式による時刻ｔのサンプルにおける予測誤差であり、ｘ（ｔ）〜ｘ（ｔ−４）は時刻ｔ〜ｔ−４における振幅値である。
【００２３】
上記〔数式２〕における「２×ｘ（ｔ−１）−ｘ（ｔ−２）」、上記〔数式３〕における「３×ｘ（ｔ−１）−３×ｘ（ｔ−２）＋ｘ（ｔ−３）」、上記〔数式４〕における「４×ｘ（ｔ−１）−６×ｘ（ｔ−２）＋４×ｘ（ｔ−３）−ｘ（ｔ−４）」は過去の２〜４個のサンプルに基づく線形予測成分である。この線形予測成分、および、直前のサンプルにおいて算出された予測誤差「ｅ１（ｔ−１）／２」〜「ｅ４（ｔ−１）／２」（誤差フィードバック成分）を用いて時刻ｔにおける予測誤差ｅ１（ｔ）〜ｅ４（ｔ）を算出する。
【００２４】
続いて、上記各予測計算式別の予測誤差値の絶対値の累積である累積誤差が最小となる線形予測誤差をそのサンプルの予測誤差として選出する（ステップＳ２）。ここでは、累積誤差という考え方を用いている。具体的には、各予測計算式〔数式１〕〜〔数式４〕により算出された予測誤差の過去のサンプルについての累積値をＲ１〜Ｒ４として設定する。そして、この累積誤差Ｒ１〜Ｒ４のうち、最小となるものに対応する予測誤差を選出する。例えば、Ｒ１〜Ｒ４のうち、Ｒ２が最小であったとする。この場合、〔数式２〕で算出された予測誤差ｅ２（ｔ）を符号化対象とする予測誤差ｅ（ｔ）として選出することになる。選出された予測誤差ｅ（ｔ）はサンプルの元の値ｘ（ｔ）と置き換えられて以降処理が行われることになる。また、このとき用いられた予測式の次数をサンプル番号と対応付けて最適次数データとして記録する。「次数」とは、予測誤差の算出に過去いくつのサンプルを利用したかを示す数値であり、上記〔数式１〕〜〔数式４〕は１次〜４次に対応している。例えば、予測誤差ｅ２（ｔ）が予測誤差ｅ（ｔ）として選出された場合、次数は「２」となる。
【００２５】
続いて、累積誤差Ｒ１〜Ｒ４に各予測誤差ｅ１（ｔ）〜ｅ４（ｔ）の絶対値を加算する（ステップＳ３）。具体的には、以下の〔数式５〕に示すように、累積誤差値となる変数Ｒ１〜Ｒ４を更新していく。同時に、各サンプルの処理を行う度に、カウンタを１つづつ加算していく処理を行う。
【００２６】
〔数式５〕
Ｒ１←Ｒ１＋｜ｅ１（ｔ）｜
Ｒ２←Ｒ２＋｜ｅ２（ｔ）｜
Ｒ３←Ｒ３＋｜ｅ３（ｔ）｜
Ｒ４←Ｒ４＋｜ｅ４（ｔ）｜
【００２７】
続いて、カウンタが所定回数を超えたかどうかの判定を行う（ステップＳ４）。本実施形態では、この所定回数を１００回として設定している。すなわち、カウンタが１００を超えたかどうかの判定を行う。
【００２８】
この結果、カウンタが１００を超えていたら、累積誤差を半分にする（ステップＳ５）。具体的には、以下の〔数式６〕に示すように、累積誤差となる変数Ｒ１〜Ｒ４を２で除算する。同時に、カウンタを０にリセットする。すなわち、ここでのＲ１〜Ｒ４は純粋な意味での累積誤差ではなく、累積誤差の移動平均となっている。本実施形態では、直前の最大１００サンプルまでは累積されるが、それ以前のものは半分になるように処理する。これにより、時間的に離れたサンプルの影響が小さくなるようにしている。
【００２９】
〔数式６〕
Ｒ１←（Ｒ１）／２
Ｒ２←（Ｒ２）／２
Ｒ３←（Ｒ３）／２
Ｒ４←（Ｒ４）／２
【００３０】
上記ステップＳ１〜ステップＳ５の処理を時系列信号中の全時刻全サンプルに渡って実行することにより、全サンプルの値が元の振幅値ｘ（ｔ）から対象誤差ｅ（ｔ）に置き換えられることになる。ただし、予測誤差変換手段２０による処理は、各サンプルの値を変えるだけであるため、音響信号を模式的に示した状態は、図２（ｃ）に示した状態のままである。そのため、予測誤差変換手段２０による処理後のサンプルを、元のサンプルと区別するため、誤差サンプルと言うこともできる。
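Steps S1 to S5 above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the zero history before the first sample, truncation toward zero for the "/2" feedback term, integer halving of R1 to R4, and the tie-break toward the lowest order are all assumptions that the text does not specify.

```python
from typing import Dict, List, Tuple

# Coefficients of x(t-1), x(t-2), ... in the predicted value of [Formula 1]-[Formula 4].
COEFFS: Dict[int, List[int]] = {1: [1], 2: [2, -1], 3: [3, -3, 1], 4: [4, -6, 4, -1]}
HALVE_AFTER = 100  # counter threshold of step S4

def _half(v: int) -> int:
    """e(t-1)/2, truncated toward zero (assumption)."""
    return v // 2 if v >= 0 else -((-v) // 2)

def _predict(x: List[int], t: int, order: int) -> int:
    """Linear prediction component; samples before t = 0 are taken as 0 (assumption)."""
    return sum(c * (x[t - k] if t - k >= 0 else 0)
               for k, c in enumerate(COEFFS[order], start=1))

def adaptive_encode(x: List[int]) -> Tuple[List[int], List[int]]:
    prev = {n: 0 for n in COEFFS}   # e_n(t-1) for each formula
    R = {n: 0 for n in COEFFS}      # cumulative errors R1..R4
    counter = 0
    errors: List[int] = []
    orders: List[int] = []
    for t in range(len(x)):
        # S1: prediction error of every formula
        e = {n: x[t] - _predict(x, t, n) - _half(prev[n]) for n in COEFFS}
        # S2: pick the formula whose cumulative error so far is smallest
        best = min(COEFFS, key=lambda n: (R[n], n))
        errors.append(e[best])
        orders.append(best)          # optimal-order data
        # S3: update cumulative errors and the counter
        for n in COEFFS:
            R[n] += abs(e[n])
        counter += 1
        # S4, S5: halve the cumulative errors once the counter exceeds 100
        if counter > HALVE_AFTER:
            R = {n: R[n] // 2 for n in COEFFS}
            counter = 0
        prev = e
    return errors, orders

def adaptive_decode(errors: List[int], orders: List[int]) -> List[int]:
    x: List[int] = []
    prev = {n: 0 for n in COEFFS}
    for t, (err, best) in enumerate(zip(errors, orders)):
        x.append(0)  # placeholder; _predict only reads x[t-1] and earlier
        x[t] = err + _predict(x, t, best) + _half(prev[best])
        prev = {n: x[t] - _predict(x, t, n) - _half(prev[n]) for n in COEFFS}
    return x
```

Because the decoder recomputes every e_n(t) from the reconstructed samples, it can supply the feedback term for whichever order the recorded optimal-order data names, so `adaptive_decode(*adaptive_encode(x))` returns `x` unchanged.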
【００３１】
（チャンネル間演算）
次に、予測誤差値が記録された各チャンネルのサンプル列に対して、チャンネル間演算手段３０によりチャンネル間の差分演算が行われる。これは、同一時刻におけるサンプル値の差分を単純にとることにより行われる。差分演算の結果は、一方のチャンネルのサンプル列として与え、他方のチャンネルのサンプル列の値は、元のままとしておく。具体的には、図２（ｃ）に示すような２チャンネルのステレオ音響信号の場合Ｃｈ１にはＬ信号の値をそのまま記録しておき、Ｃｈ２にはＲ−Ｌの差分値を与える。一般に、ステレオ音響信号では、同一時刻におけるそれぞれのデータには相関があり、各時刻における両データの差分値は元の値に比べて小さな値となる。これは線形予測による予測符号後の値であっても同じである。そのため、図２（ｄ）の例では、Ｃｈ２における各サンプルの値が小さくなり、後に圧縮できる余地が大きくなる。
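The inter-channel difference described above can be sketched as follows for the two-channel case. The function names are illustrative; the sketch keeps Ch1 as L and stores R−L in Ch2, as the text states.

```python
from typing import List, Tuple

def inter_channel_diff(left: List[int], right: List[int]) -> Tuple[List[int], List[int]]:
    """Ch1 keeps the L values; Ch2 receives R - L for each sample time."""
    return list(left), [r - l for l, r in zip(left, right)]

def inter_channel_restore(ch1: List[int], ch2: List[int]) -> Tuple[List[int], List[int]]:
    """Inverse operation used at decoding: R = (R - L) + L."""
    return list(ch1), [d + l for l, d in zip(ch1, ch2)]

ch1, ch2 = inter_channel_diff([100, 102, 98], [101, 100, 99])
print(ch2)  # [1, -2, 1]
```

When the two channels are correlated, the values in Ch2 are small, which leaves more room for the later variable-length coding.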
【００３２】
（相関フレーム検出）
続いて、チャンネル間演算が行われた各チャンネルのサンプル列に対して、相関フレーム検出手段４０が、所定の区間長をもつフレームを設定して、設定されたフレーム間の比較を行う。本実施形態では、フレーム長をサンプル列の開始時刻から終了時刻までの全区間に渡って固定長としている。具体的には、１フレームを５１２サンプルとしている。相関フレーム検出手段４０は、各チャンネルのサンプル列の先頭から５１２サンプルずつを１フレームとして設定し、フレーム間で全サンプルが一致する相関フレームを求めていくことになる。具体的な手順を図４のフローチャートに従って説明する。
【００３３】
まず、相関フレーム検出手段４０は、所定のサンプル数単位でフレーム化を行う（ステップＳ１１）。本実施形態では、上述のようにフレーム長をサンプル列の開始時刻から終了時刻までの全区間に渡って固定長５１２サンプルとしている。相関フレーム検出手段４０は、図５（ａ）に示すように、サンプル列の先頭から５１２サンプルずつを１フレームとして設定していくことになる。
【００３４】
次に、各フレームに対して構成するサンプル値が全て一致するフレームを探索する。具体的には、図５（ｂ）に示すように、まず、設定されたフレームのうち、時間的に最後尾のフレームを、相関フレームを探すための対象フレームとする。次に、所定の探索範囲内において、対象フレームの先頭サンプルの値と同一の値をもつサンプルを、時間的に遡りながら探索していく（ステップＳ１２）。例えば、図６（ａ）に示すように、対象フレームがｋＴ〜ｋＴ＋５１１の５１２個のサンプルで構成されているとする。この場合、まず、対象フレームの先頭サンプルｋＴのサンプル値ｅ（ｋＴ）と同一となるサンプルを探索していく。サンプルｋＴ−１、サンプルｋＴ−２と順に探索していく。なお、図６において、ｋは先頭からｋ番目のフレームであることを示し、Ｔはフレーム長（本実施形態では５１２サンプル）を示している。
【００３５】
一致するサンプルｔが見つかったら（ステップＳ１３）、次に、そのサンプルｔの次のサンプルｔ＋１と対象フレームの２番目のサンプルｋＴ＋１が一致するかどうかを比較する。このようにしてサンプルの値が一致する限り後続するサンプル同士の比較を行っていく（ステップＳ１４）。ステップＳ１４においては、ｅ（ｔ＋ｐ）とｅ（ｋＴ＋ｐ）の値が一致する限り、処理を繰り返していく。例えば、図６（ｂ）に示す例では、ｅ（ｔ）〜ｅ（ｔ＋８）がｅ（ｋＴ）〜ｅ（ｋＴ＋８）と一致しているので、さらにｐ＝９として、ステップＳ１４の処理が続けられることになる。ｐ＝０〜ｐ＝５１１までの全てのｅ（ｔ＋ｐ）とｅ（ｋＴ＋ｐ）が一致した場合（ステップＳ１５）、そのサンプル列を対象フレームに対する相関フレームとし、相関フレームの先頭のサンプル番号と対象フレームの先頭のサンプル番号とを対応付けてフレーム相関データとして記録し、対象フレームを元のサンプル列から削除する（ステップＳ１６）。対象フレームの全サンプルと一致しなければ、さらに対象フレームの先頭サンプルと値が一致するサンプルが存在するかどうかを時間的に遡りながら探索していく。所定のサンプル数分遡っても一致する相関フレームが存在しない場合は、その対象フレームに関する相関フレームの探索を中止し、対象フレームの直前のフレームを新たな対象フレームとして相関フレームの探索を行う。１つの対象フレームに対しての処理が終わったら、ステップＳ１２に戻って、１つ直前のフレームを新たな対象フレームとして処理を続けていく（ステップＳ１７）。このようにして、時系列信号の先頭時刻近辺に位置するフレームを除く全フレームを対象フレームとして相関フレームの検出処理を行う。
【００３６】
時系列信号のサンプル列全体でみると、図５（ｃ）に示すように対象フレームに対応する相関フレームが検出されたとすると、図５（ｄ）に示すように対象フレームが削除されることになる。このとき、復号時に完全に復元できるように図５（ｅ）に示すようなフレーム相関データが記録される。図５（ｅ）に示すように、フレーム相関データには対象フレームの先頭のサンプル番号と相関フレームの先頭のサンプル番号が対応づけて記録される。
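The correlation frame search of steps S11 to S17 can be sketched as follows. This is a simplified illustration under stated assumptions: `search_range` is a hypothetical value for the "predetermined number of samples" searched backward, a slice comparison stands in for the sample-by-sample comparison of step S14, and the restore function is not described in this form in the text.

```python
from typing import List, Tuple

def detect_correlated_frames(e: List[int], frame_len: int = 512,
                             search_range: int = 4096
                             ) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Return (remaining samples, frame correlation data).

    Each entry of the frame correlation data is
    (target frame start, correlation frame start), in original sample numbers.
    """
    n_frames = len(e) // frame_len
    pairs: List[Tuple[int, int]] = []
    # Start from the last frame; the first frame is never a target frame.
    for k in range(n_frames - 1, 0, -1):
        start = k * frame_len
        target = e[start:start + frame_len]
        lowest = max(0, start - search_range)
        for t in range(start - 1, lowest - 1, -1):   # walk back in time
            if e[t:t + frame_len] == target:          # all samples match
                pairs.append((start, t))
                break
    deleted = {s for s, _ in pairs}
    kept: List[int] = []
    for i in range(0, len(e), frame_len):
        if i not in deleted:
            kept.extend(e[i:i + frame_len])
    return kept, sorted(pairs)

def restore_frames(kept: List[int], pairs: List[Tuple[int, int]],
                   frame_len: int = 512) -> List[int]:
    out = list(kept)
    for target_start, corr_start in sorted(pairs):
        frame: List[int] = []
        for p in range(frame_len):
            src = corr_start + p
            # The correlation frame may overlap the target frame itself.
            frame.append(out[src] if src < target_start else frame[src - target_start])
        out[target_start:target_start] = frame
    return out
```

Restoring target frames in ascending order works because every correlation frame starts earlier in time than its target frame, so the samples it refers to have already been rebuilt.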
【００３７】
（上位ビットと下位ビットの分離）
続いて、データ分離手段５０が、各サンプルの上位ビットと下位ビットの分離を行う。実際に、分離を行う前に前処理として、正負の値をとる各サンプルの値を、正負の極性が付いたビット列に変換する。具体的には、１６ビットで正負の値を表現しているビット列を、先頭の１ビットを正負の極性符号とし、他の１５ビットで絶対値を表すように変換する。このように変換した場合、「０」については、極性符号が必要ないため、省略が可能となる。これにより、値が「０」のサンプル数×１ビット分が削減できることになる。
【００３８】
極性処理が行われたら、次に、データ分離手段５０は、各サンプルの上位ビットと下位ビットの分離を実行する。例えば、音響信号をＰＣＭによりデジタル化する際に、量子化ビット数１６でサンプリングした場合、各サンプルは１６ビットで表現されている。この場合、本実施形態では、上位ビット１２ビットと、下位ビット４ビットに分離する。この分離は、基本的に、Ａ／Ｄ変換機等、音響信号をデジタル化する際に用いる回路の熱雑音を分離するために行う。そのため、熱雑音であると考えられる下位ビットを分離するのである。下位ビットとして、どの程度分離するかは、音源や利用した回路の特性によっても変化するが、通常量子化ビット数の１／４程度とすることが望ましい。したがって、ここでは、１６ビットの１／４にあたる４ビットを下位ビットとして分離しているのである。本発明においては、特に、この上位ビットと下位ビットの分離を予測誤差に変換した後に行うことを特徴としている。これは、予測誤差への変換を上位ビットと下位ビットの分離後に上位サンプルに対して行うと、たとえ予測誤差への変換により圧縮可能な成分が下位ビットのなかに含まれていても、圧縮処理が行われないため、全体的に圧縮効率が低下する場合があるためである。
【００３９】
ここで、データ分離手段５０によるデータ分離の様子を図７に模式的に示す。図７において、Ｈは上位ビットもしくは上位サンプルデータを示し、Ｌは下位ビットもしくは下位サンプルデータを示す。図７（ａ）は分離前のサンプルデータである。データ分離手段５０により、サンプルデータは、図７（ｂ）に示す上位サンプルデータと図７（ｃ）に示す下位サンプルデータに分離されることになる。なお、上位ビットに含まれる符号ビットは、そのまま上位サンプルデータに含まれて分離される。図７の例で、「Ｈ４」として示したように、前処理により符号ビットが削除されている場合には、符号ビットのない上位サンプルデータとなる。上記のようにして分離されたサンプルデータは、以降別々に処理されることになる。
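The polarity preprocessing and the upper/lower bit separation can be sketched as follows. This is an illustrative sketch only: it keeps the polarity as a separate field rather than packing it into the upper 12 bits, and it assumes that the magnitude of each sample fits in 15 bits. The function names are hypothetical.

```python
from typing import Optional, Tuple

def split_sample(value: int, lower_bits: int = 4) -> Tuple[Optional[int], int, int]:
    """Return (sign, upper, lower) for one prediction-error sample.

    sign is None for the value 0, which needs no polarity bit,
    otherwise 0 for positive and 1 for negative.
    """
    magnitude = abs(value)
    sign = None if value == 0 else (1 if value < 0 else 0)
    upper = magnitude >> lower_bits               # upper bits of the magnitude
    lower = magnitude & ((1 << lower_bits) - 1)   # lower 4 bits by default
    return sign, upper, lower

def merge_sample(sign: Optional[int], upper: int, lower: int,
                 lower_bits: int = 4) -> int:
    """Inverse of split_sample, as performed by the data integration step at decoding."""
    magnitude = (upper << lower_bits) | lower
    return -magnitude if sign == 1 else magnitude

print(split_sample(-1234))  # (1, 77, 2)
```

With `lower_bits=4`, a 16-bit sample is divided 12/4 as in the embodiment; changing the argument corresponds to changing the separation position.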
【００４０】
（上位サンプルの符号化）
次に、上位サンプル符号化手段６０が、分離された上位サンプルの符号化を行う。まず、各チャンネルの上位サンプル列に対して、信号平坦部の処理を行う。上位サンプル符号化手段６０が行う信号平坦部処理は、信号平坦部処理手段１０が行った処理と全く同じである。すなわち、上位サンプル列中で同一の信号レベルが連続する部分を、信号平坦部の先頭時刻位置と、同一信号レベルが続くサンプルの個数と、信号レベル（サンプル値）の３つの値で構成される上位信号平坦部データとして、各チャンネルの上位サンプル列と分離して記録する。上位信号平坦部データは、図２（ｅ）に示した信号平坦部データと同様の形式で記録される。
【００４１】
続いて、上位サンプル符号化手段６０が、固定長の上位サンプル列を可変長に変換する。まず、ビット構成の変換を行うために利用するルックアップテーブルの作成を行う。ルックアップテーブルの作成にあたって、上位サンプル列の全時刻に渡って、各上位サンプル値のヒストグラムを算出する。各上位サンプル値は上記データ分離手段５０により、全て絶対値化されているので、正負の区別なくヒストグラムを算出する。その結果、サンプル絶対値の種類が６４０以上となった場合、セパレータビットを２ビット固定値「００」とし、サンプル絶対値の種類が６３９以下となった場合、セパレータビットを１ビット固定値「０」とする。さらに、出現頻度の高いサンプル絶対値から順に、少ないビット数のビットパターンを割り当てていく。この際、割り当てるビットパターンには規則が有り、最上位ビットは必ず「１」とすると共に、セパレータビットが２ビット「００」の場合は「００１」のビットパターンを含むビットパターンは禁止し、セパレータビットが１ビット「０」の場合は「０１」のビットパターンを含むビットパターンは禁止する。また、セパレータビットが２ビット「００」の場合のルックアップテーブルは１つだけであるが、セパレータビットが１ビット「０」の場合のルックアップテーブルは、サンプル絶対値の種類が３２０以上の場合と、３２０未満の場合で異なるものを作成するようにしている。サンプル絶対値の種類の数に応じたルックアップテーブルの例を図８、図９に示す。
【００４２】
上記のようにして作成されたルックアップテーブルを用いて、上位サンプル符号化手段６０は、１２ビット固定長の連続する上位サンプルデータを、可変長のビットパターンに変換していく。可変長になるため、変換後の各データの区切りを区別する必要が生じる。そのため、本実施形態では、各データ間に上述のような１ビットもしくは２ビットのセパレータビットを挿入する。サンプル値の種類が３２０未満の場合、各順位のデータを表現するためのビット列、およびビット数は、図８（ａ）に示すようになる。図８（ａ）において、順位０位は、最もビット数が少ない１ビット「１」で表現される。図８（ａ）においては、変換前ビット列は省略してあるが、最も頻繁に現れるビット列が１ビット「１」に変換されることになる。また、各可変長ビットには、セパレータが必ず付加されるので、順位０位のデータを表現するためには、２ビットが必要となることになる。図８（ａ）に示すサンプル値の種類が３２０未満の場合は、セパレータビットが１ビット「０」であるため、「０１」のビットパターンは割り当てられないことになる。
【００４３】
また、サンプル値の種類が３２０以上６４０未満の場合、各順位のデータを表現するためのビット列、およびビット数は、図８（ｂ）に示すようになる。図８（ｂ）は、図８（ａ）に示したルックアップテーブルの各ビット列の最上位１ビットに後続して１ビットを付加したものを新たなビット列としている。例えば、図８（ｂ）において順位０位の「１０」と順位１位の「１１」は、図８（ａ）において順位０位の「１」に１ビット「０」と「１」をそれぞれ付加したものであり、図８（ｂ）において順位２位の「１００」と順位３位の「１１０」は、図８（ａ）において順位１位の「１０」の２ビット目に１ビット「０」と「１」をそれぞれ付加したものである。図８（ｂ）においても。各可変長ビットには、セパレータが必ず付加されるので、順位０位のデータを表現するためには、３ビットが必要となることになる。図８（ｂ）の例では、セパレータビットが１ビット「０」であるため、「０１」のビットパターンは割り当てられないことになるが、データの読出しの順序を工夫することにより復号時には正しいデータが抽出できるようになっている。
【００４４】
また、セパレータビットが２ビット「００」の場合、各順位のデータを表現するためのビット列、およびビット数は、図９に示すようになる。図９において、順位０位は、最もビット数が少ない１ビット「１」で表現される。図９においても、変換前ビット列は省略してあるが、最も頻繁に現れるビット列が１ビット「１」に変換されることになる。また、各可変長ビットには、セパレータが必ず付加されるので、順位０位のデータを表現するためには、３ビットが必要となることになる。図９の例では、セパレータビットが２ビット「００」であるため、「００１」のビットパターンは割り当てられないことになる。
【００４５】
図１０（ａ）（ｂ）に、上位サンプル符号化手段６０によるデータ変換の様子を模式的に示す。図１０（ａ）（ｂ）はいずれもサンプル列の上位部分に対応しており、図１０（ａ）は固定長の上位サンプルが連続して記録されている様子を示している。図１０（ａ）に示したような上位サンプル列は、図８（ａ）（ｂ）および図９に示したルックアップテーブルを用いて図１０（ｂ）に示すように変換されることになる。
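The variable-length conversion with a 1-bit separator can be sketched as follows for the case of fewer than 320 kinds of values. The text confirms only the first few patterns of FIG. 8(a) ("1", "10", ...); generating the remaining patterns shortest-first is an assumption, as are the function names. The patterns obey the stated rules: the most significant bit is "1" and the pattern never contains "01".

```python
from collections import Counter
from itertools import count
from typing import Dict, Iterator, List

def allowed_patterns() -> Iterator[str]:
    """Patterns starting with '1' that never contain '01', shortest first."""
    for length in count(1):
        for ones in range(1, length + 1):
            yield "1" * ones + "0" * (length - ones)

def build_table(upper_values: List[int]) -> Dict[int, str]:
    """Assign shorter patterns to more frequent values (histogram ranking)."""
    ranked = [v for v, _ in Counter(upper_values).most_common()]
    if len(ranked) >= 320:
        raise ValueError("sketch covers only the table for fewer than 320 kinds")
    return dict(zip(ranked, allowed_patterns()))

def vlc_encode(upper_values: List[int], table: Dict[int, str]) -> str:
    # Each pattern is followed by the 1-bit separator "0".
    return "".join(table[v] + "0" for v in upper_values)

def vlc_decode(bits: str, table: Dict[int, str]) -> List[int]:
    inverse = {p: v for v, p in table.items()}
    out: List[int] = []
    token = ""
    for b in bits:
        # A "01" transition can only be separator + start of the next pattern.
        if b == "1" and token.endswith("0"):
            out.append(inverse[token[:-1]])
            token = ""
        token += b
    if token:
        out.append(inverse[token[:-1]])
    return out

values = [0, 0, 1, 0, 2, 0, 1]
table = build_table(values)
print(vlc_encode(values, table))  # 10101001011010100
```

In this example seven samples take 17 bits, compared with 84 bits at a fixed length of 12 bits per sample. Storing the lookup table itself, which the code data also contains, is omitted here.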
【００４６】
（下位サンプルの符号化）
一方、下位サンプルデータは、下位サンプル符号化手段７０により処理される。具体的には、データ分離手段５０により分離された下位２ビットのデータを連続に配置していく。
【００４７】
（符号データの記録）
以上のようにして得られた符号データは、図１１に示すようになる。すなわち、上位可変長サンプル列、ルックアップテーブル、上位信号平坦部データ、下位固定長サンプル列、フレーム相関データ、信号平坦部データ、チャンネル間データとなる。これらのデータはその符号化過程において記憶手段２に記憶されているので、このデータを記録すべき記録媒体に合わせたフォーマットで記録する。
【００４８】
（符号データの分析）
符号データは、分析手段３により分析される。分析手段３の処理について図１２のフローチャートを用いて説明する。まず、量子化雑音成分のデータ量を算出する（ステップＳ２１）。これは、符号データ中の下位固定長サンプル列のデータ量を計測することにより算出される。本装置における圧縮では、もともと下位の所定のビット数を量子化成分として分離し、それを固定長で符号化しているため、この下位固定長サンプル列のデータ量が原デジタル音響信号の量子化雑音成分であると推測できるのである。次に、フレーム相関データのデータ量を算出する（ステップＳ２２）。これは、フレーム相関データのデータ量を計測することにより行う。続いて、元のサンプル列から削除された対象フレームのデータ量を算出する（ステップＳ２３）。これは、フレーム相関データの内容から削除された対象フレームのデータ量を算出することにより行う。具体的には、フレーム相関データ内の対象フレームにフレーム長であるサンプル数（本実施形態では５１２サンプル）および各サンプルのビット数（本実施形態では１６ビット）を乗じることにより算出する。
【００４９】
次に、信号平坦部データのデータ量を算出する（ステップＳ２４）。これは、符号データ中の信号平坦部データのデータ量を計測することにより行われる。続いて、原デジタル音響信号の信号平坦部のデータ量を算出する（ステップＳ２５）。これは、信号平坦部データの内容から元の信号平坦部のデータ量を算出する。具体的には、信号平坦部データ内のサンプル数に必要なビット数（本実施形態では１６ビット）を乗じることにより算出する。
【００５０】
次に、上位可変長サンプル列のチャンネル別のデータ量を算出する。（ステップＳ２６）。これは、符号データ中の上位可変長サンプル列のチャンネルごとのデータ量を計測することにより行われる。続いて、原デジタル音響信号の線形予測対象成分のデータ量を算出する（ステップＳ２７）。これは、原デジタル音響信号のデータ量から、上記ステップＳ２１において算出した量子化雑音成分のデータ量、上記ステップＳ２３において算出した削除フレームのデータ量、および上記ステップＳ２５において算出した原信号平坦部のデータ量を減じることにより算出される。各チャンネルにおけるデータ量は同じであるため、さらにチャンネル数で除算することにより各チャンネル別の線形予測対象成分が算出される。最後に、算出した各データの原音響信号に対する割合を算出する（ステップＳ２８）。
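The calculations of steps S21 to S28 for the original signal can be sketched as follows. The function and parameter names are hypothetical, and the figures in the usage example are made up for illustration; they are not measurements from the patent.

```python
from typing import Dict

def analyze_original(original_bits: int, lower_fixed_bits: int,
                     n_deleted_frames: int, flat_sample_count: int,
                     n_channels: int, frame_len: int = 512,
                     bits_per_sample: int = 16) -> Dict[str, float]:
    """Return each component of the original signal as a percentage of its size."""
    quantization_noise_bits = lower_fixed_bits                              # S21
    deleted_frame_bits = n_deleted_frames * frame_len * bits_per_sample     # S23
    flat_part_bits = flat_sample_count * bits_per_sample                    # S25
    prediction_target_bits = (original_bits - quantization_noise_bits
                              - deleted_frame_bits - flat_part_bits)        # S27
    components = {
        "quantization_noise": quantization_noise_bits,
        "deleted_frames": deleted_frame_bits,
        "flat_parts": flat_part_bits,
        "prediction_target_per_channel": prediction_target_bits / n_channels,
    }
    return {name: 100.0 * bits / original_bits                              # S28
            for name, bits in components.items()}

# Illustrative input: 1 second of 16-bit stereo at 44.1 kHz (1,411,200 bits),
# 8,200 flat samples, 10 deleted frames, 4 lower bits for each remaining sample.
ratios = analyze_original(original_bits=1_411_200, lower_fixed_bits=299_520,
                          n_deleted_frames=10, flat_sample_count=8_200,
                          n_channels=2)
for name, pct in ratios.items():
    print(f"{name}: {pct:.1f}%")
```

The per-channel prediction target is reported once, so the total is recovered by adding it `n_channels` times to the other three components.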
【００５１】
分析手段３により分析された情報は、表示手段４に表示される。ここで、このときの表示画面の様子を図１３に示す。図１３において、上段は圧縮前の原音響信号の構成比率を示したものであり、下段は圧縮後の符号データの構成比率を示したものである。図１３に示した各構成データは実際には色分けされて表示される。また、各構成データと共に、各構成データの原デジタル音響信号に対する割合が百分率で％表示される。各構成データの割合は、圧縮後の各構成データについても、原デジタル音響信号に対する割合で算出され、表示される。また、上段と下段において、対応するデータは同色で表示する。例えば、線形予測符号化対象成分Ｌは、予測符号化圧縮成分Ｌと同色で表示する。なお、図１３の各データにおけるＬ、Ｒはチャンネルを示している。図１３の例では、２チャンネルのステレオ音響信号を対象として行ったため、２チャンネル分が示されている。図１３の例では、全体としてデータ量が５０％程度に圧縮されており、特に、予測符号化対象成分、信号平坦部、削除フレームの部分が圧縮率に大きく貢献していることがわかる。
【００５２】
（最適次数の出力）
また、分析手段３は、予測誤差変換手段２０により記録された最適次数データを所定の表示形式に変換して表示手段４に表示させる。このときの表示手段４の画面の様子を図１４に示す。図１４において、横軸は時刻、縦軸は次数である。図１４に示したような形式で表示させることにより、どのような予測式を用いれば最適な圧縮を行うことができるかの参考になる。
【００５３】
（フレーム相関の出力）
また、分析手段３は、相関フレーム検出手段４０により記録されたフレーム相関データを所定の表示形式に変換して表示手段４に表示させる。このときの表示手段４の画面の様子を図１５に示す。図１５において、上段下段共に時系列のサンプル列を矩形で示している。また、横軸は時刻であり、矩形の左端は開始時刻、右端は終了時刻を示している。上段に示したサンプル列中の上下方向の線分は相関フレーム、下段に示したサンプル列中の上下方向の太い線分は対象フレームを示している。上段のサンプル列も下段のサンプル列も同じものを示しているが、分けて表示しているのは、対象フレームと相関フレームの関係をわかりやすく示すためである。対応する相関フレームと対象フレームは点線で結んで示している。図１５の例では、１１個の対象フレームに対して１１個の相関フレームが検出されたことを示している。図１５からわかるように、相関フレームは必ず対象フレームよりも時間的に過去のものになっている。図１５に示すような分析データを可視情報として出力することにより、その時系列信号にどの程度の相関があるか等の情報を得ることができる。効果的な圧縮を検討するのに役立つ。
【００５４】
（予測誤差成分の音声出力）
上記のような視覚的な分析とは別に、本装置では、符号データの一部、あるいは符号データの作成過程において生じるデータを、音響信号として出力する機能も有している。予測誤差変換手段２０により得られたサンプル列を音響信号変換手段により変換した後出力すると、予測誤差成分を音声として出力することができる。同時に、その信号波形を表示手段４により出力する。これにより、本来は予測誤差成分であるデータを新たな音響信号とする特殊効果的な音響信号が得られる。例えば、図１６に示すような波形を示すＰＣＭ音響信号に対して処理を行うと、図１７に示すような波形の予測誤差成分が得られる。
【００５５】
（上位予測誤差成分の音声出力）
また、上位可変長サンプル列の生成過程において生じる上位固定長サンプル列を音響信号変換手段により変換した後出力すると、予測誤差成分の上位ビット成分を音声として出力することができる。同時に、その信号波形を表示手段４により出力する。これにより、本来は予測誤差成分の主成分であるデータを新たな音響信号とする特殊効果的な音響信号が得られる。例えば、図１６に示すような波形を示すＰＣＭ音響信号に対して処理を行うと、図１８に示すような波形の上位予測誤差成分が得られる。
【００５６】
（下位予測誤差成分の音声出力）
また、下位固定長サンプル列を音響信号変換手段により変換した後出力すると、予測誤差成分の下位ビット成分を音声として出力することができる。同時に、その信号波形を表示手段４により出力する。これにより、本来は予測誤差成分の量子化雑音成分であるデータを新たな音響信号とする特殊効果的な音響信号が得られる。例えば、図１６に示すような波形を示すＰＣＭ音響信号に対して処理を行うと、図１９に示すような波形の下位予測誤差成分が得られる。
【００５７】
（復号）
次に、上記符号化装置により符号化された符号データの復号について説明する。図２０は、本発明に係る時系列信号の復号装置の構成を示す機能ブロック図である。図２０において、９１はデータ読込手段、９２は上位サンプル変換手段、９３はデータ統合手段、９４はフレーム復元手段、９５はチャンネル復元手段、９６は独立サンプル復元手段、９７は信号平坦部挿入手段である。図２０に示す構成は、コンピュータおよびコンピュータに搭載される専用のソフトウェアプログラムにより実現される。
【００５８】
続いて、図２０に示した復号装置の処理動作について説明する。まず、図１１に示したような符号データを記録した記録媒体を、データ読込手段９１が読み込む。データ読込手段９１は、読み込んだデータのうち、上位可変長サンプル列とルックアップテーブルを、上位サンプル変換手段９２に渡す。上位サンプル変換手段９２では、ルックアップテーブルを参照することにより、上位可変長サンプル列から、１２ビット（値が「０」のものについては１１ビット）固定長の上位固定長サンプル列を復元してゆく。この際、ルックアップテーブルが図８（ａ）もしくは図９に示したものである場合には、上位可変長サンプル列のビットデータを順番に読み込んで復元していけば問題ないが、図８（ｂ）に示したようなルックアップテーブルである場合には、変換時に工夫が必要となる。この場合、セパレータビットが１ビット「０」であるため、「０１」のビットパターンは本来禁止されるはずであるが、図８（ｂ）に示すように、変換後ビット列には、「０１」のビットパターンを含むものがある。そこで、本実施形態では、ビットパターンの書き込み順序を変更することで対応している。具体的には、図８（ａ）または図９の場合、常に１となる先頭ビットを最後に書き込むようにし、２ビット目から書き込むようにし、図８（ｂ）の場合、１および２ビット目を最後に書き込むようにし、３ビット目から書き込むようにしている。例えば、順位４位のビット列「１０１」は「０１」のビットパターンを含むが、このようなビット列の場合、まず３ビット目の「１」から読み込まれ、セパレータビットと第１ビットから構成される「０１」パターンを認識して、２ビット目が最後に読まれることになるため、セパレータの誤認識が生じない。この場合、上位サンプル変換手段９２は「１０１」のビット列を認識し、ルックアップテーブルに従って元の固定長ビット列が復元できる。
【００５９】
さらに、上位サンプル変換手段９２は読み込んだ上位信号平坦部データを上位固定長サンプル列の所定の位置に挿入していく。続いて、データ統合手段９３が上位固定長サンプル列と下位固定長サンプル列を統合する。具体的には、上位固定長サンプル列から１２ビットを抽出し、下位固定長サンプル列から４ビットを抽出して順次統合する処理を行う。さらに、続いて、データ統合手段９３は、正負の正負極性部１ビットと数値部１５ビットで表現されたサンプル列を正負の数値をとる１６ビットに変換する。
【００６０】
フレーム復元手段９４は、このようなサンプル列に対して、フレーム相関データで定義されている相関フレームに対応する区間のサンプル列と同一のサンプル列をもつ区間を、フレーム相関データで定義されている対象フレームのアドレス位置に挿入することにより、フレームを復元する。この結果、図２（ｄ）に示すようなサンプル列が復元される。さらに、チャンネル復元手段９５がチャンネル間情報を用いて、どのチャンネルのサンプル列が元のままであるか、どのチャンネルのサンプル列がどのチャンネルのサンプル列との差分情報となっているかを認識して、サンプル列を復元する。この時点で各サンプルの値は前１つから４つまでのいずれかの個数のサンプル値に基づく予測誤差で記録されているので、独立サンプル復元手段９６が、上記〔数式１〕〜〔数式４〕の左辺の項と右辺第１項を交換した式に基づいて、元のサンプル値ｘ（ｔ）を順次復元してゆく。最後に、信号平坦部挿入手段９７は、図２（ｅ）に示したような信号平坦部データを用いて、図２（ｂ）に示すようにサンプル列の所定の位置に信号平坦部を挿入する。これにより、アナログ信号をＰＣＭ化した状態のデジタル音響信号がデータの欠落無く復元されることになる。
【００６１】
【発明の効果】
以上、説明したように本発明によれば、時系列のサンプル列で構成される時系列信号に対して、サンプル列の各サンプルの値を、時間的に過去の複数のサンプルからの予測誤差値に変換し、予測誤差値に変換された各サンプル値を表現するビットデータを分断する位置を設定し、設定されたビット位置で分断し、上位ビットのサンプル列で構成される上位サンプル列と、下位ビットのサンプル列で構成される下位サンプル列とに分離し、上位サンプル列に対しては、可変長符号で符号化を行い、下位サンプル列に対しては、固定長符号で符号化を行、時系列信号の中で上位サンプル列に対応するデータの割合と、下位サンプル列に対応するデータの割合と、上位サンプルの符号化で符号化されたデータの割合と、下位サンプルの符号化で符号化されたデータの割合を表示するようにしたので、時系列信号を効率的に可逆圧縮すると共に、圧縮されたデータの圧縮精度を解析することが可能となるという効果を奏する。
【図面の簡単な説明】
【図１】本発明に係る時系列信号の符号化装置の一実施形態を示す機能ブロック図である。
【図２】信号平坦部処理手段１０およびチャンネル間演算手段３０による処理の様子を示す図である。
【図３】予測誤差変換手段２０による予測誤差算出処理の様子を示す図である。
【図４】フレーム間演算手段４０による処理を示すフローチャートである。
【図５】フレーム間演算手段４０の処理による時系列信号全体の様子を示す図である。
【図６】フレーム間演算手段４０の処理により比較されるサンプルの様子を示す図である。
【図７】データ分離手段５０による処理の様子を示す図である。
【図８】サンプル絶対値の種類が６４０未満の場合のルックアップテーブルの一例を示す図である。
【図９】サンプル絶対値の種類が６４０以上の場合のルックアップテーブルの一例を示す図である。
【図１０】上位サンプルのビット長の変換を模式的に示す図である。
【図１１】本発明に係る時系列信号の符号化装置により得られる符号データを示す図である。
【図１２】分析手段３による処理の様子を示すフローチャートである。
【図１３】分析手段３により処理された各データ割合の表示例を示す図である。
【図１４】表示手段４に表示された最適次数データを示す図である。
【図１５】表示手段４に表示されたフレーム相関を示す図である。
【図１６】原音響信号の波形を示す図である。
【図１７】図１６の原音響信号の処理により得られる予測誤差成分の波形を示す図である。
【図１８】図１６の原音響信号の処理により得られる上位予測誤差成分の波形を示す図である。
【図１９】図１６の原音響信号の処理により得られる下位予測誤差成分の波形を示す図である。
【図２０】時系列信号の復号装置を示す機能ブロック図である。
【符号の説明】
１・・・時系列信号入力手段
２・・・記憶手段
３・・・分析手段
４・・・表示手段
５・・・分離位置手段
６・・・音響信号変換手段
７・・・音声出力手段
１０・・・信号平坦部処理手段
２０・・・予測誤差変換手段
３０・・・チャンネル間演算手段
４０・・・フレーム間演算手段
５０・・・データ分離手段
６０・・・上位サンプル符号化手段
７０・・・下位サンプル符号化手段
９１・・・データ読込手段
９２・・・上位サンプル変換手段
９３・・・データ統合手段
９４・・・フレーム復元手段
９５・・・チャンネル復元手段
９６・・・独立サンプル復元手段
９７・・・信号平坦部挿入手段

[0001]
[Industrial application fields]
The present invention relates to a data compression analysis technique suitable for music production fields such as music production, the storage of acoustic material, and the relay of location material; in particular for special effects based on the analysis, classification, and processing of acoustic signals; and for fields such as the analysis and diagnosis of biological signals in telemedicine.
[0002]
[Prior art]
Various methods have conventionally been used to compress acoustic signals. Methods such as MP3 (MPEG-1 / Layer3) and AAC (MPEG-2 / Layer3) have been put into practical use for compressing and encoding acoustic signals. Such compression encoding methods make it possible to handle an acoustic signal as a small amount of data, and they contribute to more efficient data recording and transmission.
[0003]
MP3, AAC, and similar methods are all so-called lossy encoding methods. They compress efficiently, but decoding entails a non-negligible degradation in quality, and the original signal cannot be reproduced exactly. For this reason, these encoding methods cannot be applied in music production fields such as music production, material storage, and the relay of location material, where material is instead stored and transmitted uncompressed, inefficient though this is. Recently in particular, the number of productions handling high-definition audio has increased and the volume of material has become enormous, which is becoming a problem for the management of work disks.
[0004]
Recently, in order to solve the above problems, various methods have been proposed for lossless compression coding of acoustic signals (see, for example, Patent Document 1).
[0005]
[Patent Document 1]
JP 2002-278600 A
[0006]
[Problems to be solved by the invention]
However, although it is currently possible to measure the compression ratio, that is, how far the original data has been compressed, it is not possible to know which components of the original signal were compressed and to what extent.
[0007]
Therefore, to solve this problem, an object of the present invention is to provide a time-series signal compression analysis apparatus and conversion apparatus capable of efficiently and reversibly compressing a time-series signal and of analyzing the compression accuracy of the compressed data.
[0008]
[Means for Solving the Problems]
To solve the above problem, the present invention provides an apparatus that, for a time-series signal composed of a time-series sample sequence, compresses the amount of information in such a way that the entire sample sequence can be reproduced, and analyzes the compressed information. The apparatus comprises: prediction error conversion means for converting the value of each sample in the sample sequence into a prediction error value from a plurality of temporally past samples; data separation means for setting the position at which the bit data representing each sample value converted into a prediction error value is to be divided, dividing the bit data at the set bit position, and separating it into an upper sample sequence composed of the upper bits of the samples and a lower sample sequence composed of the lower bits of the samples; upper sample encoding means for encoding the upper sample sequence with a variable-length code; lower sample encoding means for encoding the lower sample sequence with a fixed-length code; and data display means for displaying the proportion of data in the time-series signal corresponding to the upper sample sequence, the proportion of data corresponding to the lower sample sequence, the proportion of data encoded by the upper sample encoding means, and the proportion of data encoded by the lower sample encoding means.
[0009]
According to the present invention, the time-series signal is compressed by prediction error conversion and by separation of the upper and lower bits, and both the proportion of each kind of data contained in the compressed code data and the analyzed proportion of each kind of data before compression are displayed. The time-series signal can therefore be compressed efficiently and reversibly, and the compression accuracy of the compressed data can be analyzed.
[0010]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
(Device configuration)
FIG. 1 is a block diagram showing an embodiment of a time-series signal encoding apparatus according to the present invention. In FIG. 1, 1 is a time-series signal input means, 2 is a storage means, 3 is an analysis means, 4 is a display means, 5 is a separation position setting means, 6 is an acoustic signal conversion means, 7 is an audio output means, 10 is a signal flat part processing means, 20 is a prediction error conversion means, 30 is an inter-channel calculation means, 40 is a correlation frame detection means, 50 is a data separation means, 60 is an upper sample encoding means, and 70 is a lower sample encoding means.
[0011]
In FIG. 1, the time-series signal input means 1 has a function of inputting a digitized acoustic signal. The storage means 2 has a function of storing the various data created by this apparatus. The analysis means 3 has a function of analyzing the created data and producing data that indicates the compression efficiency achieved by this apparatus. The display means 4 has a function of displaying the data analyzed by the analysis means 3. The separation position setting means 5 has a function of setting the separation position for the data separation means 50. The acoustic signal conversion means 6 has a function of converting the various data stored in the storage means so that each sample has a fixed number of bits, and then D/A converting and amplifying the result so that it can be output as an acoustic signal. The audio output means 7 has a function of outputting the data converted into an analog acoustic signal as sound. Specifically, the audio output means 7 is realized by a speaker.
[0012]
The signal flat part processing means 10 has a function of detecting, in the sample sequence of each channel, flat parts in which the signal value is constant, and encoding them efficiently. The prediction error conversion means 20 has a function of converting the value of each sample into a prediction error value using linear prediction. The inter-channel calculation means 30 has a function of calculating the difference between the channels of a sample sequence composed of a plurality of channels. The correlation frame detection means 40 has a function of setting predetermined sections as frames in each sample sequence on which the inter-channel calculation has been performed, detecting correlation frames, that is, frames between which all corresponding sample values are identical, and deleting the correlation frame located later in time.
[0013]
The data separation means 50 has a function of performing positive/negative polarity processing on each sample as necessary and of separating each sample of the error sample sequence, in which prediction error values are recorded, at a predetermined position into upper sample data consisting of the upper bits and lower sample data consisting of the lower bits. The upper sample encoding means 60 has a function of efficiently encoding the upper sample sequence separated by the data separation means 50. The lower sample encoding means 70 has a function of efficiently encoding the lower sample sequence separated by the data separation means 50. Each component shown in FIG. 1 is in practice realized by a computer and a dedicated software program executed by the computer.
[0014]
(Processing operation)
Next, the processing operation of the time-series signal encoding apparatus shown in FIG. 1 will be described, taking as an example an acoustic signal with a plurality of channels as the time-series signal. First, the analog acoustic signal, which is a time-series signal, is digitized. This can be done with the conventional PCM method: the analog acoustic signal is sampled at a predetermined sampling frequency, and its amplitude is converted into digital data using a predetermined number of quantization bits. The following description assumes a sampling frequency of 44.1 kHz and signed quantization with 16 bits. Sampling at 44.1 kHz produces a sample sequence of 44100 samples per second. Because the acoustic signal here consists of a plurality of channels, digitization is performed for each channel. The digitized acoustic signal is shown schematically in FIG. 2(a). FIG. 2(a) shows a two-channel stereo acoustic signal, in which the L (left) signal is recorded in Ch1 and the R (right) signal in Ch2. In FIGS. 2(a) to 2(d), the left end is the start time and the right end is the end time. The height indicates the number of bits of each sample, which is 16 bits in this embodiment. Note that the time-series signal input means 1 of this apparatus receives the acoustic signal after digitization.
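As a quick check of the figures above, the raw data rate of the assumed two-channel, 44.1 kHz, 16-bit PCM signal can be worked out as follows. The variable names are illustrative and do not come from the source.

```python
# Raw PCM data rate for the format assumed in this embodiment.
SAMPLING_RATE_HZ = 44_100   # samples per second per channel
BITS_PER_SAMPLE = 16        # signed 16-bit quantization
CHANNELS = 2                # stereo: Ch1 = L, Ch2 = R

bits_per_second = SAMPLING_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
bytes_per_minute = bits_per_second // 8 * 60

print(bits_per_second)   # 1411200
print(bytes_per_minute)  # 10584000
```

Uncompressed stereo material therefore occupies roughly 10.6 MB per minute, which is the volume that the lossless compression described here aims to reduce.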
[0015]
(Processing of signal flat part)
The signal flat part processing means 10 processes signal flat parts in the sample sequence, that is, in the digital acoustic signal digitized as described above. A signal flat part is a part in which the same signal level continues. Such parts often appear as silent parts, where the signal level is “0”, and as saturated parts, where the absolute value of the signal level is at its maximum. A silent part arises when there is actually no sound or when the sound was too quiet to be recorded, whereas a saturated part arises in the course of recording and A/D conversion of the signal. Whether it is a silent part, a saturated part, or some other run of identical signal levels, a signal flat part records the same signal level continuously for a predetermined time (a predetermined number of samples) and is therefore easy to compress. Specifically, three values, namely the start time position of the signal flat part, the number of samples over which the same signal level continues, and the signal level (sample value), are recorded as signal flat part data separately from the sample sequence of each channel, and the signal flat part is deleted from the sample sequence of each channel. This is shown schematically in FIGS. 2(b) and 2(c). FIG. 2(b) shows the sample sequence before signal flat part processing; the shaded portions indicate signal flat parts. Through the processing by the signal flat part processing means 10, the signal flat parts are deleted from the original sample sequence, giving the result shown in FIG. 2(c). So that the signal can be restored to its original state at decoding, the separated signal flat parts are recorded as signal flat part data in the format shown in FIG. 2(e).
[0016]
As described above, signal flat part data is recorded for each signal flat part with three attributes: the start time (sample number), the number of samples, and the sample value. The start time is the time from the start position of the signal; in the example of FIG. 2(e), it is recorded as the sample number counted from the beginning. Dividing this sample number by the sampling frequency converts it into a time. The number of samples indicates how long the sample value continues. The end time of the signal flat part may be recorded instead of the number of samples. The sample value indicates the digitized signal level. In this embodiment, quantization is performed with signed 16 bits, so the maximum value is “32767” and the minimum value is “−32768”. That is, “0” indicates a silent part, and “32767” and “−32768” indicate saturated parts. However, the signal flat part processing means 10 does not process signal flat parts unconditionally. Because the purpose of the present invention is data compression, nothing is gained if the signal flat part data becomes larger than the reduction in the sample sequence. Signal flat part data is therefore created and separated from the sample sequence of each channel only when at least a predetermined number of samples forming a signal flat part occur consecutively.
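The signal flat part processing can be sketched as follows for a single channel. This is a minimal illustration: the function names are hypothetical, and `min_run` stands in for the "predetermined number" of consecutive samples, whose actual value the text does not give.

```python
from typing import List, Tuple

FlatPart = Tuple[int, int, int]  # (start sample number, number of samples, sample value)

def split_flat_parts(samples: List[int], min_run: int = 4
                     ) -> Tuple[List[int], List[FlatPart]]:
    """Remove runs of identical values that are at least min_run samples long."""
    remaining: List[int] = []
    flat_parts: List[FlatPart] = []
    i = 0
    while i < len(samples):
        j = i
        while j < len(samples) and samples[j] == samples[i]:
            j += 1
        if j - i >= min_run:
            flat_parts.append((i, j - i, samples[i]))   # record, then drop the run
        else:
            remaining.extend(samples[i:j])              # too short to be worth recording
        i = j
    return remaining, flat_parts

def restore_flat_parts(remaining: List[int], flat_parts: List[FlatPart]) -> List[int]:
    """Reinsert the flat parts at their original sample numbers (ascending order)."""
    out = list(remaining)
    for start, count, value in sorted(flat_parts):
        out[start:start] = [value] * count
    return out

pcm = [5, 0, 0, 0, 0, 0, 7, 7, 32767, 32767, 32767, 32767, 3]
rest, flats = split_flat_parts(pcm)
print(rest)   # [5, 7, 7, 3]
print(flats)  # [(1, 5, 0), (8, 4, 32767)]
```

The short run `7, 7` stays in the sample sequence because recording it as three attributes would cost more than it saves. Dividing a recorded start sample number by the sampling frequency (44100 here) gives the start time in seconds.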
[0017]
(Conversion to prediction error)
Subsequently, the prediction error conversion means 20 converts the value of each sample of the sample sequence that has undergone signal flat part processing into a prediction error value. The prediction error value of a given sample is calculated from the values of one or more samples immediately preceding it in time. This embodiment uses a method that dynamically changes the number of preceding samples used. This adaptive linear predictive coding is described below. An outline of the adaptive linear predictive coding performed by the prediction error conversion means 20 is shown in FIG. 3. First, a linear prediction error is calculated with each of a plurality of prediction calculation formulas prepared in advance (step S1). Specifically, the following [Formula 1] to [Formula 4] are prepared as prediction calculation formulas for calculating the prediction error of sample number t.
[0018]
[Formula 1]
e1(t) = x(t) − x(t−1) − e1(t−1)/2
[0019]
[Formula 2]
e2(t) = x(t) − 2 × x(t−1) + x(t−2) − e2(t−1)/2
[0020]
[Formula 3]
e3(t) = x(t) − 3 × x(t−1) + 3 × x(t−2) − x(t−3) − e3(t−1)/2
[0021]
[Formula 4]
e4(t) = x(t) − 4 × x(t−1) + 6 × x(t−2) − 4 × x(t−3) + x(t−4) − e4(t−1)/2
[0022]
In the above [Formula 1] to [Formula 4], e1(t) to e4(t) are the prediction errors of the sample at time t according to each prediction calculation formula, and x(t) to x(t−4) are the amplitude values at times t to t−4.
[0023]
"2 × x(t−1) − x(t−2)" in [Formula 2], "3 × x(t−1) − 3 × x(t−2) + x(t−3)" in [Formula 3], and "4 × x(t−1) − 6 × x(t−2) + 4 × x(t−3) − x(t−4)" in [Formula 4] are linear prediction components based on the preceding two to four samples; in [Formula 1] the corresponding component is "x(t−1)". The prediction errors e1(t) to e4(t) at time t are calculated from this linear prediction component and from "e1(t−1)/2" to "e4(t−1)/2", the error feedback components derived from the prediction errors calculated for the immediately preceding sample.
[0024]
Next, the linear prediction error of the formula with the smallest cumulative error, that is, the smallest accumulated absolute value of its prediction errors, is selected as the prediction error of the sample (step S2). Specifically, let R1 to R4 be the values accumulated over past samples of the prediction errors calculated by [Formula 1] to [Formula 4], respectively. The prediction error corresponding to the smallest of the cumulative errors R1 to R4 is selected. For example, if R2 is the smallest of R1 to R4, the prediction error e2(t) calculated by [Formula 2] is selected as the prediction error e(t) to be encoded. The original value x(t) of the sample is replaced with the selected prediction error e(t), and the subsequent processing is performed on that value. The order of the prediction formula used is also recorded, in association with the sample number, as optimum order data. The "order" is a number indicating how many past samples were used to calculate the prediction error; [Formula 1] to [Formula 4] correspond to the first to fourth orders. For example, when the prediction error e2(t) is selected as the prediction error e(t), the order is "2".
[0025]
Subsequently, the absolute values of the prediction errors e1(t) to e4(t) are added to the cumulative errors R1 to R4 (step S3). Specifically, the variables R1 to R4 holding the cumulative errors are updated as shown in [Formula 5] below. In addition, a counter is incremented by 1 each time a sample is processed.
[0026]
[Formula 5]
R1 ← R1 + | e1 (t) |
R2 ← R2 + | e2 (t) |
R3 ← R3 + | e3 (t) |
R4 ← R4 + | e4 (t) |
[0027]
Subsequently, it is determined whether the counter has exceeded a predetermined count (step S4). In this embodiment the predetermined count is 100; that is, it is determined whether the counter exceeds 100.
[0028]
If the counter exceeds 100, the cumulative errors are halved (step S5). Specifically, as shown in [Formula 6] below, the variables R1 to R4 holding the cumulative errors are each divided by 2, and the counter is reset to 0. R1 to R4 are thus not cumulative errors in the strict sense but behave like a moving average of the cumulative error. In this embodiment, at most the immediately preceding 100 samples are accumulated at full weight, and the contribution of earlier samples is halved. This reduces the influence of samples that are distant in time.
[0029]
[Formula 6]
R1 ← (R1) / 2
R2 ← (R2) / 2
R3 ← (R3) / 2
R4 ← (R4) / 2
[0030]
By executing steps S1 to S5 for every sample at every time in the time-series signal, the values of all samples are replaced, from the original amplitude value x(t) to the prediction error e(t). Because the processing by the prediction error conversion means 20 changes only the value of each sample, the schematic state of the acoustic signal remains as shown in FIG. 2(c). To distinguish samples processed by the prediction error conversion means 20 from the original samples, they may also be called error samples.
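Steps S1 to S5 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the embodiment itself: samples before the start of the signal are taken as 0, the feedback term e(t−1)/2 and the halving of R1 to R4 use integer floor division, ties between cumulative errors go to the lowest order, and the hypothetical `decode` function re-derives the selected order from R1 to R4 instead of reading the optimum order data.

```python
# Coefficients of x(t-1), x(t-2), ... in the linear prediction component
# of [Formula 1] to [Formula 4].
COEFFS = {1: [1], 2: [2, -1], 3: [3, -3, 1], 4: [4, -6, 4, -1]}
HALVE_AFTER = 100  # counter threshold of step S4 in this embodiment


def _predict(k, hist):
    # hist[-1] is x(t-1), hist[-2] is x(t-2), and so on.
    return sum(c * hist[-1 - i] for i, c in enumerate(COEFFS[k]))


def _all_errors(x, hist, prev_e):
    # Step S1: e_k(t) = x(t) - prediction_k - e_k(t-1)/2 (floor division assumed).
    return {k: x - _predict(k, hist) - prev_e[k] // 2 for k in COEFFS}


def _update(R, e, counter):
    # Steps S3 to S5: accumulate |e_k(t)|, halve R1..R4 once counter exceeds 100.
    for k in R:
        R[k] += abs(e[k])
    counter += 1
    if counter > HALVE_AFTER:
        for k in R:
            R[k] //= 2
        counter = 0
    return counter


def encode(samples):
    hist, prev_e = [0] * 4, dict.fromkeys(COEFFS, 0)
    R, counter = dict.fromkeys(COEFFS, 0), 0
    errors, orders = [], []
    for x in samples:
        e = _all_errors(x, hist, prev_e)
        k = min(R, key=R.get)          # step S2: smallest cumulative error
        errors.append(e[k])
        orders.append(k)               # optimum order data
        counter = _update(R, e, counter)
        hist, prev_e = hist[1:] + [x], e
    return errors, orders


def decode(errors):
    hist, prev_e = [0] * 4, dict.fromkeys(COEFFS, 0)
    R, counter = dict.fromkeys(COEFFS, 0), 0
    out = []
    for e_sel in errors:
        k = min(R, key=R.get)          # same choice as the encoder made
        x = e_sel + _predict(k, hist) + prev_e[k] // 2
        e = _all_errors(x, hist, prev_e)
        counter = _update(R, e, counter)
        hist, prev_e = hist[1:] + [x], e
        out.append(x)
    return out
```

Because the order for sample t is chosen from cumulative errors that depend only on samples before t, a decoder can repeat the same choice, which is why the round trip in this sketch is exact.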
[0031]
(Calculation between channels)
Next, the inter-channel calculation means 30 performs an inter-channel difference calculation on the sample sequence of each channel, in which prediction error values are now recorded. This is done simply by taking the difference between the values of samples at the same time. The result of the difference calculation is assigned to the sample sequence of one channel, and the sample sequence of the other channel is left unchanged. Specifically, for a two-channel stereo acoustic signal as shown in FIG. 2(c), the value of the L signal is recorded as is in Ch1, and the difference R−L is assigned to Ch2. In a stereo acoustic signal the data at the same time are generally correlated, so the difference between the two values at each time is smaller than the original values. This holds even for values that have been converted to prediction errors by linear prediction. In the example of FIG. 2(d), therefore, the value of each sample in Ch2 becomes small, which leaves more room for later compression.
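As a small illustration of the inter-channel calculation for the two-channel case, the following Python sketch keeps L in Ch1 and stores R−L in Ch2. The function names are hypothetical.

```python
def to_channel_difference(left, right):
    """Ch1 keeps the L values; Ch2 receives R - L at each sample time."""
    ch1 = list(left)
    ch2 = [r - l for l, r in zip(left, right)]
    return ch1, ch2


def from_channel_difference(ch1, ch2):
    """Inverse operation used at decoding: R = L + (R - L)."""
    left = list(ch1)
    right = [l + d for l, d in zip(ch1, ch2)]
    return left, right
```

When the two channels are strongly correlated, the Ch2 values cluster near zero, which favours the later variable-length encoding.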
[0032]
(Correlation frame detection)
Subsequently, the correlation frame detecting means 40 sets frames of a predetermined section length on the sample sequence of each channel after the inter-channel calculation, and compares the frames with one another. In this embodiment the frame length is fixed over the entire section from the start time to the end time of the sample sequence; specifically, one frame is 512 samples. The correlation frame detecting means 40 sets frames of 512 samples each from the beginning of the sample sequence of each channel, and finds correlation frames, that is, frames in which all samples match those of another frame. The specific procedure is described with reference to the flowchart of FIG. 4.
[0033]
First, the correlation frame detecting means 40 divides the sequence into frames of a predetermined number of samples (step S11). In this embodiment, as described above, the frame length is fixed at 512 samples over the entire section from the start time to the end time of the sample sequence. As shown in FIG. 5(a), the correlation frame detecting means 40 sets frames of 512 samples each from the beginning of the sample sequence.
[0034]
Next, a search is made for a frame whose sample values all match those of a given frame. Specifically, as shown in FIG. 5(b), the frame that is last in time among the set frames is first taken as the target frame for which a correlation frame is sought. Then, within a predetermined search range, a sample whose value equals that of the first sample of the target frame is searched for while going back in time (step S12). For example, as shown in FIG. 6(a), suppose the target frame consists of the 512 samples kT to kT+511. In this case, a sample whose value equals the value e(kT) of the first sample kT of the target frame is searched for first, in the order sample kT−1, sample kT−2, and so on. In FIG. 6, k denotes the k-th frame from the beginning and T denotes the frame length (512 samples in this embodiment).
[0035]
If a matching sample t is found (step S13), the sample t+1 following sample t is compared with the second sample kT+1 of the target frame. Subsequent samples continue to be compared in this way as long as the sample values match (step S14). That is, step S14 is repeated as long as the values of e(t+p) and e(kT+p) match. In the example shown in FIG. 6(b), e(t) to e(t+8) match e(kT) to e(kT+8), so step S14 continues with p = 9. When e(t+p) and e(kT+p) match for every p from 0 to 511 (step S15), that sample sequence is taken as the correlation frame for the target frame, the first sample number of the correlation frame is recorded in association with the first sample number of the target frame as frame correlation data, and the target frame is deleted from the original sample sequence (step S16). If not all samples of the target frame match, the search continues further back in time for another sample whose value matches the first sample of the target frame. If no matching correlation frame is found after going back a predetermined number of samples, the search for this target frame is abandoned, and the frame immediately before it becomes the new target frame. When processing of one target frame is complete, the process returns to step S12 and continues with the immediately preceding frame as the new target frame (step S17). In this way, correlation frame detection is performed with every frame as a target frame, except for frames located near the start time of the time-series signal.
[0036]
Viewed over the entire sample sequence of the time-series signal, when a correlation frame corresponding to a target frame is detected as shown in FIG. 5(c), the target frame is deleted as shown in FIG. 5(d). At this time, frame correlation data as shown in FIG. 5(e) is recorded so that the sequence can be completely restored at decoding. As shown in FIG. 5(e), the frame correlation data records the first sample number of the target frame in association with the first sample number of the correlation frame.
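The search of steps S11 to S17 can be sketched in Python as below. The function names and the default `search_range` are hypothetical, since the description only refers to a predetermined search range. Sample numbers in the records refer to positions in the sequence before any frame is deleted, which is an assumption about how the frame correlation data is addressed.

```python
def detect_correlation_frames(samples, frame_len=512, search_range=4096):
    """Return (kept, records); records are (target_start, correlation_start)."""
    n_frames = len(samples) // frame_len
    records = []
    for k in range(n_frames - 1, 0, -1):          # last frame first (step S17)
        kT = k * frame_len
        lowest = max(0, kT - search_range)
        for t in range(kT - 1, lowest - 1, -1):   # go back in time (step S12)
            # all() stops at the first mismatch, like steps S13 and S14.
            if all(samples[t + p] == samples[kT + p] for p in range(frame_len)):
                records.append((kT, t))           # step S16
                break
    targets = {target for target, _ in records}
    kept = [s for i, s in enumerate(samples)
            if (i // frame_len) * frame_len not in targets]
    return kept, records


def restore_frames(kept, records, frame_len=512):
    """Re-insert deleted target frames, earliest first."""
    out = list(kept)
    for target, corr in sorted(records):
        out[target:target] = [0] * frame_len
        for p in range(frame_len):                # sample by sample, so a
            out[target + p] = out[corr + p]       # correlation section may overlap
    return out
```

Restoring in ascending order of target position guarantees that every correlation section has already been rebuilt when it is copied.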
[0037]
(Separation of upper and lower bits)
Subsequently, the data separation means 50 separates the upper bits and lower bits of each sample. In practice, as preprocessing before separation, the value of each sample, which may be positive or negative, is converted into a bit string with an explicit polarity bit. Specifically, the 16-bit representation of a signed value is converted so that the leading 1 bit is the polarity code and the remaining 15 bits represent the absolute value. With this conversion, no polarity code is needed for the value "0", so it can be omitted. The data amount can therefore be reduced by 1 bit for each sample whose value is "0".
[0038]
Once the polarity processing has been performed, the data separation means 50 separates the upper bits and lower bits of each sample. For example, when an acoustic signal is digitized by PCM with 16 quantization bits, each sample is represented by 16 bits. In this embodiment, such a sample is separated into 12 upper bits and 4 lower bits. This separation is basically intended to separate out the thermal noise of the circuits used to digitize the acoustic signal, such as the A/D converter, so the lower bits considered to be thermal noise are separated. How many lower bits to separate depends on the characteristics of the sound source and of the circuit used, but about 1/4 of the normal number of quantization bits is preferable. Here, therefore, 4 bits, corresponding to 1/4 of 16 bits, are separated as the lower bits. A particular feature of the present invention is that the upper and lower bits are separated after conversion to prediction errors. If the bits were separated first and only the upper samples were then converted to prediction errors, any components in the lower bits that could have been compressed by that conversion would be left out of the compression processing, and compression efficiency as a whole could fall.
[0039]
The data separation performed by the data separation means 50 is shown schematically in FIG. 7. In FIG. 7, H denotes upper bits or upper sample data, and L denotes lower bits or lower sample data. FIG. 7(a) shows the sample data before separation. The data separation means 50 separates this sample data into the upper sample data shown in FIG. 7(b) and the lower sample data shown in FIG. 7(c). The sign bit, which belongs to the upper bits, is kept in the upper sample data when it is separated. In the example of FIG. 7, as indicated by "H4", when the sign bit has been deleted by the preprocessing, the upper sample data has no sign bit. The sample data separated in this way is processed separately from this point on.
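The polarity conversion and the 12-bit/4-bit separation can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes values from −32767 to 32767, because a 15-bit absolute value cannot hold 32768 and the description does not say how that value is handled.

```python
LOW_BITS = 4  # lower bits separated in this embodiment (1/4 of 16 bits)


def split_sample(value):
    """Return (sign, upper_magnitude, lower_bits) for one error sample.

    sign is 1 for negative values; upper_magnitude holds the top 11 bits of
    the 15-bit absolute value and lower_bits the bottom 4 bits.
    """
    if not -32767 <= value <= 32767:
        raise ValueError("15-bit magnitude assumed; value out of range")
    sign = 1 if value < 0 else 0
    magnitude = abs(value)
    return sign, magnitude >> LOW_BITS, magnitude & ((1 << LOW_BITS) - 1)


def join_sample(sign, upper_magnitude, lower_bits):
    """Inverse operation used at decoding."""
    magnitude = (upper_magnitude << LOW_BITS) | lower_bits
    return -magnitude if sign else magnitude
```

For example, −100 has the absolute value 100 = 6 × 16 + 4, so it splits into sign 1, upper magnitude 6, and lower bits 4.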
[0040]
(Encoding of upper sample)
Next, the upper sample encoding means 60 encodes the separated upper samples. First, signal flat part processing is performed on the upper sample sequence of each channel. The signal flat part processing performed by the upper sample encoding means 60 is exactly the same as that performed by the signal flat part processing means 10. That is, each portion of the upper sample sequence in which the same signal level continues is expressed by three values, namely the start time position of the signal flat part, the number of samples over which the same signal level continues, and the signal level (sample value), and is separated from the upper sample sequence of each channel and recorded as upper signal flat part data. The upper signal flat part data is recorded in the same format as the signal flat part data shown in FIG. 2(e).
[0041]
Subsequently, the upper sample encoding means 60 converts the fixed-length upper sample sequence to variable length. First, a lookup table used to convert the bit configuration is created. To create the lookup table, a histogram of the upper sample values is calculated over the entire duration of the upper sample sequence. Since all upper sample values have been converted to absolute values by the data separation means 50, the histogram is calculated without distinguishing positive from negative. If the number of distinct sample absolute values is 640 or more, the separator is the 2-bit fixed value "00"; if it is 639 or less, the separator is the 1-bit fixed value "0". Bit patterns are then assigned in order, starting with the sample absolute value that appears most frequently, so that more frequent values receive patterns with fewer bits. The assigned bit patterns follow rules: the most significant bit is always "1"; when the separator is the 2 bits "00", bit patterns containing "001" are prohibited; and when the separator is the 1 bit "0", bit patterns containing "01" are prohibited. There is only one lookup table for the 2-bit separator "00", but for the 1-bit separator "0" different lookup tables are created depending on whether the number of distinct sample absolute values is 320 or more, or less than 320. Examples of lookup tables corresponding to the number of distinct sample absolute values are shown in FIGS. 8 and 9.
[0042]
Using the lookup table created as described above, the upper sample encoding means 60 converts the consecutive upper sample data, each with a fixed length of 12 bits, into variable-length bit patterns. Because the data becomes variable length, the boundaries between the converted data items must be distinguishable. In this embodiment, therefore, the 1-bit or 2-bit separator described above is inserted between data items. When the number of distinct sample values is less than 320, the bit string and number of bits used to express the data of each rank are as shown in FIG. 8(a). In FIG. 8(a), rank 0 is expressed by the 1 bit "1", the smallest number of bits. The bit strings before conversion are omitted from FIG. 8(a), but the bit string that appears most frequently is converted to the 1 bit "1". Since a separator is always added to each variable-length bit string, 2 bits are required to express the data of rank 0. In the case shown in FIG. 8(a), where the number of distinct sample values is less than 320, the separator is the 1 bit "0", so bit patterns containing "01" are not assigned.
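The lookup tables of FIGS. 8 and 9 are not reproduced in this text, so the following Python sketch shows only one hypothetical assignment that satisfies the stated rules for the 1-bit separator case: every code word starts with "1" and never contains "01". The ordering of code words of equal length, the tie-breaking between equally frequent values, and the function names are all assumptions.

```python
import re
from collections import Counter


def code_words(n):
    """First n code words that start with '1' and never contain '01'."""
    words, length = [], 1
    while len(words) < n:
        for ones in range(1, length + 1):        # a run of 1s, then a run of 0s
            words.append("1" * ones + "0" * (length - ones))
            if len(words) == n:
                break
        length += 1
    return words


def build_table(values):
    """Rank values by frequency; rank 0 receives the shortest code word."""
    ranked = [v for v, _ in Counter(values).most_common()]
    return dict(zip(ranked, code_words(len(ranked))))


def encode(values, table):
    # A separator bit '0' follows every code word.
    return "".join(table[v] + "0" for v in values)


def decode(bits, table):
    if not bits:
        return []
    inverse = {word: v for v, word in table.items()}
    # '01' can only occur where a separator meets the next code word.
    words = re.split(r"0(?=1)", bits[:-1])       # bits[:-1] drops the last separator
    return [inverse[w] for w in words]
```

With this assignment the most frequent value costs 2 bits including its separator, which matches the count stated above for rank 0.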
[0043]
When the number of distinct sample values is 320 or more and less than 640, the bit string and number of bits used to express the data of each rank are as shown in FIG. 8(b). The bit strings of FIG. 8(b) are new bit strings obtained by adding 1 bit immediately after the most significant bit of each bit string in the lookup table shown in FIG. 8(a). For example, "10" of rank 0 and "11" of rank 1 in FIG. 8(b) are obtained by adding the 1 bit "0" and "1", respectively, to "1" of rank 0 in FIG. 8(a). Likewise, "100" of rank 2 and "110" of rank 3 in FIG. 8(b) are obtained by adding "0" and "1", respectively, as the second bit of "10" of rank 1 in FIG. 8(a). In FIG. 8(b) as well, a separator is always added to each variable-length bit string, so 3 bits are required to express the data of rank 0. In the example of FIG. 8(b) the separator is the 1 bit "0", so the bit pattern "01" should in principle not be assigned; nevertheless, the correct data can be extracted at decoding by arranging the order in which the data is read.
[0044]
When the separator is the 2 bits "00", the bit string and number of bits used to express the data of each rank are as shown in FIG. 9. In FIG. 9, rank 0 is expressed by the 1 bit "1", the smallest number of bits. The bit strings before conversion are omitted from FIG. 9 as well, but the bit string that appears most frequently is converted to the 1 bit "1". Since a separator is always added to each variable-length bit string, 3 bits are required to express the data of rank 0. In the example of FIG. 9 the separator is the 2 bits "00", so bit patterns containing "001" are not assigned.
[0045]
FIGS. 10(a) and 10(b) schematically show the data conversion performed by the upper sample encoding means 60. Both correspond to the upper part of the sample sequence; FIG. 10(a) shows fixed-length upper samples recorded consecutively. The upper sample sequence shown in FIG. 10(a) is converted as shown in FIG. 10(b) using the lookup tables shown in FIGS. 8(a), 8(b), and 9.
[0046]
(Low-order sample encoding)
The lower sample data, on the other hand, is processed by the lower sample encoding means 70. Specifically, the lower 4 bits of data separated from each sample by the data separation means 50 are arranged consecutively.
[0047]
(Recording of code data)
The code data obtained as described above is as shown in FIG. 11. It consists of the upper variable-length sample sequence, the lookup table, the upper signal flat part data, the lower fixed-length sample sequence, the frame correlation data, the signal flat part data, and the inter-channel data. These data are held in the storage means 2 during the encoding process and are then recorded in a format suited to the recording medium on which they are to be recorded.
[0048]
(Analysis of code data)
The code data is analyzed by the analysis means 3. The processing of the analysis means 3 is described below with reference to a flowchart. First, the data amount of the quantization noise component is calculated (step S21) by measuring the data amount of the lower fixed-length sample sequence in the code data. In the compression performed by this device, a predetermined number of lower bits are separated as the quantization noise component and encoded with a fixed length, so the data amount of the lower fixed-length sample sequence can be regarded as the quantization noise component of the original digital acoustic signal. Next, the data amount of the frame correlation data is calculated (step S22) by measuring the data amount of the frame correlation data. Subsequently, the data amount of the target frames deleted from the original sample sequence is calculated from the contents of the frame correlation data (step S23). Specifically, the number of target frames in the frame correlation data is multiplied by the number of samples per frame (512 samples in this embodiment) and by the number of bits per sample (16 bits in this embodiment).
[0049]
Next, the data amount of the signal flat part data is calculated (step S24) by measuring the data amount of the signal flat part data in the code data. Subsequently, the data amount of the signal flat parts of the original digital acoustic signal is calculated from the contents of the signal flat part data (step S25). Specifically, the number of samples recorded in the signal flat part data is multiplied by the required number of bits (16 bits in this embodiment).
[0050]
Next, the data amount of the upper variable-length sample sequence is calculated for each channel (step S26) by measuring, for each channel, the data amount of the upper variable-length sample sequence in the code data. Subsequently, the data amount of the linear prediction target component of the original digital acoustic signal is calculated (step S27). This is obtained by subtracting from the data amount of the original digital acoustic signal the data amount of the quantization noise component calculated in step S21, the data amount of the deleted frames calculated in step S23, and the data amount of the original signal flat parts calculated in step S25. Since the data amount is the same in each channel, the result is further divided by the number of channels to give the linear prediction target component for each channel. Finally, the ratio of each calculated data amount to the original acoustic signal is calculated (step S28).
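The arithmetic of steps S21, S23, S25, S27, and S28 for the original-signal side can be illustrated with the Python sketch below. The function name, parameter names, and the figures in the usage note are hypothetical; only the 512-sample frame, the 16-bit sample, and the 4 lower bits come from this embodiment.

```python
def analyze_original_composition(samples_per_channel, channels,
                                 lower_sample_count, target_frame_count,
                                 flat_sample_count,
                                 frame_len=512, bits=16, low_bits=4):
    """Return the composition of the original signal in bits and in percent."""
    original = samples_per_channel * channels * bits
    noise = lower_sample_count * low_bits                 # step S21
    deleted = target_frame_count * frame_len * bits       # step S23
    flat = flat_sample_count * bits                       # step S25
    prediction_target = original - noise - deleted - flat # step S27
    amounts = {
        "quantization_noise": noise,
        "deleted_frames": deleted,
        "signal_flat_part": flat,
        "prediction_target_per_channel": prediction_target / channels,
    }
    # Step S28: each amount as a percentage of the original signal.
    ratios = {name: 100.0 * amount / original for name, amount in amounts.items()}
    return amounts, ratios
```

For instance, with 10000 samples per channel, 2 channels, 1000 flat samples, 2 deleted frames, and 17976 remaining lower samples, the linear prediction target component comes to 107856 bits per channel.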
[0051]
The information analyzed by the analysis means 3 is displayed on the display means 4. The display screen at this time is shown in FIG. 13. In FIG. 13, the upper row shows the composition of the original acoustic signal before compression, and the lower row shows the composition of the code data after compression. The components shown in FIG. 13 are actually displayed in different colors. Alongside each component, its ratio to the original digital acoustic signal is displayed as a percentage; for the components after compression, too, the ratio is calculated and displayed relative to the original digital acoustic signal. Corresponding data in the upper and lower rows is displayed in the same color. For example, the linear prediction encoding target component L is displayed in the same color as the prediction encoding compression component L. L and R in each data item in FIG. 13 indicate channels; since the example of FIG. 13 deals with a two-channel stereo acoustic signal, two channels are shown. In the example of FIG. 13, the data amount as a whole is compressed to about 50%, and it can be seen that the prediction encoding target component, the signal flat parts, and the deleted frames contribute significantly to the compression ratio.
[0052]
(Output of optimal order)
Further, the analysis means 3 converts the optimum order data recorded by the prediction error conversion means 20 into a predetermined display format and displays it on the display means 4. The screen of the display means 4 at this time is shown in FIG. 14. In FIG. 14, the horizontal axis represents time and the vertical axis represents the order. A display in the format of FIG. 14 serves as a guide to which prediction formula gives the best compression.
[0053]
(Frame correlation output)
The analysis means 3 also converts the frame correlation data recorded by the correlation frame detecting means 40 into a predetermined display format and displays it on the display means 4. The screen of the display means 4 at this time is shown in FIG. 15. In FIG. 15, the time-series sample sequence is drawn as a rectangle in both the upper and lower rows. The horizontal axis represents time; the left end of each rectangle is the start time and the right end is the end time. A vertical line segment in the upper sample sequence indicates a correlation frame, and a thick vertical line segment in the lower sample sequence indicates a target frame. The upper and lower sample sequences are the same sequence, drawn twice to make the relationship between target frames and correlation frames easy to see. Corresponding correlation frames and target frames are connected by dotted lines. The example of FIG. 15 shows that 11 correlation frames were detected for 11 target frames. As FIG. 15 shows, a correlation frame is always earlier in time than its target frame. Outputting analysis data such as that of FIG. 15 in visible form shows how much correlation the time-series signal contains, which is useful when studying effective compression.
[0054]
(Predictive error component audio output)
Apart from the visual analysis described above, this apparatus also has a function of outputting, as an acoustic signal, part of the code data or data generated in the course of producing the code data. When the sample sequence obtained by the prediction error conversion means 20 is converted by the acoustic signal conversion means and output, the prediction error component can be heard as sound. At the same time, its signal waveform is output by the display means 4. In this way, data that is originally a prediction error component yields a new acoustic signal with a special effect. For example, processing a PCM acoustic signal with the waveform shown in FIG. 16 yields a prediction error component with the waveform shown in FIG. 17.
[0055]
(Sound output of upper prediction error component)
Further, when the upper fixed-length sample sequence generated in the course of producing the upper variable-length sample sequence is converted by the acoustic signal conversion means and output, the upper-bit component of the prediction error component can be heard as sound. At the same time, its signal waveform is output by the display means 4. In this way, data that is originally the main component of the prediction error component yields a new acoustic signal with a special effect. For example, processing a PCM acoustic signal with the waveform shown in FIG. 16 yields an upper prediction error component with the waveform shown in FIG. 18.
[0056]
(Lower prediction error component audio output)
Further, when the lower fixed-length sample sequence is converted by the acoustic signal conversion means and output, the lower-bit component of the prediction error component can be heard as sound. At the same time, its signal waveform is output by the display means 4. In this way, data that is originally the quantization noise component of the prediction error component yields a new acoustic signal with a special effect. For example, processing a PCM acoustic signal with the waveform shown in FIG. 16 yields a lower prediction error component with the waveform shown in FIG. 19.
[0057]
(Decoding)
Next, decoding of the code data encoded by the encoding device is described. FIG. 20 is a functional block diagram showing the configuration of the time-series signal decoding apparatus according to the present invention. In FIG. 20, 91 is a data reading means, 92 is an upper sample conversion means, 93 is a data integration means, 94 is a frame restoration means, 95 is a channel restoration means, 96 is an independent sample restoration means, and 97 is a signal flat part inserting means. The configuration shown in FIG. 20 is realized by a computer and a dedicated software program installed on the computer.
[0058]
Next, the processing operation of the decoding device shown in FIG. 20 is described. First, the data reading means 91 reads the recording medium on which code data as shown in FIG. 11 is recorded. Of the data read, the data reading means 91 passes the upper variable-length sample sequence and the lookup table to the upper sample conversion means 92. By referring to the lookup table, the upper sample conversion means 92 restores from the upper variable-length sample sequence an upper fixed-length sample sequence with a fixed length of 12 bits (11 bits for values of "0"). If the lookup table is as shown in FIG. 8(a) or FIG. 9, the bit data of the upper variable-length sample sequence can simply be read and restored in order, but with a lookup table as shown in FIG. 8(b) the conversion requires special handling. In that case the separator is the 1 bit "0", so the bit pattern "01" ought to be prohibited, yet as shown in FIG. 8(b) some of the converted bit strings contain "01". This embodiment deals with this by changing the order in which the bits of each pattern are written. Specifically, in the case of FIG. 8(a) or FIG. 9, writing starts from the second bit and the first bit, which is always 1, is written last; in the case of FIG. 8(b), writing starts from the third bit and the first and second bits are written last. For example, the bit string "101" of rank 4 contains the bit pattern "01". For such a bit string, the third bit "1" is read first, the "01" pattern formed by the separator bit and the first bit is then recognized, and the second bit is read last, so the separator is not misrecognized. The upper sample conversion means 92 thus recognizes the bit string "101" and can restore the original fixed-length bit string according to the lookup table.
[0059]
Further, the upper sample conversion means 92 inserts the upper signal flat part data that was read at the specified positions of the upper fixed-length sample sequence. Subsequently, the data integration means 93 integrates the upper fixed-length sample sequence and the lower fixed-length sample sequence. Specifically, 12 bits are taken from the upper fixed-length sample sequence and 4 bits from the lower fixed-length sample sequence, and these are combined sample by sample. The data integration means 93 then converts the sample sequence, expressed as a 1-bit polarity part and a 15-bit numerical part, back into 16-bit signed values.
[0060]
For this sample sequence, the frame restoration means 94 restores each deleted frame by inserting, at the address position of the target frame defined in the frame correlation data, a sample sequence identical to the one in the section corresponding to the correlation frame defined in the frame correlation data. As a result, the sample sequence shown in FIG. 2(d) is restored. The channel restoration means 95 then uses the inter-channel data to identify which channel's sample sequence is original and which channel's sample sequence is difference information relative to which other channel, and restores the sample sequence of each channel. At this point the value of each sample is still recorded as a prediction error based on one to four preceding sample values, so the independent sample restoration means 96 restores the original sample values x(t) one after another, using [Formula 1] to [Formula 4] rearranged so that x(t) is on the left side. Finally, the signal flat part inserting means 97 uses the signal flat part data shown in FIG. 2(e) to insert the signal flat parts at the specified positions of the sample sequence, as shown in FIG. 2(b). As a result, the digital acoustic signal, in the state in which the analog signal was converted to PCM, is restored without any loss of data.
[0061]
[Effect of the invention]
As described above, according to the present invention, for a time-series signal composed of a time-series sample sequence, the value of each sample in the sample sequence is converted into a prediction error value from a plurality of temporally preceding samples. A position at which to divide the bit data representing each sample value converted to a prediction error value is set, and the bit data is divided at the set bit position and separated into an upper sample sequence composed of upper-bit samples and a lower sample sequence composed of lower-bit samples. The upper sample sequence is encoded with a variable-length code, and the lower sample sequence is encoded with a fixed-length code. Because the ratio of data corresponding to the upper sample sequence, the ratio of data corresponding to the lower sample sequence, the ratio of data obtained by encoding the upper samples, and the ratio of data obtained by encoding the lower samples within the time-series signal are displayed, the time-series signal can be losslessly compressed efficiently and the compression accuracy of the compressed data can be analyzed.
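The bit separation and the displayed ratios can be illustrated with a short Python sketch. It assumes 16-bit samples and takes the encoded sizes as inputs rather than implementing the variable-length code itself. The names `split_sample` and `data_ratios` and the dictionary keys are hypothetical.

```python
def split_sample(word16: int, lower_bits: int):
    """Divide a 16-bit word at the set bit position into (upper, lower)."""
    if not 0 <= word16 < (1 << 16):
        raise ValueError("sample must fit in 16 bits")
    if not 0 < lower_bits < 16:
        raise ValueError("separation position must be between 1 and 15")
    upper = word16 >> lower_bits
    lower = word16 & ((1 << lower_bits) - 1)
    return upper, lower

def data_ratios(n_samples, lower_bits, encoded_upper_bits,
                encoded_lower_bits, sample_bits=16):
    """Express each component as a fraction of the original signal size."""
    total = n_samples * sample_bits
    return {
        "upper": n_samples * (sample_bits - lower_bits) / total,
        "lower": n_samples * lower_bits / total,
        "encoded_upper": encoded_upper_bits / total,
        "encoded_lower": encoded_lower_bits / total,
    }
```

For example, with a separation position of 4 bits the lower component always accounts for a quarter of the original data before encoding, and the fixed-length code leaves that share unchanged, so any reduction has to come from the variable-length coding of the upper component.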
[Brief description of the drawings]
FIG. 1 is a functional block diagram showing an embodiment of a time-series signal encoding apparatus according to the present invention.
FIG. 2 is a diagram showing processing by the signal flat part processing means 10 and the inter-channel calculation means 30.
FIG. 3 is a diagram showing the prediction error calculation process performed by the prediction error conversion means 20.
FIG. 4 is a flowchart showing processing by the inter-frame calculating means 40.
FIG. 5 is a diagram showing the overall state of a time-series signal under processing by the inter-frame calculating means 40.
FIG. 6 is a diagram showing a state of samples compared by the processing of the inter-frame calculating means 40.
FIG. 7 is a diagram showing a state of processing by the data separating means 50.
FIG. 8 is a diagram illustrating an example of a lookup table when the type of sample absolute value is less than 640.
FIG. 9 is a diagram illustrating an example of a lookup table when the type of sample absolute value is 640 or more.
FIG. 10 is a diagram schematically illustrating the conversion of the bit length of an upper sample.
FIG. 11 is a diagram showing code data obtained by the time-series signal encoding apparatus according to the present invention.
FIG. 12 is a flowchart showing processing by the analysis means 3.
FIG. 13 is a diagram showing a display example of each data ratio processed by the analysis means 3.
FIG. 14 is a diagram showing optimum order data displayed on the display means 4.
FIG. 15 is a diagram showing the frame correlation displayed on the display means 4.
FIG. 16 is a diagram illustrating a waveform of an original sound signal.
FIG. 17 is a diagram showing a waveform of a prediction error component obtained by processing the original sound signal.
FIG. 18 is a diagram showing a waveform of an upper prediction error component obtained by processing the original sound signal.
FIG. 19 is a diagram showing a waveform of a lower prediction error component obtained by processing the original sound signal.
FIG. 20 is a functional block diagram showing a time-series signal decoding apparatus.
[Explanation of symbols]
1 ... Time-series signal input means
2 ... Storage means
3 ... Analysis means
4 ... Display means
5 ... Separation position means
6 ... Acoustic signal conversion means
7 ... Audio output means
10 ... Signal flat part processing means
20 ... Prediction error conversion means
30 ... Inter-channel calculation means
40 ... Inter-frame calculation means
50 ... Data separation means
60 ... Upper sample encoding means
70 ... Lower sample encoding means
91 ... Data reading means
92 ... Upper sample conversion means
93 ... Data integration means
94 ... Frame restoration means
95 ... Channel restoration means
96 ... Independent sample restoration means
97 ... Signal flat part insertion means

Claims

A time-series signal compression analysis apparatus that, for a time-series signal composed of a time-series sample sequence, compresses the amount of information such that the entire sample sequence can be reproduced and analyzes the compressed information, comprising:
Prediction error conversion means for converting the value of each sample in the sample sequence into a prediction error value from a plurality of temporally preceding samples;
Data separation means for setting a position at which to divide the bit data representing each sample value converted to the prediction error value, dividing the bit data at the set bit position, and separating it into an upper sample sequence composed of upper-bit samples and a lower sample sequence composed of lower-bit samples;
Upper sample encoding means for encoding the upper sample sequence with a variable-length code;
Lower sample encoding means for encoding the lower sample sequence with a fixed-length code; and
Data display means for displaying, within the time-series signal, the ratio of data corresponding to the upper sample sequence, the ratio of data corresponding to the lower sample sequence, the ratio of data encoded by the upper sample encoding means, and the ratio of data encoded by the lower sample encoding means.

In claim 1,
Further comprising signal flat part encoding means for extracting, from the sample sequence, a signal flat part in which the sample values are continuously the same, separating it from the sample sequence, and encoding three values, namely the start time position of the separated samples, the number of samples, and the sample value, as signal flat part data,
The time-series signal compression analysis apparatus, wherein the data display means further displays the ratio of the signal flat part and the ratio of the signal flat part data.
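For illustration only, and without limiting the claim, the extraction of signal flat parts as (start position, number of samples, sample value) triples and their later reinsertion can be sketched in Python. The minimum run length `min_run` is a hypothetical parameter introduced for this sketch, and start positions are assumed to be expressed in the coordinates of the original sequence.

```python
def extract_flat_parts(samples, min_run=4):
    """Separate runs of identical values from the sample sequence.
    Returns (remaining_samples, flat_parts), where each flat part is a
    (start, count, value) triple in original-sequence coordinates."""
    remaining, flat_parts = [], []
    i, n = 0, len(samples)
    while i < n:
        j = i
        while j < n and samples[j] == samples[i]:
            j += 1                      # extend the run of equal values
        if j - i >= min_run:
            flat_parts.append((i, j - i, samples[i]))
        else:
            remaining.extend(samples[i:j])
        i = j
    return remaining, flat_parts

def insert_flat_parts(remaining, flat_parts):
    """Reinsert flat parts in ascending start order to rebuild the input."""
    out = list(remaining)
    for start, count, value in sorted(flat_parts):
        out[start:start] = [value] * count
    return out
```

Reinserting in ascending order of start position means that everything before each insertion point has already been rebuilt, so the recorded positions remain valid.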

In claim 1,
Further comprising, when the sample sequence is composed of a plurality of channels having a plurality of values at the same time, inter-channel calculation means for applying a predetermined calculation to the sample sequences between the channels and updating the sample sequence of one of the channels,
The time-series signal compression analysis apparatus, wherein the data display means further displays the ratio of each data for each channel.

In claim 1,
Further comprising frame detection means for setting frames each composed of a predetermined number of samples from the sample sequence, taking each frame as a target frame and searching temporally preceding samples to detect whether there is a correlation frame in which every sample has the same value as the corresponding sample of the target frame, and, when such a correlation frame exists, encoding information associating the target frame with the correlation frame as frame correlation data and deleting the samples of the target frame from the sample sequence,
The time-series signal compression analysis apparatus, wherein the data display means further displays the position of the target frame in the time-series signal when displaying the ratio of each data.
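For illustration only, and without limiting the claim, frame detection and the corresponding restoration can be sketched in Python. The sketch assumes that frame correlation data is a list of (target address, correlation address) pairs in original-sequence coordinates, that the correlation frame lies entirely before the target frame, and that a simple linear search is acceptable. The function names are hypothetical.

```python
def detect_correlated_frames(samples, frame_len):
    """Delete each frame that exactly repeats an earlier section.
    Returns (kept_samples, correlations), where each correlation is a
    (target_start, source_start) pair."""
    kept, correlations = [], []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        match = None
        if len(frame) == frame_len:
            # Search earlier sections that end at or before the target frame.
            for past in range(0, start - frame_len + 1):
                if samples[past:past + frame_len] == frame:
                    match = past
                    break
        if match is None:
            kept.extend(frame)
        else:
            correlations.append((start, match))
    return kept, correlations

def restore_frames(kept, correlations, frame_len):
    """Copy each correlation frame back to its target address."""
    out = list(kept)
    for target, source in sorted(correlations):
        out[target:target] = out[source:source + frame_len]
    return out
```

Restoring in ascending order of target address ensures that the section being copied has itself already been rebuilt when it is needed.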

A time-series signal conversion apparatus that, for a time-series signal composed of a time-series sample sequence, compresses the amount of information such that the entire sample sequence can be reproduced, converts data created in the compression process into an acoustic signal, and outputs it, comprising:
Prediction error conversion means for converting the value of each sample in the sample sequence into a prediction error value from a plurality of temporally preceding samples;
Data separation means for setting a position at which to divide the bit data representing each sample value converted to the prediction error value, dividing the bit data at the set bit position, and separating it into an upper sample sequence composed of upper-bit samples and a lower sample sequence composed of lower-bit samples;
Upper sample encoding means for encoding the upper sample sequence with a variable-length code;
Lower sample encoding means for encoding the lower sample sequence with a fixed-length code;
Acoustic signal conversion means for converting the data created by the prediction error conversion means or the data separation means into an acoustic signal;
Audio output means for outputting the converted acoustic signal as sound; and
Display means for displaying the converted acoustic signal as a waveform.

In claim 5,
The time-series signal conversion apparatus, wherein the acoustic signal conversion means converts the sample sequence converted by the prediction error conversion means into an acoustic signal.

In claim 5,
The time-series signal conversion apparatus, wherein the acoustic signal conversion means converts the upper sample sequence separated by the data separation means into an acoustic signal.

In claim 5,
The time-series signal conversion apparatus, wherein the acoustic signal conversion means converts the lower sample sequence separated by the data separation means into an acoustic signal.