JP3884131B2

JP3884131B2 - Data compression device and data decompression device

Info

Publication number: JP3884131B2
Application number: JP21512597A
Authority: JP
Inventors: 正昭勝俣
Original assignee: Roland Corp
Current assignee: Roland Corp
Priority date: 1997-08-08
Filing date: 1997-08-08
Publication date: 2007-02-21
Anticipated expiration: 2017-08-08
Also published as: JPH1152996A

Description

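[0001]
[Technical Field of the Invention]
The present invention relates to a data compression device that applies data compression to phrase data consisting of a sequence of audio data, and to a data decompression device that applies data decompression to compressed phrase data representing an audio phrase. In the present invention, “audio” is a concept that covers every sound in the audible range, including musical tones and sound effects as well as the human voice.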
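[0002]
[Prior Art]
Conventionally, phrase data representing audio phrases has commonly been stored on storage media and sent and received over communication media. In such cases it is economical to compress the phrase data and to store or transmit the resulting compressed data, because less storage capacity and less communication time are needed. For greater economy, compression with a higher compression ratio is preferable. However, a musical tone reproduced from compressed data generated at a higher compression ratio tends to have lower sound quality, so keeping the quality of the reproduced tone within an acceptable range requires choosing compression with a lower compression ratio. The desired compression ratio is therefore determined by the trade-off between economy and sound quality, and a compression process with that ratio is selected.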
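[0003]
[Problems to Be Solved by the Invention]
Some audio, such as the sound of a piano performance, requires high sound quality (musical tones), while other audio, such as noise-like sound effects, tolerates low sound quality. The desired compression ratio determined by the trade-off between economy and sound quality therefore varies with the type of audio, and it is desirable to generate compressed data by selecting a compression process whose compression ratio suits that type.
[0004]
However, when multiple types of audio are reproduced from multiple sets of compressed data generated in this way, each representing a different type of audio at a different compression ratio, each set must be decompressed by the decompression process that corresponds to the compression process that generated it. Decompression therefore becomes cumbersome, and simultaneous reproduction becomes difficult.
[0005]
In view of the above, an object of the present invention is to provide a data compression device and a data decompression device with which decompression and simultaneous reproduction are easy even when multiple sounds are reproduced from multiple sets of compressed data having different compression ratios.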
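[0006]
[Means for Solving the Problems]
A data compression device of the present invention that achieves the above object is a data compression device that applies data compression to phrase data that represents an audio phrase and consists of a sequence of audio data, characterized by comprising:
parameter association means for associating with the audio phrase a plurality of parameters that define a compression ratio; and
data compression means for compressing the phrase data representing the audio phrase, using a data compression algorithm with a variable compression ratio, at the compression ratio determined by the plurality of parameters associated with that audio phrase.
[0007]
In the present invention, an audio phrase consists of a single sound or a series of consecutive sounds, and phrase data is waveform data representing the waveform of the sound.
With this data compression device, a single data compression algorithm can generate multiple sets of compressed data having different compression ratios, so all of them can easily be decompressed with a single data decompression algorithm.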
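[0008]
In the data compression device of the present invention, it is preferable that, when compressing the phrase data, the data compression means does the following for each section obtained by dividing the phrase data into groups of a predetermined number of consecutive audio data: it generates, from the consecutive audio data in the section, an exponent data part that represents a reference data level common to the data levels corresponding to the respective data levels of those audio data; it generates a mantissa data part that expresses, in units of the reference data level, each of the data levels corresponding to the data levels of the consecutive audio data in the section, using a number of bits determined from the parameters associated with the phrase data and from the reference data level; and it replaces the series of audio data in the section with compressed data consisting of the exponent data part and the mantissa data part.
[0009]
Here, the expressions “the data levels corresponding to the respective data levels of the audio data” and “the data levels corresponding to the data levels of the audio data” mean that the levels in question may be the data levels of the audio data themselves, the data levels of difference data between audio data, or anything else from which the data levels of the original audio data can be restored.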
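[0010]
A data decompression device of the present invention that achieves the above object is a data decompression device that applies data decompression to compressed phrase data obtained by compressing phrase data, which represents an audio phrase and consists of a sequence of audio data, with a data compression algorithm having a variable compression ratio at a compression ratio suited to the audio phrase, characterized by comprising:
parameter acquisition means for acquiring a plurality of parameters that correspond to the compressed phrase data and define its compression ratio; and
data decompression means for decompressing the compressed phrase data on the basis of a data decompression algorithm corresponding to the data compression algorithm and the plurality of parameters acquired by the parameter acquisition means.
[0011]
With this data decompression device, multiple sets of compressed data having different compression ratios, generated by the data compression device of the present invention, can easily be decompressed with a single data decompression algorithm.
In the data decompression device of the present invention, it is preferable that the compressed phrase data is made up of compressed data consisting of an exponent data part and a mantissa data part, where, for each section obtained by dividing the phrase data (which represents an audio phrase and consists of a sequence of audio data) into groups of a predetermined number of consecutive audio data, the exponent data part represents a reference data level common to the data levels corresponding to the respective data levels of the consecutive audio data in the section, and the mantissa data part expresses, in units of the reference data level, each of the data levels corresponding to the data levels of the consecutive audio data in the section, using a number of bits determined from the plurality of parameters associated with the phrase data and from the reference data level, and
that the decompression means acquires the exponent data part of the compressed data, determines the number of bits of the mantissa part of that compressed data from the exponent data part and the parameters corresponding to the compressed phrase data that contains the compressed data, acquires the mantissa data part according to that number of bits, and decompresses the compressed data.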
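[0012]
It is also preferable that the data decompression means of the data decompression device of the present invention decompresses a plurality of sets of compressed phrase data simultaneously, on the basis of a data decompression algorithm that corresponds to the data compression algorithm and is common to all of those sets, and of the plurality of parameters acquired by the parameter acquisition means for each of those sets.
[0013]
Furthermore, a data decompression device of the present invention is characterized by comprising data decompression means for simultaneously decompressing a plurality of sets of compressed phrase data, each obtained by compressing phrase data, which represents an audio phrase and consists of a sequence of audio data, with a data compression algorithm having a variable compression ratio at a compression ratio suited to the audio phrase, on the basis of a data decompression algorithm that corresponds to the data compression algorithm and is common to all of those sets, and of parameters that define the compression ratio of each set and correspond to that set.
[0014]
With this data decompression device, multiple sets of compressed data having different compression ratios can easily be decompressed with a single data decompression algorithm, and the multiple audio phrases based on those sets can be reproduced simultaneously.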
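[0015]
[Embodiments of the Invention]
Embodiments of the present invention are described below.
FIG. 1 is a block diagram showing one embodiment of the data compression device of the present invention.
The data compression device 1 comprises a memory 11, an input unit 12, and a block forming unit 13. The memory 11 stores, for a plurality of audio phrases, 16-bit PCM data in two's complement format, which is the audio data referred to in the present invention. The input unit 12 reads the PCM data from the memory 11 and sends it to the block forming unit 13, which forms a block, the section referred to in the present invention, from every 16 items of PCM data.
[0016]
The input unit 12 is not limited to this embodiment: it may read the PCM data from an A/D converter, read it from an external storage device such as a hard disk, or receive it from a communication line.
The data compression device 1 also comprises a data compression unit 14, a parameter setting unit 15, an output unit 16, and a memory 17. Through the parameter setting unit 15, the user freely sets, for each audio phrase, the parameters required for the compression process described later. On the basis of these parameters, the data compression unit 14 applies the data compression process described later to the PCM data sent one block at a time from the block forming unit 13, and sends the resulting compressed data to the output unit 16 one block at a time. The data compression process is described in detail below with reference to Table 1.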
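[0017]
[Table 1]

[0018]
The second column from the left of Table 1 shows 16 items of PCM data; as noted above, in this embodiment 16 items of PCM data form one block. For each of the 16 items of PCM data, Table 1 also shows the absolute value, the mantissa, and the decompressed data. The PCM data, absolute values, mantissas, and decompressed data are all given in hexadecimal. Values in hexadecimal are marked below with a leading “$”. A set of mutually associated PCM data, absolute value, mantissa, and decompressed data is referred to below as a sample, and the 16 samples in Table 1 are numbered 0 to 15.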
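[0019]
The procedure for creating compressed data from the one block of PCM data shown in Table 1 is as follows. First, the PCM data with the largest absolute value is found among these PCM data. In the example of Table 1, this is the PCM data of sample 15.
Next, the lowest-order sign bit of the PCM data of sample 15, which has the largest absolute value, is determined. The PCM value “$F8C4” of sample 15 is “1111100011000100” in binary. Because this embodiment uses PCM data in two's complement format, as noted above, this PCM data is negative. The bits are searched in order from the MSB, and the lowest-order sign bit is the bit one position above the position where the first 0 appears (the first 1 if the PCM data is positive). If the LSB is called bit 0 and the bits toward the MSB are called bit 1, bit 2, and so on, the lowest-order sign bit in this example is bit 11. When the PCM value is “$0000” or “$FFFF”, all bits have the same value and the above procedure finds no lowest-order sign bit, so bit 0 is taken as the lowest-order sign bit. The value indicating the position of this lowest-order sign bit is referred to below as the reference exponent. The reference exponent represents the maximum volume of the audio data within one block; in the example of Table 1 it is “11”. Because the PCM data is 16 bits wide, the reference exponent takes a value from 0 to 15, and in this embodiment it is represented with 4 bits.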
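[0020]
From each of the 16 items of PCM data, a mantissa is extracted starting at the bit position indicated by the reference exponent and extending toward the LSB for the number of bits determined by the method described below. The reference exponent and the 16 mantissas form the compressed data for one block. The value indicating the bit position from which the lowest-order bit of the mantissa is extracted is referred to below as the effective exponent. In the example of Table 1 the reference exponent is “11”, so if the mantissa is 7 bits long the effective exponent is “5”.
[0021]
Experiments have shown that, to maintain the sound quality of a musical tone, the mantissa must be long when the tone is loud, but that shortening the mantissa when the tone is quiet does not degrade the sound quality. Therefore, for example, the mantissa is given 8 bits for a loud tone and is shortened to 7 bits, 6 bits, and so on as the volume falls. This achieves a higher compression ratio than a compression process with a fixed mantissa length while maintaining sound quality equal to or better than that of such a process.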
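[0022]
In this embodiment, two parameters are set for calculating the mantissa bit length: the maximum bit length max, which is the mantissa bit length when the reference exponent is “15”, and the step stp, which expresses how quickly the mantissa bit length shrinks as the reference exponent decreases. For each block, the mantissa bit length bit is determined from the reference exponent exp by the following formula.
[0023]
bit = max - (15 - exp) / stp
Here the operation “x / y” yields the integer part of x divided by y.
Depending on the type of musical tone, sound quality may be maintained even if the low-order bits of the PCM data are ignored from the outset and a lower limit is placed on the effective exponent. In this embodiment, therefore, a minimum effective exponent min is set as a parameter. When the effective exponent (exp - (bit - 1)), obtained from the reference exponent exp and the mantissa bit length bit calculated above, is smaller than the minimum effective exponent min, the mantissa bit length bit is determined by the following formula.
[0024]
bit = exp - min + 1
Naturally, the actual mantissa bit length cannot be negative, so when the calculated mantissa bit length bit is negative, the mantissa bit length is set to “0” and the compressed data consists of the exponent alone.
Tables 2 and 3 show examples of the mantissa bit length and the effective exponent obtained from the reference exponent for given values of the maximum bit length max, the step stp, and the minimum effective exponent min.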
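[0025]
[Table 2]

[0026]
[Table 3]

[0027]
In the example of Table 2, the maximum bit length max is set to “6” and the step stp to “3”, so the mantissa bit length shrinks by “1” each time the reference exponent decreases by “3”. Because the minimum effective exponent min is set to “4”, the mantissa bit length is “2” when the reference exponent is “5” and “1” when the reference exponent is “4”, keeping the effective exponent at “4”.
[0028]
In the example of Table 3, the maximum bit length max is set to “8” and the step stp to “4”, so the mantissa bit length shrinks by “1” each time the reference exponent decreases by “4”. Because the minimum effective exponent min is set to “0”, a mantissa exists even when the reference exponent is “0”, and this setting gives higher sound quality than the example of Table 2.
[0029]
The mantissas shown in Table 1 were obtained under the conditions of Table 3. Because the reference exponent is “11”, as noted above, the mantissa has “7” bits; that is, bits 11 through 5 are extracted as the mantissa. Taking the PCM data of sample 15 as an example, that data is “1111100011000100” in binary, as noted above, so extracting bits 11 through 5 gives “1000110”. In hexadecimal this is “$46”, as shown in Table 1.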
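[0030]
As described above, adjusting the parameters before compression optimizes the balance between compression ratio and sound quality. Moreover, because changing the parameters produces compressed data with different compression ratios from a single data compression algorithm, it is easy to decompress each of the resulting sets of compressed data.
[0031]
In addition to the parameters described above, a parameter ext that indicates the width of the range of exponents for which the mantissa bit length is max may be set. In that case, when the value of 15 - ext is smaller than the reference exponent exp, the mantissa bit length bit is set to the maximum bit length max, and when the value of 15 - ext is equal to or greater than the reference exponent exp, the mantissa bit length bit may be determined by the following formula.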
[0032]
bit = max - 1 - (15 - ext - exp) / stp
Here, as above, the operation “x / y” yields the integer part of x divided by y.
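The ext variant of the bit-length rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `mantissa_bits_ext` and the argument name `max_bits` are introduced here, and the sketch leaves out the minimum effective exponent because the text does not say how the two rules combine.

```python
def mantissa_bits_ext(exp: int, max_bits: int, stp: int, ext: int) -> int:
    """Mantissa bit length when the top `ext` exponents all use the maximum length."""
    if 15 - ext < exp:
        # Reference exponent lies inside the plateau: use the maximum length.
        return max_bits
    # Below the plateau: start one bit shorter and keep stepping down.
    bits = max_bits - 1 - (15 - ext - exp) // stp
    return max(bits, 0)  # a negative result means "no mantissa"


# Hypothetical settings for illustration only: max = 8, stp = 4, ext = 2.
for exp in (15, 14, 13, 10, 9):
    print(exp, mantissa_bits_ext(exp, 8, 4, 2))
```

With these illustrative settings the two highest exponents share the maximum length, and the stepwise reduction starts from exponent 13.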
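When the user, through command means not shown in FIG. 1, designates the phrase to be compressed and issues a command to start compression, the output unit 16 shown in FIG. 1 first stores the parameters set by the parameter setting unit 15 in the header area of the memory 17. Once compression has started, the compressed data generated by the data compression unit 14 is stored sequentially in the compressed data area of the memory 17. For the last block of an audio phrase, if the final PCM data of the phrase is not the last data in that block, all mantissa data after it in the block are set to 0. In this embodiment, the compressed data storage area is divided into an area for reference exponents and an area for mantissas. This makes data management easier when a memory with 16-bit words is used: because a reference exponent is 4 bits, the reference exponents of four blocks fit exactly into one memory word, and because one block has 16 mantissas, the data length of one block is an integer multiple of 16 bits regardless of the mantissa bit length.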
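The word-alignment argument can be checked with a short Python sketch. The helper names are hypothetical, and the nibble order inside the word (first block in the highest nibble) is an assumption, since the text does not specify it.

```python
def pack_exponents(exps: list[int]) -> int:
    """Pack four 4-bit reference exponents into one 16-bit word (assumed order: first is highest)."""
    assert len(exps) == 4 and all(0 <= e <= 15 for e in exps)
    word = 0
    for e in exps:
        word = (word << 4) | e
    return word


def mantissa_words_per_block(bits: int) -> int:
    """Number of 16-bit words taken by the 16 mantissas of one block."""
    total_bits = 16 * bits          # 16 mantissas of `bits` bits each
    assert total_bits % 16 == 0     # always a whole number of words
    return total_bits // 16


print(hex(pack_exponents([11, 5, 0, 15])))
print(mantissa_words_per_block(7))
```

A block with a 7-bit mantissa therefore occupies exactly 7 words, which is why keeping exponents and mantissas in separate areas keeps every block word-aligned.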
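[0033]
The memory layout is of course not limited to that of this embodiment; the parameters, reference exponents, and mantissas may be stored contiguously in the same area.
When compression of the series of PCM data representing one audio phrase is complete, the output unit 16 additionally stores the following in the header area of that audio phrase in the memory 17: data indicating the address in the reference exponent area at which the reference exponent of the compressed data of the first block of the phrase is stored; data indicating the address in the compressed data area at which the first mantissa of the compressed data of the first block of the phrase is stored; data indicating the address in the compressed data area at which the first mantissa of the compressed data of the last block of the phrase is stored; and the relative address within its block (that is, its position counted from the start of the block) of the mantissa corresponding to the final PCM data of the phrase. By setting parameters and issuing a compression start command for each audio phrase, compressed data is generated for each audio phrase and is associated with the parameters that define the compression ratio used to generate it.
[0034]
In this embodiment the parameters are stored in the header area of the memory 17 and passed to the data decompression device automatically, but the user may instead record the parameters and enter them into the data decompression device at decompression time.
The output unit 16 is likewise not limited to this embodiment: it may write the compressed data to an external storage device such as a hard disk, or send it out over a communication line.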
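[0035]
FIG. 2 is a block diagram showing a musical tone reproduction device that includes one embodiment of the data decompression device of the present invention.
The musical tone reproduction device 2 comprises a memory 21, an input unit 22, and a parameter acquisition unit 23. The memory 21 stores, in the format described above, the compressed data generated by the data compression device 1 shown in FIG. 1. The input unit 22 starts operating when the user issues a command to start decompression through command means (not shown). Before decompression, the input unit 22 reads data such as the parameters corresponding to the audio phrase to be decompressed from the header area of the memory 21 and sends it to the parameter acquisition unit 23. After decompression starts, the input unit 22 reads the compressed data from the compressed data area of the memory 21. The parameter acquisition unit 23 extracts the parameters and other items corresponding to the audio phrase to be decompressed from the data sent by the input unit 22.
[0036]
The musical tone reproduction device 2 also comprises a data decompression unit 24 and an output unit 25; the elements from the memory 21 through the output unit 25 correspond to one embodiment of the data decompression device of the present invention. The data decompression unit 24 can decompress, in parallel, multiple sets of compressed phrase data representing multiple audio phrases by processing them in a time-division manner. This is equivalent to the data decompression unit 24 having a plurality of tracks, each corresponding to one audio phrase to be decompressed and each decompressing the compressed phrase data that represents its audio phrase. The description below assumes that such tracks exist. For each track independently, the user can, through command means (not shown), start and stop decompression of any phrase. The principle of the decompression performed by each track is described below.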
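[0037]
Before decompression starts, each track receives from the parameter acquisition unit 23 the parameters corresponding to its audio phrase. After decompression starts, it calculates the mantissa bit length for each block from these parameters and the reference exponent sent from the input unit 22.
When the mantissa bit length is “0”, the track generates 16 items of PCM data with the value “$0000” as decompressed data and sends them to the output unit 25. Alternatively, when the mantissa bit length is “0”, the data decompression unit 24 may generate no decompressed data and instead notify the output unit 25 that the mantissa bit length is “0”, leaving the output unit 25 to generate the decompressed data.
[0038]
When the mantissa bit length is positive, the track acquires 16 mantissas through the input unit 22 according to that bit length. Each mantissa is stored right-aligned in one of sixteen 16-bit registers, sign extension is performed by copying the MSB of each mantissa into all higher-order bits, all registers are shifted left by the effective exponent obtained from the parameters and the reference exponent, and, when the effective exponent is not “0”, the bit immediately below the LSB of the mantissa is set to 1 (see the fourth column of Table 1). This last step compensates for the truncation performed during compression. In the example of Table 1, the effective exponent is “5”; for sample 11 the PCM data is “$FBFD” and the mantissa is “$5F”, so without this last step the decompressed data would be “$FBE0” with a truncation error of “$001D”, whereas with it the decompressed data is “$FBF0”, as shown in Table 1, and the truncation error is reduced to “$000D”.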
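The per-sample decompression steps (sign extension, left shift, half-step correction) can be sketched in Python as follows. The function name `expand_mantissa` is an assumption introduced here; Python integers stand in for the 16-bit registers, so the result is masked to 16 bits only for display.

```python
def expand_mantissa(mantissa: int, bits: int, eff: int) -> int:
    """Rebuild a signed PCM value from a mantissa of `bits` bits and effective exponent `eff`."""
    if bits == 0:
        return 0                          # exponent-only block: output silence
    value = mantissa & ((1 << bits) - 1)  # right-aligned mantissa
    if value >> (bits - 1):               # mantissa MSB set: negative sample
        value -= 1 << bits                # sign extension
    value <<= eff                         # move the mantissa back to its bit position
    if eff > 0:
        value |= 1 << (eff - 1)           # set the bit just below the mantissa LSB
    return value


# Sample 11 of the worked example: mantissa $5F, 7 bits, effective exponent 5.
restored = expand_mantissa(0x5F, 7, 5)
print(f"${restored & 0xFFFF:04X}")        # $FBF0
```

Setting the bit below the mantissa places the restored value in the middle of the truncated interval, which is what reduces the error from $001D to $000D in the example.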
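[0039]
As a result of this decompression, the compressed data is decompressed into 16 items of PCM data, which are supplied to the output unit 25 as decompressed data. The compressed data of each subsequent block is decompressed in the same way.
The operation of the data decompression unit 24 is described below with reference to flowcharts.
FIG. 3 is a flowchart showing the operation of the data decompression unit 24.
[0040]
When the user instructs the n-th track to start decompression, processing not shown in the figure sets the flag flag[n], which indicates the processing state of that track, to “1”, meaning decompression in progress. The variable point[n], which indicates the address at which the first mantissa of one block of compressed data is stored, is assigned the address at which the first mantissa of the first block of the audio phrase corresponding to the n-th track is stored. The variable count[n], which indicates the relative address within the block (the position counted from the start of the block), is set to “0”. When the user instructs the n-th track to stop decompression, processing not shown sets flag[n] to “0”, meaning decompression stopped.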
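[0041]
The data decompression unit 24 executes the operation shown in the flowchart once every clock period of the sampling frequency.
When the data decompression unit 24 starts operating, the variable n, which indicates the track number, is initialized to “0” in step S101, and the process proceeds to step S102, where the value of flag[n] is checked.
[0042]
If flag[n] is found to be “0” in step S102, the n-th track is stopped, so the process proceeds to step S103, where the value “$0000” is sent to the output unit 25 as the PCM data of the n-th track, and then to step S111.
If flag[n] is found to be “1” in step S102, the n-th track is being decompressed, so the process proceeds to step S104. There, using the mantissa bit length determined by the reference exponent and the parameters, decompressed data is generated from the count[n]-th mantissa, counted from the start of the block, of the one block of mantissas whose first mantissa is stored at the address indicated by point[n], and is sent to the output unit 25. The process then proceeds to step S105, where it is determined whether the end of the compressed phrase data representing the audio phrase of the n-th track has been reached. This is determined by comparing point[n] with the address in the compressed data area at which the first mantissa of the last block of the audio phrase is stored, and comparing count[n] with the relative address, within its block, of the mantissa corresponding to the final PCM data of the audio phrase.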
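[0043]
If it is determined in step S105 that the end of the compressed phrase data has been reached, the process proceeds to step S106, where flag[n] is set to “0” so that the n-th track becomes stopped, and then to step S111.
If it is determined in step S105 that the end of the compressed phrase data has not been reached, the process proceeds to step S107, where count[n] is incremented, and then to step S108.
[0044]
In step S108 it is determined whether count[n] has reached “16”, that is, whether one block of data has been output. If count[n] has not reached “16”, one block of data has not yet been output, so the process proceeds to step S111 without doing anything. If count[n] has reached “16”, one block of data has been output, so count[n] is set to “0” in step S109 and the process proceeds to step S110. There the mantissa bit length is obtained from the reference exponent and the parameters, point[n] is advanced by one block according to that bit length, and the reference exponent of the compressed data of the next block is acquired. The process then proceeds to step S111.
[0045]
In step S111 the variable n is incremented, and the process proceeds to step S112, where it is determined whether n has reached the number of tracks TRK. If it has, the operation ends; if not, the process returns to step S102 and the above operation is repeated for the next track.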
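The per-sample loop of FIG. 3 can be sketched in Python as follows. This is a simplified model rather than the device itself: the `Track` class and the `tick` function are names introduced here, each track holds its blocks as a list of (reference exponent, mantissas) pairs, and `point` is a block index standing in for the memory address kept in point[n].

```python
from dataclasses import dataclass

BLOCK = 16  # samples per block


@dataclass
class Track:
    blocks: list      # [(reference_exponent, [16 mantissas]), ...]
    params: tuple     # (max_bits, stp, min_eff)
    last_count: int   # position of the phrase's final sample in the last block
    flag: int = 0     # 1 = decompressing, 0 = stopped
    point: int = 0    # current block (index, not an address)
    count: int = 0    # position within the current block


def mantissa_bits(exp, max_bits, stp, min_eff):
    bits = max_bits - (15 - exp) // stp
    if exp - (bits - 1) < min_eff:
        bits = exp - min_eff + 1
    return max(bits, 0)


def expand_mantissa(mantissa, bits, eff):
    if bits == 0:
        return 0
    value = mantissa & ((1 << bits) - 1)
    if value >> (bits - 1):
        value -= 1 << bits
    value <<= eff
    if eff > 0:
        value |= 1 << (eff - 1)
    return value


def tick(tracks):
    """One sampling period: return one PCM value per track."""
    out = []
    for t in tracks:                                    # S101, S111, S112
        if t.flag == 0:                                 # S102
            out.append(0)                               # S103
            continue
        exp, mantissas = t.blocks[t.point]
        bits = mantissa_bits(exp, *t.params)
        eff = exp - (bits - 1)
        out.append(expand_mantissa(mantissas[t.count], bits, eff))  # S104
        if t.point == len(t.blocks) - 1 and t.count == t.last_count:  # S105
            t.flag = 0                                  # S106
            continue
        t.count += 1                                    # S107
        if t.count == BLOCK:                            # S108
            t.count = 0                                 # S109
            t.point += 1                                # S110
    return out
```

Calling `tick` once per sampling period with all tracks yields one output sample per track, with stopped tracks contributing 0.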
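FIG. 4 is a flowchart showing another operation of the data decompression unit 24.
[0046]
The operation shown in the flowchart of FIG. 3 is executed every sampling period and outputs one item of PCM data per track, whereas the operation shown in this flowchart is executed every 16 sampling periods and outputs one block of PCM data per track at a time.
When the data decompression unit 24 starts operating, the variable n, which indicates the track number, is initialized to “0” in step S201, and the process proceeds to step S202, where the value of flag[n] is checked.
[0047]
If flag[n] is found to be “0” in step S202, the process proceeds to step S203, where 16 items of the value “$0000” are sent to the output unit 25 as the PCM data of the n-th track, and then to step S208.
If flag[n] is found to be “1” in step S202, the process proceeds to step S204, where the whole of the one block of compressed data of the audio phrase of the n-th track, starting at the address indicated by point[n], is decompressed and sent to the output unit 25, and then to step S205.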
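[0048]
In step S205 it is determined whether the last block of the audio phrase of the n-th track has been reached. If so, the process proceeds to step S206, where flag[n] is set to “0” so that the n-th track becomes stopped, and then to step S208. If not, the process proceeds to step S207, where point[n] is advanced by one block, and then to step S208.
[0049]
In step S208 the variable n is incremented, and the process proceeds to step S209, where it is determined whether n has reached the number of tracks TRK. If it has, the operation ends; if not, the process returns to step S202 and the above operation is repeated for the next track.
The PCM data created by the data decompression unit 24 through the above operation is sent to the output unit 25.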
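[0050]
Returning to FIG. 2, the description continues. The output unit 25 has, for each track of the data decompression unit 24, two banks of memory that each hold one block of PCM data. PCM data sent from the data decompression unit 24 is stored in one bank while the PCM data stored in the other bank is output one item per sampling period. Each time 16 items of PCM data have been output, the roles of the two banks are swapped and the operation is repeated.
[0051]
The musical tone reproduction device 2 further comprises an amplifier 26 and a speaker 27. The PCM data of the multiple tracks output from the output unit 25 is amplified by the amplifier 26, and the multiple audio phrases are sounded simultaneously from the speaker 27. The output unit 25 may also add the decompressed signals of all tracks together before output. Because decompressed data all has the same format (16 bits in this case), the addition is straightforward.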
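[0052]
The embodiments above were described for monaural audio phrases, but the data compression device, data decompression device, and musical tone reproduction device of the present invention may also handle stereo audio phrases. In that case, for example, data indicating whether each audio phrase is stereo or monaural is stored for each phrase at compression time, and at decompression time that data determines whether decompression uses one track (monaural) or two tracks (stereo).
[0053]
The embodiments above were also described for the case where every audio phrase has the same sampling frequency. Alternatively, several sampling frequencies may be provided: at compression time the optimum sampling frequency is selected for each audio phrase and data indicating it is stored in association with the phrase, and at decompression time each phrase is decompressed at its own sampling frequency.
[0054]
In this case, the PCM data may be stored in the memory 11 of the data compression device at the sampling frequency corresponding to each audio phrase, or all audio phrases may first be stored in the memory 11 at the same sampling frequency and then overwritten with data whose sampling frequency has been converted. In general, converting a sampling frequency by a factor of M/N (M and N both integers) involves the following processes in order.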
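[0055]
(First process)
So-called M-times oversampling is performed, in which M - 1 items of PCM data with the value “0” are inserted between successive items of PCM data.
(Second process)
The PCM data from the first process is passed through a filter that passes frequency components at or below the Nyquist frequency of the lower of the two sampling frequencies: the original one and the converted one, which is M/N times the original.
[0056]
(Third process)
So-called 1/N-times downsampling is performed, in which every N-th item of PCM data is extracted from the PCM data produced by the second process. All other PCM data is discarded.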
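The three processes can be sketched in pure Python as follows. The text does not specify the filter, so the Hamming-windowed sinc FIR, the default of 63 taps, and the function name `resample` are all assumptions made for illustration; a production converter would normally use a polyphase structure instead of filtering the zero-stuffed signal directly.

```python
import math


def resample(pcm, m, n, taps=63):
    """Convert the sampling frequency of `pcm` by a factor of m/n."""
    # First process: insert m - 1 zeros after every sample.
    up = []
    for s in pcm:
        up.append(float(s))
        up.extend([0.0] * (m - 1))

    # Second process: low-pass FIR at the lower of the two Nyquist frequencies.
    fc = 0.5 / max(m, n)                 # cutoff in cycles per oversampled sample
    centre = (taps - 1) / 2
    h = []
    for k in range(taps):
        t = k - centre
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (taps - 1))
        h.append(ideal * window)
    gain = m / sum(h)                    # restore the level lost to zero insertion
    h = [c * gain for c in h]

    half = taps // 2
    filtered = []
    for i in range(len(up)):
        acc = 0.0
        for k in range(taps):
            j = i + half - k             # centred convolution (no net delay)
            if 0 <= j < len(up):
                acc += h[k] * up[j]
        filtered.append(acc)

    # Third process: keep every n-th sample and discard the rest.
    return [round(v) for v in filtered[::n]]


converted = resample([1000] * 100, 3, 2)  # 3/2 times the original rate
print(len(converted))
```

The cutoff is expressed relative to the oversampled rate, so choosing 0.5 / max(m, n) selects whichever of the original and converted Nyquist frequencies is lower.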
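Compressed data obtained by compressing PCM data whose sampling frequency has been converted in this way is, as described above, decompressed at the sampling frequency of each audio phrase and then output. It may be output at the sampling frequency associated with each phrase, or the sampling frequency may be converted again so that all output is unified at a single sampling frequency suited to the capability of the external equipment. Unification to a single sampling frequency can be achieved, for example, as follows. The decompression device is operated at the period of the least common multiple of the sampling frequencies; each track reads and decompresses data at the period of its own phrase's sampling frequency (once every several periods of the least-common-multiple frequency) and outputs “0” at all other times; and filtering is performed at the period of the least-common-multiple frequency. Alternatively, after the sampling frequencies have been converted again to a single unified frequency, the PCM data may be added together and output.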
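[0057]
[Effects of the Invention]
As described above, with the data compression device of the present invention, a single data compression algorithm can generate multiple sets of compressed data having different compression ratios, so all of them can easily be decompressed with a single data decompression algorithm.
[0058]
With the data decompression device of the present invention, multiple sets of compressed data having different compression ratios, generated by the data compression device of the present invention, can easily be decompressed with a single data decompression algorithm.
With the data decompression device of the present invention, multiple sets of compressed data having different compression ratios can easily be decompressed with a single data decompression algorithm, and the multiple audio phrases based on those sets can be reproduced simultaneously.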
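[Brief Description of the Drawings]
[FIG. 1] A block diagram showing one embodiment of the data compression device of the present invention.
[FIG. 2] A block diagram showing a musical tone reproduction device that includes one embodiment of the data decompression device of the present invention.
[FIG. 3] A flowchart showing the operation of the data decompression unit 24.
[FIG. 4] A flowchart showing another operation of the data decompression unit 24.
[Explanation of Reference Signs]
1 Data compression device
2 Musical tone reproduction device
14 Data compression unit
15 Parameter setting unit
23 Parameter acquisition unit
24 Data decompression unit
26 Sound source
27 Amplifier
28 Speaker
[0001]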
BACKGROUND OF THE INVENTION
The present invention relates to a data compression apparatus that performs data compression processing on phrase data composed of a series of audio data, and a data expansion apparatus that performs data expansion processing on compressed phrase data representing an audio phrase. In the present invention, the term “sound” refers to a concept including all sounds in the audible range such as musical sounds and sound effects in addition to human voices.
[0002]
[Prior art]
Conventionally, phrase data representing audio phrases has commonly been stored on storage media and sent and received over communication media. In such cases it is economical to compress the phrase data and to store or transmit the resulting compressed data, because less storage capacity and less communication time are needed. For greater economy, compression with a higher compression ratio is preferable. However, a musical tone reproduced from compressed data generated at a higher compression ratio tends to have lower sound quality, so keeping the quality of the reproduced tone within an acceptable range requires choosing compression with a lower compression ratio. The desired compression ratio is therefore determined by the trade-off between economy and sound quality, and a compression process with that ratio is selected.
[0003]
[Problems to be solved by the invention]
Some audio, such as the sound of a piano performance, requires high sound quality (musical tones), while other audio, such as noise-like sound effects, tolerates low sound quality. The desired compression ratio determined by the trade-off between economy and sound quality therefore varies with the type of audio, and it is desirable to generate compressed data by selecting a compression process whose compression ratio suits that type.
[0004]
However, when multiple types of audio are reproduced from multiple sets of compressed data generated in this way, each representing a different type of audio at a different compression ratio, each set must be decompressed by the decompression process that corresponds to the compression process that generated it. Decompression therefore becomes cumbersome, and simultaneous reproduction becomes difficult.
[0005]
In view of the above, an object of the present invention is to provide a data compression device and a data decompression device with which decompression and simultaneous reproduction are easy even when multiple sounds are reproduced from multiple sets of compressed data having different compression ratios.
[0006]
[Means for Solving the Problems]
A data compression device of the present invention that achieves the above object is a data compression device that applies data compression to phrase data that represents an audio phrase and consists of a sequence of audio data, characterized by comprising:
parameter association means for associating with the audio phrase a plurality of parameters that define a compression ratio; and
data compression means for compressing the phrase data representing the audio phrase, using a data compression algorithm with a variable compression ratio, at the compression ratio determined by the plurality of parameters associated with that audio phrase.
[0007]
In the present invention, a voice phrase is composed of one sound or a series of a plurality of sounds, and phrase data is waveform data representing a sound waveform.
With this data compression device, a single data compression algorithm can generate multiple sets of compressed data having different compression ratios, so all of them can easily be decompressed with a single data decompression algorithm.
[0008]
In the data compression device of the present invention, it is preferable that, when compressing the phrase data, the data compression means does the following for each section obtained by dividing the phrase data into groups of a predetermined number of consecutive audio data: it generates, from the consecutive audio data in the section, an exponent data part that represents a reference data level common to the data levels corresponding to the respective data levels of those audio data; it generates a mantissa data part that expresses, in units of the reference data level, each of the data levels corresponding to the data levels of the consecutive audio data in the section, using a number of bits determined from the parameters associated with the phrase data and from the reference data level; and it replaces the series of audio data in the section with compressed data consisting of the exponent data part and the mantissa data part.
[0009]
Here, the expressions “the data levels corresponding to the respective data levels of the audio data” and “the data levels corresponding to the data levels of the audio data” mean that the levels in question may be the data levels of the audio data themselves, the data levels of difference data between audio data, or anything else from which the data levels of the original audio data can be restored.
[0010]
A data decompression device of the present invention that achieves the above object is a data decompression device that applies data decompression to compressed phrase data obtained by compressing phrase data, which represents an audio phrase and consists of a sequence of audio data, with a data compression algorithm having a variable compression ratio at a compression ratio suited to the audio phrase, characterized by comprising:
parameter acquisition means for acquiring a plurality of parameters that correspond to the compressed phrase data and define its compression ratio; and
data decompression means for decompressing the compressed phrase data on the basis of a data decompression algorithm corresponding to the data compression algorithm and the plurality of parameters acquired by the parameter acquisition means.
[0011]
According to this data decompression device, a plurality of compressed data having different compression rates generated by the data compression device of the present invention can be easily decompressed based on a single data decompression algorithm.
In the data decompression device of the present invention, it is preferable that the compressed phrase data is made up of compressed data consisting of an exponent data part and a mantissa data part, where, for each section obtained by dividing the phrase data (which represents an audio phrase and consists of a sequence of audio data) into groups of a predetermined number of consecutive audio data, the exponent data part represents a reference data level common to the data levels corresponding to the respective data levels of the consecutive audio data in the section, and the mantissa data part expresses, in units of the reference data level, each of the data levels corresponding to the data levels of the consecutive audio data in the section, using a number of bits determined from the plurality of parameters associated with the phrase data and from the reference data level, and
that the decompression means acquires the exponent data part of the compressed data, determines the number of bits of the mantissa part of that compressed data from the exponent data part and the parameters corresponding to the compressed phrase data that contains the compressed data, acquires the mantissa data part according to that number of bits, and decompresses the compressed data.
[0012]
It is also preferable that the data decompression means of the data decompression device of the present invention decompresses a plurality of sets of compressed phrase data simultaneously, on the basis of a data decompression algorithm that corresponds to the data compression algorithm and is common to all of those sets, and of the plurality of parameters acquired by the parameter acquisition means for each of those sets.
[0013]
Furthermore, a data decompression device of the present invention is characterized by comprising data decompression means for simultaneously decompressing a plurality of sets of compressed phrase data, each obtained by compressing phrase data, which represents an audio phrase and consists of a sequence of audio data, with a data compression algorithm having a variable compression ratio at a compression ratio suited to the audio phrase, on the basis of a data decompression algorithm that corresponds to the data compression algorithm and is common to all of those sets, and of parameters that define the compression ratio of each set and correspond to that set.
[0014]
With this data decompression device, multiple sets of compressed data having different compression ratios can easily be decompressed with a single data decompression algorithm, and the multiple audio phrases based on those sets can be reproduced simultaneously.
[0015]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described.
FIG. 1 is a block diagram showing an embodiment of a data compression apparatus of the present invention.
The data compression device 1 comprises a memory 11, an input unit 12, and a block forming unit 13. The memory 11 stores, for a plurality of audio phrases, 16-bit PCM data in two's complement format, which is the audio data referred to in the present invention. The input unit 12 reads the PCM data from the memory 11 and sends it to the block forming unit 13, which forms a block, the section referred to in the present invention, from every 16 items of PCM data.
[0016]
The input unit 12 is not limited to this embodiment: it may read the PCM data from an A/D converter, read it from an external storage device such as a hard disk, or receive it from a communication line.
The data compression device 1 also comprises a data compression unit 14, a parameter setting unit 15, an output unit 16, and a memory 17. Through the parameter setting unit 15, the user freely sets, for each audio phrase, the parameters required for the compression process described later. On the basis of these parameters, the data compression unit 14 applies the data compression process described later to the PCM data sent one block at a time from the block forming unit 13, and sends the resulting compressed data to the output unit 16 one block at a time. The data compression process is described in detail below with reference to Table 1.
[0017]
[Table 1]

[0018]
The second column from the left of Table 1 shows 16 items of PCM data; as noted above, in this embodiment 16 items of PCM data form one block. For each of the 16 items of PCM data, Table 1 also shows the absolute value, the mantissa, and the decompressed data. The PCM data, absolute values, mantissas, and decompressed data are all given in hexadecimal. Values in hexadecimal are marked below with a leading “$”. A set of mutually associated PCM data, absolute value, mantissa, and decompressed data is referred to below as a sample, and the 16 samples in Table 1 are numbered 0 to 15.
[0019]
A procedure for creating compressed data based on one block of PCM data shown in Table 1 will be described below. First, PCM data having the maximum absolute value is searched from these PCM data. In the example of Table 1, this corresponds to PCM data of the 15th sample.
Next, the lowest-order sign bit of the PCM data of sample 15, which has the largest absolute value, is determined. The PCM value “$F8C4” of sample 15 is “1111100011000100” in binary. Because this embodiment uses PCM data in two's complement format, as noted above, this PCM data is negative. The bits are searched in order from the MSB, and the lowest-order sign bit is the bit one position above the position where the first 0 appears (the first 1 if the PCM data is positive). If the LSB is called bit 0 and the bits toward the MSB are called bit 1, bit 2, and so on, the lowest-order sign bit in this example is bit 11. When the PCM value is “$0000” or “$FFFF”, all bits have the same value and the above procedure finds no lowest-order sign bit, so bit 0 is taken as the lowest-order sign bit. The value indicating the position of this lowest-order sign bit is referred to below as the reference exponent. The reference exponent represents the maximum volume of the audio data within one block; in the example of Table 1 it is “11”. Because the PCM data is 16 bits wide, the reference exponent takes a value from 0 to 15, and in this embodiment it is represented with 4 bits.
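The search for the lowest-order sign bit can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation; the function names `reference_exponent` and `block_reference_exponent` are assumptions introduced here, and samples are taken to be signed integers in the 16-bit range.

```python
def reference_exponent(pcm: int) -> int:
    """Return the bit position (0-15) of the lowest-order sign bit of a 16-bit sample."""
    word = pcm & 0xFFFF                 # view the sample as a 16-bit two's complement word
    sign = (word >> 15) & 1             # MSB: 1 for negative, 0 for non-negative
    for pos in range(14, -1, -1):       # search from just below the MSB toward the LSB
        if (word >> pos) & 1 != sign:   # first bit that differs from the sign
            return pos + 1              # the bit one position above it
    return 0                            # $0000 or $FFFF: all bits are equal


def block_reference_exponent(block: list[int]) -> int:
    """Reference exponent of a block, taken from the sample with the largest absolute value."""
    peak = max(block, key=abs)
    return reference_exponent(peak)


sample_15 = 0xF8C4 - 0x10000            # $F8C4 read as a signed 16-bit value
print(reference_exponent(sample_15))    # 11, matching the example in the text
```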
[0020]
From each of the 16 items of PCM data, a mantissa is extracted starting at the bit position indicated by the reference exponent and extending toward the LSB for the number of bits determined by the method described below. The reference exponent and the 16 mantissas form the compressed data for one block. The value indicating the bit position from which the lowest-order bit of the mantissa is extracted is referred to below as the effective exponent. In the example of Table 1 the reference exponent is “11”, so if the mantissa is 7 bits long the effective exponent is “5”.
[0021]
Experiments have shown that, to maintain the sound quality of a musical tone, the mantissa must be long when the tone is loud, but that shortening the mantissa when the tone is quiet does not degrade the sound quality. Therefore, for example, the mantissa is given 8 bits for a loud tone and is shortened to 7 bits, 6 bits, and so on as the volume falls. This achieves a higher compression ratio than a compression process with a fixed mantissa length while maintaining sound quality equal to or better than that of such a process.
[0022]
In this embodiment, as a parameter for calculating the bit length of the mantissa, the maximum bit length max that is the bit length of the mantissa when the reference exponent is “15”, and the bit length of the mantissa as the reference exponent decreases. A step stp indicating the degree of shortening is set, and the bit length bit of the mantissa is determined for each block according to the reference exponent exp according to the following calculation formula.
[0023]
bit = max - (15 - exp) / stp
Here the operation “x / y” yields the integer part of x divided by y.
Depending on the type of musical tone, sound quality may be maintained even if the low-order bits of the PCM data are ignored from the outset and a lower limit is placed on the effective exponent. In this embodiment, therefore, a minimum effective exponent min is set as a parameter. When the effective exponent (exp - (bit - 1)), obtained from the reference exponent exp and the mantissa bit length bit calculated above, is smaller than the minimum effective exponent min, the mantissa bit length bit is determined by the following formula.
[0024]
bit = exp - min + 1
Naturally, the actual mantissa bit length cannot be negative, so when the calculated mantissa bit length bit is negative, the mantissa bit length is set to “0” and the compressed data consists of the exponent alone.
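The bit-length rules above can be combined into one Python function, sketched below. The names `mantissa_bits`, `effective_exponent`, `max_bits`, and `min_eff` are assumptions introduced here (`max` and `min` are renamed to avoid shadowing Python built-ins).

```python
def mantissa_bits(exp: int, max_bits: int, stp: int, min_eff: int) -> int:
    """Mantissa bit length for a block with reference exponent `exp`."""
    bits = max_bits - (15 - exp) // stp   # integer division, as defined for "x / y"
    if exp - (bits - 1) < min_eff:        # effective exponent would fall below the floor
        bits = exp - min_eff + 1
    return max(bits, 0)                   # a negative length means "exponent only"


def effective_exponent(exp: int, bits: int) -> int:
    """Bit position of the mantissa's lowest-order bit."""
    return exp - (bits - 1)


# Settings quoted in the text: max=6, stp=3, min=4 and max=8, stp=4, min=0.
for exp in (15, 11, 5, 4, 3):
    print(exp, mantissa_bits(exp, 6, 3, 4), mantissa_bits(exp, 8, 4, 0))
```

Under these definitions the sketch reproduces the values quoted in the text: 7 bits at reference exponent 11 with max=8, stp=4, min=0, and 2 bits and 1 bit at reference exponents 5 and 4 with max=6, stp=3, min=4.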
Tables 2 and 3 show examples of mantissa bit lengths and effective exponents obtained from the reference exponent based on the maximum bit length max, the step stp, and the minimum effective exponent min.
[0025]
[Table 2]

[0026]
[Table 3]

[0027]
In the example of Table 2, the maximum bit length max is set to “6” and the step stp to “3”, so the bit length of the mantissa is shortened by “1” each time the reference exponent decreases by “3”. Since the minimum effective exponent min is set to “4”, the bit length of the mantissa is “2” when the reference exponent is “5” and “1” when the reference exponent is “4”, so that the effective exponent is maintained at “4”.
[0028]
In the example of Table 3, the maximum bit length max is set to “8” and the step stp to “4”, so the bit length of the mantissa is shortened by “1” each time the reference exponent decreases by “4”. In addition, since the minimum effective exponent min is set to “0”, a mantissa exists even when the reference exponent is “0”, and the sound quality is higher than in the example of Table 2.
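The bit-length rule described above can be sketched in a few lines. This is a minimal illustration rather than the patented implementation: the function name `mantissa_bits` and the argument names `max_bits` and `min_eff` (standing for max and min) are hypothetical, and the reference exponent is assumed to lie in the range 0 to 15. The assertions use only the values stated in the text for the Table 2 and Table 3 settings.

```python
def mantissa_bits(exp: int, max_bits: int, stp: int, min_eff: int) -> int:
    """Mantissa bit length for one block, given its reference exponent exp (0..15)."""
    # bit = max - (15 - exp) / stp, where "/" keeps only the integer part.
    bit = max_bits - (15 - exp) // stp
    # The effective exponent exp - (bit - 1) is the bit position of the mantissa LSB.
    # If it falls below the minimum effective exponent, shorten the mantissa instead.
    if exp - (bit - 1) < min_eff:
        bit = exp - min_eff + 1
    # A negative length is not possible: such a block is stored as an exponent only.
    return max(bit, 0)


# Table 2 settings from the text: max = 6, stp = 3, min = 4.
assert mantissa_bits(5, 6, 3, 4) == 2
assert mantissa_bits(4, 6, 3, 4) == 1
# Table 3 settings from the text: max = 8, stp = 4, min = 0.
assert mantissa_bits(11, 8, 4, 0) == 7
```

Because the same function serves every parameter set, a decompressor needs only the three parameters and the reference exponent of each block to recover the bit length.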
[0029]
The mantissas shown in Table 1 were obtained under the conditions of Table 3. Since the reference exponent is “11” as described above, the mantissa has “7” bits; that is, bits 11 through 5 are extracted as the mantissa. Take the PCM data of the 15th sample as an example. As described above, it is “1111100011000100” in binary notation. Extracting bits 11 through 5 gives “1000110”, which is “$46” in hexadecimal notation, as shown in Table 1.
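The extraction step in this worked example amounts to a shift and a mask. The sketch below assumes the PCM word is held as an unsigned 16-bit integer and that the bit length is at least 1; the function name `extract_mantissa` is hypothetical.

```python
def extract_mantissa(pcm: int, exp: int, bit: int) -> int:
    """Return bits exp .. exp-bit+1 of a 16-bit PCM word as the mantissa."""
    eff = exp - (bit - 1)                 # effective exponent: position of the mantissa LSB
    return (pcm >> eff) & ((1 << bit) - 1)


# 15th sample from the text: reference exponent 11, 7-bit mantissa, bits 11 through 5.
assert extract_mantissa(0b1111100011000100, 11, 7) == 0x46
```

The bits below the effective exponent are simply dropped, which is the truncation that the decompression side later compensates for.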
[0030]
As described above, the compression rate and the sound quality are optimized by adjusting each parameter prior to the compression process. In addition, because changing the parameters produces compressed data with different compression rates from a single data compression algorithm, decompressing the resulting compressed data is easy even though the compression rates differ.
[0031]
In addition to the parameters described above, a parameter ext indicating the width of the exponent range over which the mantissa bit length stays at max may be set. In that case, when the value of 15 - ext is smaller than the reference exponent exp, the bit length bit of the mantissa is the maximum bit length max; when the value of 15 - ext is equal to or greater than the reference exponent exp, the bit length bit of the mantissa is determined by the following formula.
[0032]
bit = max - 1 - (15 - ext - exp) / stp
Here, the operation “x / y” is an operation for obtaining an integer part of a value obtained by dividing x by y, as described above.
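The variant with the ext parameter can be sketched the same way. The function name `mantissa_bits_ext` is hypothetical, and clamping a negative result to 0 is an assumption carried over from the basic rule; the text does not say how this variant combines with the minimum effective exponent, so that check is left out here.

```python
def mantissa_bits_ext(exp: int, max_bits: int, stp: int, ext: int) -> int:
    """Mantissa bit length when the top `ext` exponent values keep the full length."""
    if 15 - ext < exp:
        return max_bits                    # inside the plateau: full mantissa length
    # bit = max - 1 - (15 - ext - exp) / stp, where "/" keeps only the integer part.
    bit = max_bits - 1 - (15 - ext - exp) // stp
    return max(bit, 0)                     # assumption: clamp as in the basic rule
```

With ext = 2, max = 8 and stp = 4, for instance, reference exponents 15 and 14 keep the full 8 bits and the shortening starts from exponent 13.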
When the user designates a phrase to be compressed and starts the compression process via command means not shown in FIG. 1, the output unit 16 shown in FIG. 1 first stores each parameter set by the parameter setting unit 15 in the header area of the memory 17. After compression starts, it sequentially stores the compressed data generated by the data compression unit 14 in the compressed data area of the memory 17. If the last PCM data of the voice phrase is not the last data of its block, all mantissas after it in that last block are set to 0. In this embodiment, the compressed data area is divided into an area for storing reference exponents and an area for storing mantissas. With a memory of 16 bits per word, a reference exponent is 4 bits, so the reference exponents of 4 blocks fit exactly in one word. Likewise, since each block has 16 mantissas, the data length of one block is an integer multiple of 16 bits whatever the mantissa bit length. Dividing the areas in this way therefore makes data management easier.
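The word-alignment argument can be checked with a small packing sketch. The function names and the bit order inside each word (first item in the most significant position) are assumptions made for illustration; the text fixes only the sizes, namely four 4-bit reference exponents per 16-bit word and 16 mantissas per block.

```python
def pack_exponents(exps):
    """Pack 4-bit reference exponents four to a 16-bit word (first exponent in the high nibble)."""
    words = []
    for i in range(0, len(exps), 4):
        group = list(exps[i:i + 4]) + [0, 0, 0, 0]   # pad a short final group with zeros
        word = 0
        for e in group[:4]:
            word = (word << 4) | (e & 0xF)
        words.append(word)
    return words


def pack_mantissas(mantissas, bit):
    """Pack the 16 mantissas of one block, `bit` bits each, into exactly `bit` 16-bit words."""
    assert len(mantissas) == 16
    stream = 0
    for m in mantissas:
        stream = (stream << bit) | (m & ((1 << bit) - 1))
    # 16 mantissas of `bit` bits make 16 * bit bits, i.e. `bit` whole words.
    return [(stream >> (16 * (bit - 1 - i))) & 0xFFFF for i in range(bit)]
```

A block with a 7-bit mantissa therefore occupies 7 words of the mantissa area, and a block with bit length 0 occupies none, which is why a decompressor can step from block to block using the bit length alone.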
[0033]
Note that the memory arrangement is not limited to that of the present embodiment; the parameters, the reference exponents, and the mantissas may of course be stored consecutively in the same area.
When compression of the series of PCM data representing one voice phrase is completed, the output unit 16 stores the following in the header area for that voice phrase in the memory 17: data indicating the address in the reference exponent area at which the reference exponent of the first block of the voice phrase is stored; data indicating the address in the compressed data area at which the first mantissa of the first block is stored; data indicating the address in the compressed data area at which the first mantissa of the last block is stored; and data indicating the relative address, within its block, of the mantissa corresponding to the last PCM data of the voice phrase (that is, its position counted from the head of the block). By setting parameters for each voice phrase and instructing compression to start in this way, compressed data is generated for each voice phrase, and the parameters defining the compression rate used to generate it are associated with that compressed data.
[0034]
In this embodiment, the parameters are stored in the header area of the memory 17 and passed automatically to the data decompression device. Alternatively, the user may note the parameters and enter them into the data decompression device at decompression time.
The output unit 16 is not limited to that of the present embodiment; it may write the compressed data to an external storage device such as a hard disk, or send it out over a communication line.
[0035]
FIG. 2 is a block diagram showing a musical sound reproducing apparatus including an embodiment of the data decompression apparatus of the present invention.
The musical sound reproducing device 2 includes a memory 21, an input unit 22, and a parameter acquisition unit 23. The memory 21 stores the compressed data generated by the data compression device 1 shown in FIG. 1, in the format described above. The input unit 22 starts operating in response to a decompression start command given by the user via command means (not shown). Prior to the decompression process, the input unit 22 reads data such as the parameters corresponding to the voice phrase to be decompressed from the header area of the memory 21 and sends it to the parameter acquisition unit 23. After decompression starts, the input unit 22 reads compressed data from the compressed data area of the memory 21. The parameter acquisition unit 23 extracts the parameters and other items corresponding to the voice phrase being decompressed from the data sent from the input unit 22.
[0036]
The musical sound reproducing device 2 also includes a data decompression unit 24 and an output unit 25; the memory 21 through the output unit 25 correspond to an embodiment of the data decompression device of the present invention. The data decompression unit 24 can decompress a plurality of compressed phrase data, representing a plurality of audio phrases, in parallel by time-division processing. This is equivalent to the data decompression unit 24 having a plurality of tracks, each of which corresponds to one audio phrase to be decompressed and decompresses the compressed phrase data representing that phrase. The following description assumes that such tracks exist. For each track, the user instructs the start and stop of decompression of an arbitrary phrase via command means (not shown). The principle of the decompression performed by each track is described below.
[0037]
Prior to the start of decompression, each track receives from the parameter acquisition unit 23 the parameters corresponding to its audio phrase. After decompression starts, the track calculates the bit length of the mantissa for each block from these parameters and the reference exponent sent from the input unit 22.
When the bit length of the mantissa is “0”, the track generates 16 pieces of PCM data with the value “$0000” as decompressed data and sends them to the output unit 25. Alternatively, when the bit length of the mantissa is “0”, the data decompression unit 24 may generate no decompressed data and instead notify the output unit 25 that the bit length is “0”, and the output unit 25 may generate the decompressed data.
[0038]
When the bit length of the mantissa is positive, 16 mantissas are acquired via the input unit 22 according to that bit length. Each mantissa is stored right-justified in one of sixteen 16-bit registers, and the MSB of each mantissa is copied into all higher-order bits of its register (sign extension). All registers are then shifted left by the effective exponent determined from the parameters and the reference exponent. Finally, if the effective exponent is not “0”, the bit immediately below the mantissa LSB is set to 1 (see the fourth column of Table 1). This last step compensates for the truncation performed during compression. In the example of Table 1, the effective exponent is “5”, and for the 11th sample the PCM data is “$FBFD” and the mantissa is “$5F”. Without the last step, the decompressed data would be “$FBE0” and the truncation error “$001D”. With it, the decompressed data is “$FBF0”, as shown in Table 1, and the truncation error is reduced to “$000D”.
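The register operations described above (sign extension, left shift, and setting the bit below the mantissa LSB) can be reproduced with integer arithmetic. This is a sketch under stated assumptions: the function name `expand_mantissa` is hypothetical, the bit length must be at least 1, and the result is returned as an unsigned 16-bit word. The assertion uses the 11th-sample values given in the text.

```python
def expand_mantissa(mantissa: int, bit: int, eff: int) -> int:
    """Rebuild a 16-bit PCM word from a `bit`-bit mantissa and effective exponent `eff`."""
    value = mantissa & ((1 << bit) - 1)
    if value & (1 << (bit - 1)):      # mantissa MSB set: the sample is negative
        value -= 1 << bit             # sign extension
    value <<= eff                     # shift the mantissa back to its original bit position
    if eff != 0:
        value |= 1 << (eff - 1)       # set the bit just below the mantissa LSB
    return value & 0xFFFF             # keep the 16-bit two's-complement pattern


# 11th sample from the text: mantissa $5F, 7 bits, effective exponent 5.
assert expand_mantissa(0x5F, 7, 5) == 0xFBF0
```

Setting the extra bit places the reconstructed value in the middle of the truncated range rather than at its bottom, which is why the error in the example drops from $001D to $000D.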
[0039]
As a result of performing the decompression process in this way, the compressed data is decompressed, and 16 pieces of PCM data are formed as decompressed data and supplied to the output unit 25. Thereafter, similarly to this, the compressed data of each block is expanded.
The operation of the data decompression unit 24 will be described below with reference to the flowchart.
FIG. 3 is a flowchart showing the operation of the data decompression unit 24.
[0040]
When the user instructs the start of decompression for the nth track, a process not shown sets the flag flag[n], which indicates the processing status of the track, to “1”, meaning that decompression is in progress. The variable point[n], which indicates the address at which the first mantissa of one block of compressed data is stored, is assigned the address at which the first mantissa of the first block of the audio phrase corresponding to the nth track is stored. The variable count[n], which indicates the relative address within the block (the position counted from the head of the block), is set to “0”. When the user instructs the nth track to stop decompression, a process not shown sets flag[n] to “0”, meaning that decompression is stopped.
[0041]
The data decompression unit 24 performs the operation shown in the flowchart once per sampling period.
When the data decompression unit 24 starts the operation, the variable n indicating the track number is initialized to “0” in step S101, and the process proceeds to step S102, where the value of flag[n] is checked.
[0042]
If the value of flag[n] is found to be “0” in step S102, the nth track is stopped. The process proceeds to step S103, where the value “$0000” is sent to the output unit 25 as the PCM data of the nth track, and then to step S111.
If the value of flag[n] is found to be “1” in step S102, the nth track is being decompressed, and the process proceeds to step S104. There, using the mantissa bit length determined from the reference exponent and the parameters, decompressed data is generated from the count[n]-th mantissa, counted from the head of the block, in the block whose first mantissa is stored at the address indicated by point[n], and is sent to the output unit 25. The process then proceeds to step S105, which determines whether the end of the compressed phrase data representing the audio phrase of the nth track has been reached. This is determined by comparing point[n] with the address in the compressed data area at which the first mantissa of the final block of the audio phrase is stored, and comparing count[n] with the relative address, within its block, of the mantissa corresponding to the final PCM data of the audio phrase.
[0043]
If step S105 determines that the end of the compressed phrase data has been reached, the process proceeds to step S106, where flag[n] is set to “0” so that the nth track is marked as stopped, and then to step S111.
If step S105 determines that the end of the compressed phrase data has not been reached, the process proceeds to step S107, where count[n] is incremented, and then to step S108.
[0044]
Step S108 checks whether count[n] has reached “16”, that is, whether one block of data has been output. If count[n] is not “16”, one block has not yet been output, and the process proceeds to step S111 without doing anything. If count[n] is “16”, one block of data has been output. In step S109, count[n] is reset to “0”. In step S110, the bit length of the mantissa is obtained from the reference exponent and the parameters, point[n] is advanced by one block according to that bit length, and the reference exponent of the next block is acquired. The process then proceeds to step S111.
[0045]
In step S111, the value of the variable n is incremented and the process proceeds to step S112, where it is determined whether or not the value of the variable n has reached the number of tracks TRK. If the value of the variable n has reached the number of tracks TRK, the operation is terminated. If not, the operation returns to step S102, and the above operation is repeated for the next track.
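The per-sample loop of steps S101 to S112 can be modeled as follows. This is a simplified sketch, not the device's implementation: the `Track` class and `tick` function are hypothetical names, each block is represented as a list of 16 already-decoded PCM values (so the bit-length calculation of step S110 is abstracted away), and `point` is a block index rather than a memory address.

```python
BLOCK = 16  # PCM samples per block


class Track:
    def __init__(self, blocks, last_count):
        self.blocks = blocks                 # list of blocks, 16 decoded samples each
        self.last_block = len(blocks) - 1    # index of the final block of the phrase
        self.last_count = last_count         # position of the final sample in that block
        self.flag = 0                        # 1 while decompression is in progress
        self.point = 0                       # current block (an index here, not an address)
        self.count = 0                       # position inside the current block

    def start(self):
        self.flag, self.point, self.count = 1, 0, 0


def tick(tracks):
    """Run once per sampling period; returns one PCM value per track."""
    out = []
    for t in tracks:                                   # S101, S111, S112: loop over tracks
        if t.flag == 0:                                # S102: track stopped?
            out.append(0x0000)                         # S103: output silence
            continue
        out.append(t.blocks[t.point][t.count])         # S104: output one decoded sample
        if t.point == t.last_block and t.count == t.last_count:   # S105: end of phrase?
            t.flag = 0                                 # S106: mark the track as stopped
            continue
        t.count += 1                                   # S107
        if t.count == BLOCK:                           # S108: one block finished?
            t.count = 0                                # S109
            t.point += 1                               # S110: advance to the next block
    return out
```

Calling `tick` repeatedly for a list of tracks yields, for each sampling period, one value per track, with stopped tracks contributing “$0000”.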
FIG. 4 is a flowchart showing another operation of the data decompression unit 24.
[0046]
The operation shown in the flowchart of FIG. 3 is executed every sampling period and outputs one piece of PCM data per track. The operation shown in FIG. 4, by contrast, is executed every 16 sampling periods and outputs one block of PCM data at a time.
When the data decompression unit 24 starts the operation, the variable n indicating the track number is initialized to “0” in step S201, and the process proceeds to step S202, where the value of flag[n] is checked.
[0047]
If the value of flag[n] is found to be “0” in step S202, the process proceeds to step S203, where 16 values of “$0000” are sent to the output unit 25 as the PCM data of the nth track, and then to step S208.
If the value of flag[n] is found to be “1” in step S202, the process proceeds to step S204, where all of the compressed data of the one block of the audio phrase of the nth track that starts at the address indicated by point[n] is decompressed and sent to the output unit 25, and then to step S205.
[0048]
Step S205 determines whether the final block of the audio phrase corresponding to the nth track has been reached. If it has, the process proceeds to step S206, where flag[n] is set to “0” so that the nth track is stopped, and then to step S208. If it has not, the process proceeds to step S207, where point[n] is advanced by one block, and then to step S208.
[0049]
In step S208, the value of the variable n is incremented and the process proceeds to step S209, where it is determined whether or not the value of the variable n has reached the number of tracks TRK. If the value of the variable n has reached the number of tracks TRK, the operation is terminated. If not, the operation returns to step S202, and the above operation is repeated for the next track.
The PCM data created by the data decompression unit 24 by the above operation is sent to the output unit 25.
[0050]
Returning to FIG. 2, the description continues. For each track of the data decompression unit 24, the output unit 25 has two memories, each holding one block of PCM data. The PCM data sent from the data decompression unit 24 is stored in one memory while the PCM data stored in the other is output one sample per sampling period. Each time 16 pieces of PCM data have been output, the roles of the two memories are swapped and the operation repeats.
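The two-memory arrangement of the output unit is a double (ping-pong) buffer. The sketch below models it for a single track; the class name `PingPongOutput` and its methods are hypothetical, and both memories are assumed to start out filled with zeros.

```python
class PingPongOutput:
    """Two one-block memories for one track: one is filled while the other is played."""

    def __init__(self, block=16):
        self.block = block
        self.buffers = [[0] * block, [0] * block]
        self.play = 0   # index of the memory currently being output
        self.pos = 0    # next sample to output from that memory

    def write_block(self, samples):
        """Store one decompressed block in the memory that is not being played."""
        samples = list(samples)
        assert len(samples) == self.block
        self.buffers[1 - self.play] = samples

    def next_sample(self):
        """Output one sample per sampling period; swap memories after a full block."""
        s = self.buffers[self.play][self.pos]
        self.pos += 1
        if self.pos == self.block:
            self.pos = 0
            self.play = 1 - self.play
        return s
```

In this model the decompression side has a whole block period in which to deliver the next block, which matches the block-at-a-time operation of FIG. 4.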
[0051]
The musical sound reproducing device 2 is further provided with an amplifier 26 and a speaker 27. The PCM data of the plurality of tracks output from the output unit 25 is amplified by the amplifier 26, and a plurality of audio phrases are sounded simultaneously from the speaker 27. Note that the output unit 25 may add together the signals of all the decompressed tracks and output the sum. In this case, since the decompressed data all share the same format (here, 16 bits), the addition is straightforward.
[0052]
In each of the above embodiments, the audio phrase is monaural. However, in the data compression device, the data decompression device, and the musical sound reproducing device of the present invention, the audio phrase may be stereo. In that case, for example, data indicating whether the audio phrase is stereo or monaural is stored for each audio phrase at compression time. At decompression time, based on that data, a monaural phrase is decompressed on one track and a stereo phrase on two tracks.
[0053]
In each of the above embodiments, the case where the sampling frequency is the same for any audio phrase has been described. However, a plurality of sampling frequencies are prepared, and an optimal sampling frequency is selected for each audio phrase during data compression. Data indicating the sampling frequency may be stored in association with the audio phrase, and the data expansion processing may be performed at the sampling frequency for each audio phrase at the time of data expansion.
[0054]
In this case, when PCM data is stored in the memory 11 of the data compression device, it may be stored at the sampling frequency corresponding to each voice phrase from the start. Alternatively, all voice phrases may first be stored in the memory 11 at the same sampling frequency and then rewritten with data whose sampling frequency has been converted. In general, converting the sampling frequency by a factor of M/N (M and N both integers) involves performing the following processes in order.
[0055]
(First process)
So-called M-times oversampling is performed in which M−1 pieces of PCM data having a value of “0” are inserted between each PCM data.
(Second process)
The PCM data subjected to the first process is passed through a filter that passes frequency components at or below the Nyquist frequency of the lower of two sampling frequencies: the original sampling frequency and the converted sampling frequency (the original multiplied by M/N).
[0056]
(Third process)
So-called 1 / N times downsampling is performed in which PCM data is extracted every Nth piece from the PCM data subjected to the second processing. PCM data other than the extracted PCM data is discarded.
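The three processes can be sketched as a direct, unoptimized conversion routine. The patent does not specify the filter, so the Hamming-windowed sinc FIR used here, its length `taps`, and the function name `resample` are all assumptions chosen for illustration.

```python
import math


def resample(pcm, m, n, taps=63):
    """Convert the sampling frequency by M/N: zero insertion, low-pass filter, decimation."""
    # First process: insert M-1 zero samples after each input sample.
    up = []
    for s in pcm:
        up.append(float(s))
        up.extend([0.0] * (m - 1))

    # Second process: low-pass at the Nyquist frequency of the lower of the two rates.
    # Expressed as a fraction of the oversampled rate, that cutoff is 0.5 / max(M, N).
    cutoff = 0.5 / max(m, n)
    mid = (taps - 1) / 2
    h = []
    for i in range(taps):
        x = i - mid
        sinc = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (taps - 1))  # Hamming window
        h.append(m * sinc * window)   # gain of M makes up for the inserted zeros

    filtered = []
    for k in range(len(up)):
        acc = 0.0
        for i, c in enumerate(h):
            j = k - i
            if 0 <= j < len(up):
                acc += c * up[j]
        filtered.append(acc)

    # Third process: keep every N-th sample and discard the rest.
    return filtered[::n]
```

For example, `resample(samples, 3, 2)` returns about 1.5 times as many samples as it was given; a practical implementation would use a polyphase structure to avoid multiplying by the inserted zeros.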
Compressed data obtained by compressing PCM data whose sampling frequency has been converted in this way is decompressed at the sampling frequency of each audio phrase, as described above, and then output. It may be output at the sampling frequency associated with each voice phrase, or the sampling frequency may be converted again so that all output is unified to a single sampling frequency suited to the capability of the external device. Unification to a single sampling frequency can be done, for example, as follows. The decompression device is operated at a frequency that is the least common multiple of the sampling frequencies. Each track is read and decompressed at the period of its own phrase's sampling frequency (that is, once every several periods of the least-common-multiple frequency), “0” is output at all other timings, and filtering is performed at the least-common-multiple frequency. After the sampling frequencies have been unified in this way, the PCM data of the tracks may be added together and output.
[0057]
[Effect of the invention]
As described above, the data compression device of the present invention can generate a plurality of compressed data with different compression rates using a single data compression algorithm, so these compressed data can easily be decompressed using a single data decompression algorithm.
[0058]
According to the data decompression apparatus of the present invention, it is possible to easily decompress the plurality of compressed data generated by the data compression apparatus of the present invention and having different compression ratios based on a single data decompression algorithm.
Furthermore, according to the data decompression device of the present invention, a plurality of compressed data with different compression rates can easily be decompressed using a single data decompression algorithm, and a plurality of audio phrases can be played back simultaneously based on those compressed data.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an embodiment of a data compression apparatus of the present invention.
FIG. 2 is a block diagram showing a musical sound reproducing apparatus including an embodiment of the data decompression apparatus of the present invention.
FIG. 3 is a flowchart showing the operation of the data decompression unit 24;
FIG. 4 is a flowchart showing another operation of the data decompression unit 24;
[Explanation of symbols]
1 Data compression device
2 Musical sound playback device
14 Data compression unit
15 Parameter setting section
23 Parameter acquisition unit
24 Data decompression section
26 Sound source
27 Amplifier
28 Speaker

Claims

A data compression device that generates compressed phrase data, composed of compressed data consisting of an exponent data part and a mantissa data part, by performing data compression processing on phrase data consisting of a series of voice data representing a voice phrase, the device comprising:
parameter association means for associating with the voice phrase a plurality of parameters that define the correspondence between the exponent data part and the number of bits of the mantissa data part; and
data compression means for compressing, using a data compression algorithm in which the number of bits of the mantissa data part is variable, the phrase data representing the voice phrase with a mantissa data part having the number of bits corresponding to the plurality of parameters associated with the voice phrase,
wherein, in compressing the phrase data, the data compression means, for each section obtained by dividing the phrase data into a predetermined number of consecutive voice data: generates the exponent data part representing the maximum data level among the data levels of the consecutive voice data included in the section; generates a mantissa data part that represents, with the maximum data level as a reference, each of the data levels corresponding to the data levels of the consecutive voice data included in the section, using a number of bits determined, based on the parameters associated with the phrase data and on the maximum data level, so as to be equal or larger as the maximum data level is larger; and replaces the series of voice data included in the section with compressed data consisting of the exponent data part and the mantissa data part.

A data decompression device that performs data decompression processing on compressed phrase data obtained by compressing phrase data consisting of a series of voice data representing a voice phrase,
wherein the compressed phrase data is composed of compressed data generated for each section obtained by dividing the phrase data into a predetermined number of consecutive voice data, each compressed data consisting of an exponent data part representing the maximum data level among the data levels of the consecutive voice data included in the section, and a mantissa data part representing, with the maximum data level as a reference, each of the data levels corresponding to the data levels of the consecutive voice data included in the section, the mantissa data part having a number of bits that is determined, based on the maximum data level and on a plurality of parameters associated with the phrase data that define the correspondence between the exponent data part and the number of bits of the mantissa data part, so as to be equal or larger as the maximum data level is larger, the data decompression device comprising:
parameter acquisition means for acquiring the plurality of parameters corresponding to the compressed phrase data; and
data decompression means for decompressing the compressed phrase data by acquiring the exponent data part of each compressed data, determining the number of bits of the mantissa data part of that compressed data from the exponent data part and from the parameters acquired by the parameter acquisition means for the compressed phrase data containing that compressed data, acquiring the mantissa data part according to that number of bits, and decompressing the compressed data.

The data decompression device according to claim 2, wherein the data decompression means decompresses a plurality of compressed phrase data simultaneously, based on a data decompression algorithm that corresponds to the data compression algorithm and is common to the plurality of compressed phrase data, and on the plurality of parameters, acquired by the parameter acquisition means, that correspond to each of the compressed phrase data.