JP4193240B2

JP4193240B2 - Compressed encoded data decoding apparatus and karaoke apparatus using the same

Info

Publication number: JP4193240B2
Application number: JP26757798A
Authority: JP
Inventors: 茂本間
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 1998-09-22
Filing date: 1998-09-22
Publication date: 2008-12-10
Anticipated expiration: 2018-09-22
Also published as: JP2000099091A

Abstract

PROBLEM TO BE SOLVED: To enable decoding from an arbitrary position in compression-encoded data by first receiving the header and then reproducing data received from an arbitrary frame on the basis of the decode parameters in that header. SOLUTION: Decode parameters and TwinVQ frame data are transmitted from a sequencer 11; when decode parameters are transmitted, they are temporarily stored in a memory 60. Before karaoke music is reproduced, an acoustic data processing part 8 receives the decode parameters, that is, data including a sampling rate and a bit rate, and stores them in the memory 60. In this state, the part 8 decodes the TwinVQ frame data sent from the sequencer 11 on the basis of these parameters. Once the parameters are stored in the memory 60, the part 8 can decode from the frame data corresponding to the new position even when the position of a pointer on a sound control track is later changed by a fast-forward or rewind operation.

Description

【０００１】
【発明の属する技術分野】
この発明は、圧縮符号化された音響データの復号装置及び同装置を用いたカラオケ装置に関し、特に音声データ等の音響データをベクトル量子化法によって圧縮符号化した装置に関する。
【０００２】
【従来の技術】
音声信号等の音響信号を高能率で圧縮符号化する手法として、ベクトル量子化法が提案されている。例えば、ＴｗｉｎＶＱ（ＮＴＴ社によって開発された音響信号の圧縮符号化法：Transform-domain Weighted Interleave Vector Quantization）と称されるベクトル量子化法は、圧縮符号化対象となる信号を一定区間で切り出し、切り出した各情報パターンをインターリーブして複数のターゲットベクトルを作成し、このターゲットベクトルに対してコードブック探索を行って最も近いパターンベクトルのインデックスを伝送する。復号時には、圧縮時に使用したコードブックと同じものを用いて元の信号を復元する。このベクトル量子化法では、コードブック探索によって得られたコードを圧縮符号とするから、音質レベルを高く維持したまま圧縮率を高めることができ、また、信号を一定区間で切り出す固定長フレーム方式を採用しているために符号誤りに強い特徴がある。そこで、ＴｗｉｎＶＱに代表されるベクトル量子化法による圧縮符号化方式は、特に、音楽信号や音声信号などの音響信号の圧縮に使用されている。
【０００３】
なお、コードブックを用いてベクトル量子化する圧縮符号化方式については、例えば、特開平１０−１１２６５７号公報等に示されている。
【０００４】
従来、上記の圧縮符号化方式によって得られるデータは、図９に示すようにヘッダ１とデータ本体２とからなり、ヘッダ１は復号時（デコード時）に必要なサンプルレートやビットレートからなるデコード時に必要なデコードパラメータからなり、データ本体２はビットストリームからなる圧縮符号化データからなっている。
【０００５】
デコード時には、まずヘッダ１を獲得し、この中のデコードパラメータを解釈してからデータ本体２の復号化操作を行う。
【０００６】
【発明が解決しようとする課題】
しかし、従来の復号装置では、ヘッダ１とデータ本体２とを常に一体のものとして認識し、デコード時には、最初に必ずヘッダ１内のデコードラメータを解釈してからデータ本体２のデコードを行うようにしていた。このために、例えば、データ本体２の途中までデコードして一旦停止し、ある時間経過後に再びデータ本体２の任意の部分からデコードしたり、または、データ本体２の途中までデコードした後、他のデータ本体の途中からデコード開始をする、といった操作を行うことができなかった。すなわち、データ途中からのデコードを行うことができないために、圧縮符号化データの応用的な使用ができないという不都合があった。
【０００７】
この発明の目的は、圧縮符号化データの任意の位置からデコードを行うことのできる圧縮符号化データの復号装置を提供することにある。
【０００８】
また、この発明の他の目的は、上記復号装置を採用することによって、ＭＩＤＩが苦手とする音声信号や楽器音信号等の応用的な操作が容易になるカラオケ装置を提供することにある。
【０００９】
【課題を解決するための手段】
請求項１の発明は、コードブックを用いてベクトル量子化された圧縮符号化データをデコードするためのデコードパラメータを含むヘッダと、該圧縮符号化データを一定のフレーム毎に分割した状態で記憶するベクトル量子化データ記憶手段と、該ベクトル量子化データ記憶手段に記憶されている圧縮符号化データを再生するデコード手段とを備え、
前記デコード手段は、
前記ベクトル量子化データ記憶手段に記憶されている圧縮符号化データを再生するときに、該ベクトル量子化データ記憶手段から前記ヘッダに含まれるデコードパラメータを受信して一時的に記憶するデコードパラメータ記憶手段と、前記デコードパラメータ記憶手段に前記デコードパラメータが記憶されている状態で、前記ベクトル量子化データ記憶手段に記憶されている圧縮符号化データをその途中のフレームから受信して該受信したデータを前記デコードパラメータ記憶手段に記憶されているデコードパラメータに基づいて再生する再生部とを備えることを特徴とする。
【００１０】
この発明の圧縮符号化データは、コードブックを用いて圧縮対象となる信号をベクトル量子化することによって得る。コードブックを用いるベクトル量子化手法には、ＴｗｉｎＶＱがある。ＴｗｉｎＶＱは、圧縮対象となる音響信号等を一定区間毎に切り出してインターリーブすることによって複数のターゲットベクトルを作成し、このターゲットベクトルに最も近いパターンベクトルをコードブックから選択してそのコードを伝送するようにしたものである。復号時（デコード時）には、上記と同じコードブックを用いることによってコードに対応するターゲットベクトルを復元しＤＡＣを通すことによって元の信号を得る。ベクトル量子化データ記憶手段は、このようにしてベクトル量子化された圧縮符号化データを記憶するものであって、ＣＤ−ＲＯＭ等の記憶媒体やインターネットあるいはＩＳＤＮなどの通信ラインを介して得られる。デコード手段は、上記ベクトル量子化データを復号する時に、最初にヘッダを受信してデコードパラータを一時的に記憶しておく。その後、ベクトル量子化データの任意のフレームから受信して、一時記憶しているデコードパラメータに基づいてデコード（復号）する。この発明では、ヘッダと圧縮符号化データとを別々に取り扱うと共に、圧縮符号化データを一定のフレーム毎に分割しているから、最初に１度デコードパラメータを受けておくことによって、その後、任意のフレーム位置から圧縮符号化データの複合を可能とするものである。したがって、一旦ヘッダを受信してデコードパラメータを解釈して記憶しておけば、その後、圧縮符号化データの最初のフレームからはもちろん、途中のフレームからも復号化ができ、また、途中フレームから復号した後、任意のフレーム位置にジャンプして復号することも可能である。
【００１１】
なお、デコードパラメータは、サンプルレートとビットレートで構成される。ベクトル量子化手法によっては、これ以外の情報を含むことも可能である。要するに、ヘッダには、圧縮符号化データを復号するために必要な情報がすべて含まれていればよい。
【００１２】
請求項３の発明は、上記圧縮符号化データの復号装置が用いられ、楽音トラックなどのシーケンスデータをシーケンサ部により再生するカラオケ装置であって、
前記ベクトル量子化データ記憶手段は、デコードパラメータを含むヘッダと、ベクトル量子化されてフレーム分割された音響シーケンスデータを記憶し、
前記デコード手段は、カラオケ再生前にヘッダのデコードパラメータを獲得して前記デコードパラメータ記憶手段に記憶し、カラオケ再生時に音響シーケンスデータの中のシーケンサ部で指定された任意のフレームから前記デコードパラメータに基づいて再生することを特徴とする。
この発明は、上記圧縮符号化データの復号装置を用いたカラオケ装置において、圧縮符号化の対象を音声や楽器音などを含む音響のシーケンスデータとしたものである。そして、カラオケ再生前にヘッダのデコードパラメータを獲得しておいて、カラオケ再生時に音響シーケンスデータの中のシーケンサ部で指定された任意のフレームからデコードパラメータに基づいて再生する。
【００１３】
例えば、このカラオケ装置に早送りと巻き戻しを指定する手段をリモコン装置等に設けることによって、早送りや巻き戻しが指定された時、シーケンサ部は早送りや巻き戻しの量に基づいて音響シーケンスデータの中の途中フレームを指定する。この時、事前にデコードパラメータが獲得できているために、指定された途中フレームから直接再生することが可能になる。また、複数の楽曲データによるメドレー演奏を指定する手段を備えることによって、最初にデコードパラメータが獲得されているために、メドレー曲の２曲目以降を再生する時には、その２曲目以降の音響シーケンスデータの中のメドレー部分の途中フレームを直接指定することができる。
【００１４】
【発明の実施の形態】
図１は、この発明の実施形態であるカラオケ装置の概略構成図である。このカラオケ装置は、ハードディスク１２及びＤＶＤ（デジタル・ビデオ・ディスク）４０にカラオケ曲演奏用の楽曲データや背景映像用の動画データを記憶しており、利用者がリモコン装置３を用いて曲番号を入力すると、その曲番号の楽曲データを読み出してカラオケ曲を演奏する。リモコン装置は早送りキー３ａ、巻き戻しキー３ｂ、一時停止キー３ｃを備え、これらのキーをオンすることによって、演奏中のカラオケ曲を早送り、巻き戻し、一時停止させることができる。
【００１５】
このカラオケシステムは、カラオケ装置本体１のほか、コントロールアンプ２、リモコン装置３、ＤＶＤチェンジャ４、スピーカ５、モニタ６、マイク７及び音響データ処理部８で構成されている。
【００１６】
カラオケ装置本体１は、システム全体の動作を制御する制御部１０、カラオケ演奏を実行するシーケンサ１１、楽曲データなど記憶したハードディスク（ＨＤＤ）１２、リモコン装置３が発信する赤外線信号を受信してデコードするリモコン受信部１３、楽曲データに基づいて楽音を発生する音源装置１４、歌詞の文字パターンなど展開するパターン展開部１６及び歌詞や背景映像などの表示を制御する表示制御部１７を備えている。
【００１７】
音響データ処理部８は、ＭＩＤＩデータでは再生できないような音声や、楽器音などの音響信号を処理する。このカラオケ装置では、ハードディスク１２に記憶されている楽曲データに含まれる音響制御トラックで指定されるＴｗｉｎＶＱデータファイルのＴｗｉｎＶＱデータが入力する。音響データ処理部８では、このＴｗｉｎＶＱデータをデコードして、ＤＡ変換した後コントロールアンプ２に出力する。
【００１８】
音源装置１４が発生した楽音及び音響データ処理部８で復号したコーラス音等の音声や楽器音等はコントロールアンプ２に入力する。コントロールアンプ２は、これらの楽音信号、音声信号、楽器音信号及びマイク７からの歌唱音声信号等をミキシングして効果を付与してスピーカ５に出力する。
【００１９】
ＤＶＤチェンジャ４は複数枚のＤＶＤ４０のいずれかを選択して読出・再生する。ＤＶＤにはＭＰＥＧ圧縮された動画背景映像が記憶されている。背景映像には、特定のカラオケ曲に個別に対応する個別背景映像やカラオケ曲の種別（例えばジャンル）毎に設けられた汎用の背景映像などがある。このＤＶＤチェンジャ４及びパターン展開部１６は、表示制御部１７に接続されている。表示制御部１７はＤＶＤチャンジャ４から入力される背景映像の上にパターン展開部１６が展開した文字パターンなどをスーパーインポーズ合成しモニタ６に表示する。
【００２０】
ハードディスク１２には多数のカラオケ曲の楽曲データが記憶されているほか、静止画像データ等も記憶されている。
【００２１】
なお、図１の機能ブロック図で示したカラオケ装置は、ＣＰＵを含むコンピュータシステムで構成され、シーケンサ１１やパターン展開部１６はソフト的に実現される。
【００２２】
図２は、ハードディスク１２やＤＶＤ４０に記憶されている楽曲データの構成を示す図である。このカラオケ装置では、楽曲データは、ヘッダ、トラック群、音響データ部で構成されている。前述のように、音響データは、ＴｗｉｎＶＱデータからなる。ヘッダには、曲番号、タイトルなどの書誌データや、圧縮符号化データであるＴｗｉｎＶＱデータのデコードパラメータが書き込まれている。トラック群は、音源装置１４を制御する楽音トラックのほか、パターン展開部１６が文字パターンに展開する歌詞データが書き込まれた歌詞トラック、フレーズ毎に分割された音響データをどのタイミングで再生するかを制御する音響制御トラック、音源装置１４やコントロールアンプ２のエフェクト（効果）を制御する効果制御トラックなどで構成されている。
【００２３】
楽音トラックは、ノートイベントデータや設定データなどのイベントデータと各イベントデータの読み出しタイミングを示すタイミングデータを時系列に配列して構成されている。タイミングデータは、各イベントデータ間の時間的間隔を示すデュレーションや、曲がスタートしてから各イベントデータが発生するまでの絶対時間などのデータで記述される。シーケンサ１１は、タイミングデータで指示されるタイミングにイベントデータを読み出して音源装置１４に入力する。音源装置１４は、入力されたイベントデータに応じて楽音を発生する。
【００２４】
音響制御トラックは、イベントデータであるＴｗｉｎＶＱデータ番号と各イベントデータの読み出しタイミングを示すタイミングデータを時系列に配列して構成される。シーケンサ１１は、タイミングデータで指定されるタイミングにＴｗｉｎＶＱデータ番号を読み出し、この番号で指定されるＴｗｉｎＶＱデータをファイルから読み出し、このデータ内の該当するデータフレームから音響データ処理部８に出力する。なお、後述のように実際にはシーケンサ１１は、ＴｗｉｎＶＱのフレームを指定し、そのフレームに対応するデータが音響データ処理部８に出力される。
【００２５】
歌詞トラック、効果制御トラックについても上記楽音トラックや音響制御トラックと同様に、イベントデータと各イベントデータの読み出しタイミングを示すタイミングデータを時系列に配列して構成される。
【００２６】
リモコン３は、上述のように早送りキー３ａ、巻き戻しキー３ｂ、一時停止キー３ｃを備えており、このキーが操作されることによって演奏中のカラオケ曲の早送りや巻き戻しが可能になる。すなわち、カラオケ曲が演奏されている時に、早送りキー３ａを操作すると、その操作されている間カラオケ曲の早送りが行われる。カラオケ曲の早送りとは、シーケンサ１１での正方向（時間経過方向）への歩進スピードを速めることである。また、巻き戻しキー３ｂが操作されると、その操作されている間カラオケ曲の巻き戻しが行われる。カラオケ曲の巻き戻しとは、シーケンサ１１での歩進方向を逆方向にすると共にその歩進スピードを通常の歩進スピードよりも速めることである。図３に、シーケンサ１１での歩進動作を示す。ポインタＰはシーケンサ１１で現在処理しているトラック上の位置を示すものであって、通常のカラオケ曲の再生時には一定のスピードで右方向（時間経過方向）に歩進している。今カラオケ曲が再生されている状態で位置ａ１で早送りキー３ａが操作されると、ポインタＰが早送り用に設定された歩進スピードで右方向に進む。ポインタＰが位置ａ２に達した時に早送りキー３ａが離されると、この位置ａ２から通常の歩進スピードでのカラオケ曲の再生が再開される。また、ポインタＰが位置ａ３の位置に達した時に巻き戻しキーが操作されると、巻き戻し用に設定された歩進スピードで左方向（時間経過と逆方向）に進んでいく。ポインタＰが位置ａ４に達した時に巻き戻しキー３ｂが離されると、この時から再び右方向に通常のスピードでポインタＰが歩進していく。すなわち、通常のカラオケ曲の再生が再開される。このように、早送りキー３ａと巻き戻しキー３ｂのいずれかが操作された時には、シーケンサ１１での歩進スピード及び歩進方向を制御することによってポインタＰの位置を先の任意の位置に進めたり元の任意の位置に戻したりすることが自由にできる。
【００２７】
早送りや巻き戻しの時に、各トラックの再生をその時の歩進スピードに合わせて行うことも可能であるが、巻き戻しの場合にはイベントデータの解釈を逆にするなどの制御が必要になってくる。例えば、楽音トラックではノートオンのイベントデータの時にはノートオフを実行し、ノートオフのイベントデータの時にはノートオンを実行することが必要である。音響制御トラックについては、この実施形態では早送り時巻き戻し時共に再生をしない。別途、「キュルキュルキュル」のような擬似音を出すことも可能である。
【００２８】
図４は、ＴｗｉｎＶＱの作成部の機能ブロック図である。このＴｗｉｎＶＱデータは図外のＣＤ−ＲＯＭやＩＳＤＮ通信回線によって楽曲データの一部として送られ、ハードディスク１２等に記憶される。したがって、ＴｗｉｎＶＱデータ自体はカラオケ装置で作成されるものではなく楽曲データの一部としてインプリメントされるものである。図４に示すようにＴｗｉｎＶＱデータの作成においては、対象となる音響信号をデジタル化したものをまずメモリ５０に展開し、制御部５１の制御によって、ベクトル量子化部５２において所定のサンプリングレート及びビットレートに基づいてベクトル量子化された圧縮符号化データ、すなわちＴｗｉｎＶＱフレームデータを生成する。ベクトル量子化にはコードブック５３を用いる。前述のように、ＴｗｉｎＶＱフレームデータを得るためのベクトル量子化は以下のようにして行う。すなわち、メモリ５０に記憶されている音響データを一定区間毎に切り出してインターリーブし、複数のターゲットベクトルを作成すると共に、この各ターゲットベクトルに対し、コードブック５３を参照して最も近いパターンベクトルを選ぶ。その時のコードをＴｗｉｎＶＱフレームデータとして出力する。なお、ＴｗｉｎＶＱフレームデータはフレーム化されたものとなる。また、この実施形態では、フレームの大きさがバイト単位となるように、ビットレート等から決定されるフレーム長に無効ビットを加える（ｐａｄｄｉｎｇ処理）ようにしている。バイト単位でフレーム長を扱えるようにすることで、デコード時に途中フレームから再生開始するときフレーム指定が容易になる。また、デコードのためにこの時に用いたサンプリングレート及びビットレートをデコードパラメータとして出力する。デコードパラメータは楽曲データのヘッダ部に挿入され、ベクトル量子化されたＴｗｉｎＶＱフレームデータは音響データとして用いられる。
【００２９】
図５は、ＴｗｉｎＶＱデータを復号化するための音響データ処理部の機能ブロック図である。シーケンサ１１からはデコードパラメータやＴｗｉｎＶＱフレームデータが送られてきて、デコードパラメータの場合にはメモリ６０に一時的に記憶される。また、ＴｗｉｎＶＱフレームデータの時にはベクトル逆量子化部６１において逆量子化を行うことによって復号化が行われる。すなわち、ＴｗｉｎＶＱデータを作成する時に使用したコードブック５３と同じコードブック６２を用いることによってベクトル逆量子化部６１で元の信号に復号化することができる。６３は、これらの制御を行う部分であって、ＤＡＣ６４は、復号化されたデータをアナログ信号に変換してコントロールアンプ２に出力する信号を作成する。
【００３０】
音響データ処理部８では、シーケンサ１１から、カラオケ曲再生前にデコードパラメータ、すなわちサンプリングレートとビットレートを含むデータ（図２の楽曲データのヘッダ部に含まれている）を受信してメモリ６０に記憶しておく。この状態で、シーケンサ１１から送られてくるＴｗｉｎＶＱフレームデータを該デコードパラメータに基づいて復号化する。メモリ６０に、デコードパラメータが一旦記憶されると、その後、早送りや巻き戻しなどによって、音響制御トラック上のポインタＰの位置が変わっても、その位置に対応するフレームデータから復号化することができる。すなわち、ポインタＰの位置が変わった時に、もう一度ヘッダ部から受信する必要がない。
【００３１】
図６は、ＴｗｉｎＶＱデータの構成を示し、早送り及び巻き戻しが行われた時のデータフレームの再生例を示している。最初にヘッダ部のサンプリングレートとビットレートがメモリ６０に一旦記憶された後、データフレームの１番から２番まで再生され、早送りキーがデータフレームｉ−１の所まで押されると、続いてデータフレームｉの再生が行われ、同フレームｉの再生が終わった時に巻き戻しキー３ｂがデータフレーム２の最初の位置まで操作されると再びデータフレーム２からの再生がスタートする。なお、早送りキーや巻き戻しキーの操作終了位置があるフレームの中途であれば、その次に再生開始となるフレーム番号は当該フレームの次のフレームからとなる。このように、一旦ヘッダのサンプリングレートとビットレートが音響データ処理部８に送られていれば、以後早送りキー３ａや巻き戻しキー３ｂが操作されても、操作終了時点から直ちに音響データの再生が可能となる。なお、早送りや巻戻しのときに、再生開始フレームは、トラックのデュレーションデータとイベントが指定するＴｗｉｎＶＱデータの１フレーム時間に基づいて計算する。すなわち、早送りキーや巻戻しキーの操作された時間分ポインタを移動させるとき、ポインタはシーケンストラック上と１ファイルのＴｗｉｎＶＱデータ上を移動させ、その移動合計がキー操作時間となるように制御する。
【００３２】
図７は、上記カラオケ装置の一部動作を示すフローチャートである。特に、このフローチャートではリモコン３の早送りキー３ａや巻き戻しキー３ｂがカラオケ演奏中に操作された時の音響データの制御について示している。
【００３３】
カラオケ演奏の初期設定を行うタイミングでは（ｎ１）、音源装置１４や表示制御部１７等の初期設定のほか、音響データ処理部８に対してサンプリングレート及びビットレートを含むデコードパラメータを送る。これらのデータは、音響データ処理部８においてメモリ６０に記憶される。
【００３４】
カラオケ演奏が始まると、シーケンサ１１は再生モードとなって（ｎ２）、各トラックのシーケンス処理を行う（ｎ３）。カラオケ演奏中にリモコンキー３によって早送りキー３ａが操作されると（ｎ４）、早送りが停止されるまでポインタＰの歩進速度をＮ倍にして歩進する（ｎ６）。早送りが停止した時にｎ７に進んでその時にポインタが指定しているデータフレームをスタート位置とする。また、巻き戻しキー３ｂが操作されると（ｎ８）、巻き戻しが停止するまでポインタＰのＮ倍の速度で逆歩進が行われ（ｎ１０）、巻き戻しが停止した時にｎ１１に進み、その時のポインタで指定しているデータフレームをスタート位置として設定する。
【００３５】
以上の動作によって、最初の初期設定の段階でデコードパラメータが音響データ処理部８に記憶されるために、早送りキーや巻き戻しキーが操作された時には最初からデータを再送しなくても所望のデータ再生を直ちに行うことができる。なお、以上の実施形態では早送りキーと巻き戻しキーが操作された時についての構成と動作を示したが、カラオケ装置において、複数の楽曲のメドレー演奏が指定された時にも各楽曲の音響データについては途中フレームから再生することができる。例えば、図８に示すように楽曲１の音響データについてはデータフレーム１、２を再生し、楽曲２についてはデータフレーム２、３を、楽曲３についてもデータフレーム２、３を再生する場合、楽曲１のヘッダのサンプリングレート及びビットレートを音響データ処理部のメモリ６０に記憶しておくことによって、楽曲２ではヘッダを読み直さなくても、データフレーム２から直ちに再生をすることができ、また、楽曲３においてもデータフレーム２から直ちに再生を開始することができる。この場合各楽曲の音響データはすべて同じコードブックを用いて且つ同じサンプリングレートとビットレートを用いて量子化されていることが前提である。
【００３６】
【発明の効果】
請求項１の発明によれば、初めにデコードパラメータを受信しているために、任意のフレームから再生することができ、圧縮符号化データの種々の応用的な利用が可能となる。
【００３７】
また、この圧縮符号化データ復号装置をカラオケ装置に採用することよって、音声や楽器音などＭＩＤＩでは処理のできない音響シーケンスデータを途中から再生したりすることが自由にできる。例えば、早送り機能と巻き戻し機能を採用したカラオケ装置では、早送りした位置または巻き戻しした位置から直ちに音響シーケンスデータの再生を行うことができ、また、メドレー演奏機能を採用したカラオケ装置では、２曲目以降のメドレー曲の音響シーケンスデータを途中フレームから直ちに再生することができる。
【図面の簡単な説明】
【図１】この発明の実施形態であるカラオケ装置の機能ブロック図
【図２】楽曲データの構成図
【図３】シーケンサの歩進動作を示す図
【図４】ＴｗｉｎＶＱデータの作成部の機能ブロック図
【図５】音響データ処理部の機能ブロック図
【図６】ＴｗｉｎＶＱデータの構成を示す図
【図７】カラオケ装置の一部動作を示すフローチャート
【図８】メドレー演奏時のＴｗｉｎＶＱデータ処理法を示す図
【図９】従来の圧縮符号化データの構成例
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a decoding apparatus for compression-encoded acoustic data and to a karaoke apparatus using the same, and more particularly to an apparatus for acoustic data, such as voice data, that has been compression-encoded by a vector quantization method.
[0002]
[Prior art]
Vector quantization has been proposed as a technique for compressing and encoding acoustic signals, such as voice signals, with high efficiency. For example, the vector quantization method called TwinVQ (Transform-domain Weighted Interleave Vector Quantization, an acoustic signal compression coding method developed by NTT) cuts the signal to be compressed into fixed intervals, interleaves the resulting information patterns to create a plurality of target vectors, performs a codebook search on each target vector, and transmits the index of the nearest pattern vector. At the time of decoding, the original signal is restored using the same codebook that was used for compression. Because this method uses the codes obtained by the codebook search as the compressed code, the compression ratio can be raised while a high level of sound quality is maintained; and because it adopts a fixed-length frame scheme in which the signal is cut into fixed intervals, it is robust against code errors. For these reasons, compression coding based on vector quantization, typified by TwinVQ, is used in particular for compressing acoustic signals such as music signals and voice signals.
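The codebook search and the codebook lookup described above can be illustrated with a minimal sketch. It is not the TwinVQ algorithm itself: the transform, weighting, and interleaving stages are omitted, and the two-dimensional codebook, the target vectors, and the function names are all hypothetical values chosen for illustration.

```python
def nearest_index(target, codebook):
    """Return the index of the codebook vector closest to target
    (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(target, codebook[i]))


def encode(targets, codebook):
    """Compression side: replace each target vector by a codebook index."""
    return [nearest_index(t, codebook) for t in targets]


def decode(indices, codebook):
    """Decoding side: look each index up in the same codebook."""
    return [codebook[i] for i in indices]


# Hypothetical 4-entry codebook shared by encoder and decoder.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

targets = [(0.9, 0.1), (0.2, 0.8), (0.6, 0.7)]
indices = encode(targets, CODEBOOK)    # only these indices are transmitted
restored = decode(indices, CODEBOOK)   # approximation of the targets
print(indices, restored)
```

Only the indices travel from encoder to decoder, which is why both sides must hold the same codebook.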
[0003]
Note that a compression coding method that performs vector quantization using a codebook is disclosed in, for example, Japanese Patent Laid-Open No. 10-112657.
[0004]
Conventionally, the data obtained by the above compression coding method consists of a header 1 and a data body 2, as shown in FIG. 9. The header 1 consists of the decode parameters required at decoding time, such as the sample rate and the bit rate, and the data body 2 consists of compression-encoded data in the form of a bit stream.
[0005]
At the time of decoding, first, the header 1 is obtained, the decoding parameters in this are interpreted, and then the decoding operation of the data body 2 is performed.
[0006]
[Problems to be solved by the invention]
However, the conventional decoding device always treats the header 1 and the data body 2 as a single unit, and at decoding time it always interprets the decode parameters in the header 1 first and only then decodes the data body 2. As a result, it has not been possible, for example, to decode the data body 2 partway, pause, and after some time resume decoding from an arbitrary part of the data body 2, or to decode the data body 2 partway and then start decoding from the middle of another data body. In other words, because decoding cannot start from the middle of the data, the compression-encoded data cannot be put to more advanced uses.
[0007]
An object of the present invention is to provide a compression-encoded data decoding apparatus capable of decoding from an arbitrary position of compression-encoded data.
[0008]
Another object of the present invention is to provide a karaoke apparatus that, by adopting the above decoding apparatus, makes it easy to perform advanced operations on voice signals, instrument sound signals, and the like, which MIDI handles poorly.
[0009]
[Means for Solving the Problems]
The invention of claim 1 comprises: vector quantized data storage means for storing a header, which includes decode parameters for decoding compression-encoded data vector-quantized using a codebook, and the compression-encoded data in a state divided into fixed frames; and decoding means for reproducing the compression-encoded data stored in the vector quantized data storage means,
wherein the decoding means comprises:
decode parameter storage means for receiving, from the vector quantized data storage means, the decode parameters included in the header and temporarily storing them when the compression-encoded data stored in the vector quantized data storage means is to be reproduced; and a reproduction unit for receiving, in a state where the decode parameters are stored in the decode parameter storage means, the compression-encoded data stored in the vector quantized data storage means from an intermediate frame thereof and reproducing the received data on the basis of the decode parameters stored in the decode parameter storage means.
[0010]
The compression-encoded data of the present invention is obtained by vector-quantizing the signal to be compressed using a codebook. TwinVQ is one vector quantization method that uses a codebook. TwinVQ cuts the acoustic signal or other signal to be compressed into fixed intervals and interleaves them to create a plurality of target vectors, selects from the codebook the pattern vector closest to each target vector, and transmits its code. At the time of decoding, the same codebook is used to restore the target vector corresponding to each code, and the original signal is obtained by passing the result through a DAC. The vector quantized data storage means stores the compression-encoded data vector-quantized in this way; the data is obtained from a storage medium such as a CD-ROM or over a communication line such as the Internet or ISDN. When decoding the vector quantized data, the decoding means first receives the header and temporarily stores the decode parameters. It then receives the vector quantized data starting from an arbitrary frame and decodes it on the basis of the temporarily stored decode parameters. In the present invention, the header and the compression-encoded data are handled separately and the compression-encoded data is divided into fixed frames, so that once the decode parameters have been received at the start, the compression-encoded data can thereafter be decoded from an arbitrary frame position. Therefore, once the header has been received and the decode parameters have been interpreted and stored, decoding can start not only from the first frame of the compression-encoded data but also from an intermediate frame, and after decoding from an intermediate frame it is also possible to jump to an arbitrary frame position and continue decoding.
[0011]
Note that the decode parameter is composed of a sample rate and a bit rate. Depending on the vector quantization method, other information may be included. In short, the header only needs to include all information necessary for decoding the compression-encoded data.
[0012]
The invention of claim 3 is a karaoke apparatus that uses the above decoding apparatus for compression-encoded data and reproduces sequence data such as a musical sound track by means of a sequencer unit, wherein
the vector quantized data storage means stores a header including decode parameters and acoustic sequence data that has been vector-quantized and divided into frames, and
the decoding means acquires the decode parameters in the header before karaoke reproduction and stores them in the decode parameter storage means, and at the time of karaoke reproduction reproduces the acoustic sequence data, on the basis of the decode parameters, from an arbitrary frame designated by the sequencer unit.
According to the present invention, in a karaoke apparatus using the above-described decoding apparatus for compression-encoded data, the object of compression-encoding is acoustic sequence data including voices, instrument sounds, and the like. Then, the header decoding parameters are acquired before karaoke reproduction, and reproduction is performed based on the decoding parameters from an arbitrary frame designated by the sequencer unit in the acoustic sequence data during karaoke reproduction.
[0013]
For example, if means for designating fast-forward and rewind are provided on a remote control device or the like of this karaoke apparatus, then when fast-forward or rewind is designated, the sequencer unit designates an intermediate frame in the acoustic sequence data according to the amount of fast-forwarding or rewinding. Because the decode parameters have been acquired in advance, reproduction can start directly from the designated intermediate frame. Likewise, if means for designating a medley performance of a plurality of pieces of music data are provided, the decode parameters have been acquired at the outset, so when the second and subsequent songs of the medley are reproduced, an intermediate frame in the medley portion of the acoustic sequence data of each such song can be designated directly.
[0014]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 1 is a schematic configuration diagram of a karaoke apparatus according to an embodiment of the present invention. This karaoke apparatus stores music data for karaoke performance and moving image data for background video on a hard disk 12 and DVDs (digital video discs) 40; when the user enters a song number with the remote control device 3, the music data for that song number is read out and the karaoke song is played. The remote control device 3 includes a fast-forward key 3a, a rewind key 3b, and a pause key 3c, and turning on these keys fast-forwards, rewinds, or pauses the karaoke song being played.
[0015]
The karaoke system includes a karaoke device main body 1, a control amplifier 2, a remote control device 3, a DVD changer 4, a speaker 5, a monitor 6, a microphone 7, and an acoustic data processing unit 8.
[0016]
The karaoke apparatus main body 1 includes a control unit 10 that controls the operation of the entire system, a sequencer 11 that executes the karaoke performance, a hard disk (HDD) 12 that stores music data and the like, a remote control receiving unit 13 that receives and decodes the infrared signal transmitted by the remote control device 3, a sound source device 14 that generates musical sounds based on the music data, a pattern development unit 16 that develops character patterns such as lyrics, and a display control unit 17 that controls the display of lyrics, background video, and the like.
[0017]
The acoustic data processing unit 8 processes acoustic signals, such as voices and instrument sounds, that cannot be reproduced from MIDI data. In this karaoke apparatus, it receives the TwinVQ data of the TwinVQ data file designated by the acoustic control track included in the music data stored in the hard disk 12. The acoustic data processing unit 8 decodes this TwinVQ data, performs D/A conversion, and outputs the result to the control amplifier 2.
[0018]
The musical sound generated by the sound source device 14, the voice such as the chorus sound decoded by the acoustic data processing unit 8, the instrument sound, and the like are input to the control amplifier 2. The control amplifier 2 mixes these musical sound signals, sound signals, instrument sound signals, singing sound signals from the microphone 7, etc., gives an effect, and outputs them to the speaker 5.
[0019]
The DVD changer 4 selects and reads / reproduces one of the plurality of DVDs 40. The DVD stores a moving picture background image compressed by MPEG. The background video includes an individual background video individually corresponding to a specific karaoke song, a general-purpose background video provided for each karaoke song type (for example, genre), and the like. The DVD changer 4 and the pattern development unit 16 are connected to the display control unit 17. The display control unit 17 superimposes the character pattern developed by the pattern development unit 16 on the background video input from the DVD changer 4 and displays it on the monitor 6.
[0020]
The hard disk 12 stores music data of a large number of karaoke songs, as well as still image data and the like.
[0021]
The karaoke apparatus shown in the functional block diagram of FIG. 1 is configured by a computer system including a CPU, and the sequencer 11 and the pattern development unit 16 are realized by software.
[0022]
FIG. 2 is a diagram showing the structure of the music data stored in the hard disk 12 or the DVD 40. In this karaoke apparatus, music data is composed of a header, a track group, and an acoustic data section. As described above, the acoustic data consists of TwinVQ data. The header contains bibliographic data such as the song number and title, and the decode parameters of the TwinVQ data, which is compression-encoded data. The track group is composed of a musical sound track that controls the sound source device 14, a lyrics track in which the lyrics data that the pattern development unit 16 develops into character patterns is written, an acoustic control track that controls the timing at which the acoustic data divided into phrases is reproduced, an effect control track that controls the effects of the sound source device 14 and the control amplifier 2, and the like.
[0023]
The musical sound track is composed of event data, such as note event data and setting data, and timing data indicating the read timing of each item of event data, arranged in time series. The timing data is expressed as a duration indicating the time interval between items of event data, as the absolute time from the start of the song until each item of event data occurs, or the like. The sequencer 11 reads out the event data at the timing indicated by the timing data and inputs it to the sound source device 14. The sound source device 14 generates musical sounds according to the input event data.
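As a rough illustration of the two timing representations mentioned above, the following sketch converts duration-style timing data into absolute times and picks out the events that fall due in a given interval. The track contents, the tick unit, and the function names are assumptions made for this example and do not come from the patent.

```python
from itertools import accumulate

# Hypothetical track: (duration since previous event in ticks, event data).
TRACK = [
    (0, "note_on 60"),
    (480, "note_off 60"),
    (0, "note_on 64"),
    (240, "note_off 64"),
]


def absolute_times(track):
    """Convert per-event durations into absolute times from the song start."""
    times = accumulate(delta for delta, _ in track)
    return [(t, event) for t, (_, event) in zip(times, track)]


def events_due(track, start, end):
    """Return the events whose absolute time t satisfies start <= t < end."""
    return [event for t, event in absolute_times(track) if start <= t < end]


timeline = absolute_times(TRACK)
due = events_due(TRACK, 480, 720)
print(timeline)
print(due)
```

A sequencer built this way would call something like `events_due` once per processing step and pass the returned events to the sound source.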
[0024]
The acoustic control track is configured by arranging, in time series, TwinVQ data numbers that are event data and timing data indicating the read timing of each event data. The sequencer 11 reads the TwinVQ data number at the timing specified by the timing data, reads the TwinVQ data specified by this number from the file, and outputs it from the corresponding data frame in this data to the acoustic data processing unit 8. As will be described later, the sequencer 11 actually designates a TwinVQ frame, and data corresponding to the frame is output to the acoustic data processing unit 8.
[0025]
The lyrics track and the effect control track are configured by arranging event data and timing data indicating the read timing of each event data in chronological order, similarly to the musical tone track and the sound control track.
[0026]
As described above, the remote controller 3 includes the fast-forward key 3a, the rewind key 3b, and the pause key 3c, and operating these keys makes it possible to fast-forward or rewind the karaoke song being played. That is, when the fast-forward key 3a is operated while a karaoke song is being played, the song is fast-forwarded for as long as the key is operated. Fast-forwarding a karaoke song means increasing the stepping speed of the sequencer 11 in the forward direction (the direction of elapsed time). When the rewind key 3b is operated, the song is rewound for as long as the key is operated. Rewinding a karaoke song means reversing the stepping direction of the sequencer 11 and making the stepping speed faster than the normal stepping speed. FIG. 3 shows the stepping operation of the sequencer 11. The pointer P indicates the position on the track currently being processed by the sequencer 11, and during normal playback of a karaoke song it advances rightward (in the direction of elapsed time) at a constant speed. If the fast-forward key 3a is operated at position a1 while the karaoke song is being played, the pointer P advances rightward at the stepping speed set for fast-forwarding. When the fast-forward key 3a is released as the pointer P reaches position a2, playback of the karaoke song at the normal stepping speed resumes from position a2. When the rewind key 3b is operated as the pointer P reaches position a3, the pointer P moves leftward (opposite to the direction of elapsed time) at the stepping speed set for rewinding. When the rewind key 3b is released as the pointer P reaches position a4, the pointer P again advances rightward at the normal speed from that point. That is, normal playback of the karaoke song resumes. In this way, when either the fast-forward key 3a or the rewind key 3b is operated, controlling the stepping speed and stepping direction of the sequencer 11 allows the pointer P to be freely advanced to any later position or returned to any earlier position.
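The movement of the pointer P through positions a1 to a4 can be sketched as a small simulation. The mode names, the speed factor of 4, and the time unit are assumptions for illustration; the description only states that the fast-forward and rewind stepping speeds are faster than normal.

```python
PLAY, FAST_FORWARD, REWIND = "play", "ff", "rew"


def step_pointer(position, mode, elapsed, speed_factor=4):
    """Advance or retreat the pointer for `elapsed` time units.

    speed_factor is a hypothetical multiplier for fast-forward and rewind.
    The pointer is never allowed to move before the start of the track.
    """
    if mode == FAST_FORWARD:
        position += speed_factor * elapsed
    elif mode == REWIND:
        position -= speed_factor * elapsed
    else:
        position += elapsed
    return max(0, position)


# Normal play to a1, fast-forward to a2, normal play to a3, rewind to a4.
a1 = step_pointer(0, PLAY, 10)
a2 = step_pointer(a1, FAST_FORWARD, 5)
a3 = step_pointer(a2, PLAY, 2)
a4 = step_pointer(a3, REWIND, 6)
resumed = step_pointer(a4, PLAY, 1)
print(a1, a2, a3, a4, resumed)
```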
[0027]
During fast-forwarding or rewinding, each track can also be reproduced in step with the stepping speed at that time, but rewinding requires additional control such as reversing the interpretation of the event data. For example, in the musical sound track it is necessary to execute note-off for note-on event data and note-on for note-off event data. In this embodiment, the acoustic control track is not reproduced during either fast-forwarding or rewinding. It is also possible to output a separate simulated sound such as "kyuru-kyuru-kyuru".
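The reversed interpretation of note events during rewinding can be illustrated as follows. The event representation and the function names are hypothetical; the point is only that traversing the track backward while swapping note-on and note-off yields a well-formed on/off pair again.

```python
SWAP = {"note_on": "note_off", "note_off": "note_on"}


def interpret(event, rewinding):
    """Return the action to execute for an event; swap on/off when rewinding."""
    kind, note = event
    if rewinding:
        kind = SWAP.get(kind, kind)
    return (kind, note)


def rewind_actions(events):
    """Walk the events backward, as the pointer does during rewinding."""
    return [interpret(e, rewinding=True) for e in reversed(events)]


# Hypothetical segment of a musical sound track.
segment = [("note_on", 60), ("note_off", 60)]
actions = rewind_actions(segment)
print(actions)
```

Walking backward, the note-off is met first and is executed as a note-on, and the original note-on then ends the note.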
[0028]
FIG. 4 is a functional block diagram of the TwinVQ data creation unit. The TwinVQ data is delivered as part of the music data via a CD-ROM or an ISDN communication line (not shown) and stored in the hard disk 12 or the like. The TwinVQ data itself is therefore not created by the karaoke apparatus but is supplied as part of the music data. As shown in FIG. 4, to create TwinVQ data, the digitized target acoustic signal is first loaded into the memory 50, and under the control of the control unit 51 the vector quantization unit 52 generates compression-encoded data vector-quantized on the basis of a predetermined sampling rate and bit rate, that is, TwinVQ frame data. A codebook 53 is used for the vector quantization. As described above, the vector quantization that yields the TwinVQ frame data is performed as follows. The acoustic data stored in the memory 50 is cut into fixed intervals and interleaved to create a plurality of target vectors, and for each target vector the nearest pattern vector is selected by referring to the codebook 53. The resulting code is output as TwinVQ frame data. The TwinVQ frame data is thus organized into frames. In this embodiment, invalid bits are added (padding) to the frame length determined from the bit rate and other parameters so that the frame size is a whole number of bytes. Handling the frame length in bytes makes it easy to specify a frame when playback is to start from an intermediate frame at decoding time. The sampling rate and bit rate used at this time are also output as decode parameters for use in decoding. The decode parameters are inserted into the header of the music data, and the vector-quantized TwinVQ frame data is used as the acoustic data.
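The benefit of padding each frame to a whole number of bytes can be shown with a short calculation. The formula for the unpadded frame length (bit rate multiplied by the frame duration) and all numeric values, including the number of samples per frame, are assumptions for illustration; the description says only that the frame length is determined from the bit rate and other parameters.

```python
def frame_bits(bit_rate, sampling_rate, samples_per_frame):
    """Unpadded frame length in bits, assuming bit_rate * frame duration."""
    return bit_rate * samples_per_frame // sampling_rate


def padded_frame_bytes(bits):
    """Round the frame length up to a whole number of bytes."""
    return (bits + 7) // 8


def frame_offset(index, frame_bytes):
    """Byte offset of a frame (0-based index) within the frame data."""
    return index * frame_bytes


# Hypothetical decode parameters and frame size.
bits = frame_bits(bit_rate=20000, sampling_rate=16000, samples_per_frame=1000)
size = padded_frame_bytes(bits)
padding = size * 8 - bits
offset = frame_offset(3, size)
print(bits, size, padding, offset)
```

Because every frame occupies the same whole number of bytes, locating an intermediate frame reduces to one multiplication rather than a bit-level scan.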
[0029]
FIG. 5 is a functional block diagram of the acoustic data processing unit that decodes TwinVQ data. Decode parameters and TwinVQ frame data are sent from the sequencer 11; decode parameters are temporarily stored in the memory 60, while TwinVQ frame data is decoded by inverse quantization in the vector inverse quantization unit 61. That is, by using a codebook 62 identical to the codebook 53 used when the TwinVQ data was created, the vector inverse quantization unit 61 can decode the data back into the original signal. Reference numeral 63 denotes the unit that controls these operations, and the DAC 64 converts the decoded data into an analog signal to be output to the control amplifier 2.
[0030]
Before a karaoke song is played, the acoustic data processing unit 8 receives from the sequencer 11 the decode parameters, that is, data including the sampling rate and bit rate (contained in the header of the music data in FIG. 2), and stores them in the memory 60. In this state, it decodes the TwinVQ frame data sent from the sequencer 11 on the basis of those decode parameters. Once the decode parameters have been stored in the memory 60, decoding can start from the frame data corresponding to the new position even if the position of the pointer P on the acoustic control track is subsequently changed by fast-forwarding, rewinding, or the like. That is, there is no need to receive the data again from the header when the position of the pointer P changes.
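The behavior described here, storing the decode parameters once and then decoding from whichever frame the sequencer supplies, can be sketched as follows. The class and method names are hypothetical, the codebook holds scalar values for brevity, and inverse quantization is reduced to a plain table lookup, so this is a toy model and not a TwinVQ decoder.

```python
class AcousticDecoder:
    """Toy stand-in for the acoustic data processing unit 8."""

    def __init__(self, codebook):
        self.codebook = codebook  # plays the role of codebook 62
        self.params = None        # plays the role of memory 60

    def receive_params(self, sampling_rate, bit_rate):
        """Store the decode parameters taken from the header."""
        self.params = {"sampling_rate": sampling_rate, "bit_rate": bit_rate}

    def decode_frame(self, frame):
        """Decode one frame of codebook indices; needs stored parameters."""
        if self.params is None:
            raise RuntimeError("decode parameters have not been received")
        return [self.codebook[i] for i in frame]


# Hypothetical codebook and frame data.
CODEBOOK = [0.0, 0.25, 0.5, 0.75]
FRAMES = [[0, 1], [1, 2], [2, 3], [3, 0]]

decoder = AcousticDecoder(CODEBOOK)
decoder.receive_params(sampling_rate=44100, bit_rate=20000)  # sent once
from_middle = [decoder.decode_frame(f) for f in FRAMES[2:]]  # start mid-data
after_rewind = decoder.decode_frame(FRAMES[1])               # jump back
print(from_middle, after_rewind)
```

The header is consulted only in `receive_params`; every later jump reuses the stored values.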
[0031]
FIG. 6 shows the structure of TwinVQ data and an example of data frame reproduction when fast-forward and rewind are performed. First, the sampling rate and bit rate in the header portion are temporarily stored in the memory 60, and then data frames No. 1 and No. 2 are reproduced in order. If the fast-forward key 3a is operated up to data frame i-1, data frame i is reproduced next. If, when reproduction of data frame i ends, the rewind key 3b is operated back to the head of data frame 2, reproduction starts again from data frame 2. If the position at which operation of the fast-forward key or rewind key ends is in the middle of a frame, playback starts from the frame following that frame. As described above, once the sampling rate and bit rate of the header have been sent to the acoustic data processing unit 8, the acoustic data can be reproduced immediately from the position at which the operation ends, even if the fast-forward key 3a or the rewind key 3b is operated thereafter. During fast-forward or rewind, the playback start frame is calculated from the duration data of the track and the one-frame time of the TwinVQ data specified by the event. That is, when the pointer is moved in accordance with the time for which the fast-forward key or rewind key is operated, it is moved both on the sequence track and within the TwinVQ data of the single file, and control is performed so that the total movement corresponds to the key operation time.
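The rule for choosing the playback start frame can be written as a short sketch. It assumes that the pointer position is expressed in integer milliseconds from the head of the TwinVQ frame data and that every frame has the same duration. The function name and the units are hypothetical.

```python
def start_frame(position_ms, frame_time_ms):
    """Frame number (from 1) at which playback resumes after a key operation.

    If the operation ends exactly at the head of a frame, playback starts
    from that frame. If it ends in the middle of a frame, playback starts
    from the following frame.
    """
    if position_ms < 0 or frame_time_ms <= 0:
        raise ValueError("position must be >= 0 and frame time must be > 0")
    index, remainder = divmod(position_ms, frame_time_ms)
    current = index + 1  # frame that contains the position
    return current if remainder == 0 else current + 1


print(start_frame(100, 100))  # head of frame 2
print(start_frame(150, 100))  # middle of frame 2
```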
[0032]
FIG. 7 is a flowchart showing a partial operation of the karaoke apparatus. In particular, this flowchart shows control of acoustic data when the fast-forward key 3a and the rewind key 3b of the remote controller 3 are operated during a karaoke performance.
[0033]
At the timing of initial setting of the karaoke performance (n1), in addition to the initial settings of the sound source device 14 and the display control unit 17, the decoding parameters including the sampling rate and the bit rate are sent to the acoustic data processing unit 8. These data are stored in the memory 60 in the acoustic data processing unit 8.
[0034]
When the karaoke performance starts, the sequencer 11 enters the reproduction mode (n2) and performs the sequence processing of each track (n3). When the fast-forward key 3a of the remote controller 3 is operated during the karaoke performance (n4), the stepping speed of the pointer P is increased N times until fast-forwarding stops (n6). When fast-forwarding stops, the process proceeds to n7, and the data frame designated by the pointer at that time is set as the start position. Likewise, when the rewind key 3b is operated (n8), the pointer P is stepped in reverse at N times the normal speed until rewinding stops (n10). When rewinding stops, the process proceeds to n11, and the data frame designated by the pointer at that time is set as the start position.
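The pointer stepping in this flow can be sketched as a simple simulation. It assumes a fixed processing tick and a pointer position in milliseconds. The tick length, the value of N, the track length, and the function names are all hypothetical, and the step labels of FIG. 7 appear only in the comments.

```python
def step_pointer(pointer_ms, key, tick_ms, n, track_end_ms):
    """Advance the pointer P by one tick according to the key being held."""
    if key == "ff":       # fast-forward key held: N times forward
        delta = n * tick_ms
    elif key == "rew":    # rewind key held: N times in reverse
        delta = -n * tick_ms
    else:                 # normal reproduction
        delta = tick_ms
    # Keep the pointer inside the track.
    return max(0, min(track_end_ms, pointer_ms + delta))


def simulate(keys, tick_ms=10, n=4, track_end_ms=10_000):
    """Return the pointer position after each tick for a sequence of key states."""
    pointer = 0
    positions = []
    for key in keys:
        pointer = step_pointer(pointer, key, tick_ms, n, track_end_ms)
        positions.append(pointer)
    return positions


# 5 ticks of playback, 5 ticks of fast-forward, 2 ticks of rewind.
trace = simulate(["play"] * 5 + ["ff"] * 5 + ["rew"] * 2)
print(trace[-1])
```

The pointer position at the tick where a key is released is what would be mapped to a data frame and set as the start position.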
[0035]
With the above operation, since the decoding parameters are stored in the acoustic data processing unit 8 at the initial setting stage, the desired data can be reproduced immediately when the fast-forward key or the rewind key is operated, without retransmitting the data from the beginning. The above embodiment describes the configuration and operation when the fast-forward key and the rewind key are operated. However, when a medley performance of a plurality of musical pieces is designated in the karaoke apparatus, the acoustic data of each musical piece can likewise be played from an intermediate frame. For example, as shown in FIG. 8, data frames 1 and 2 are reproduced for the acoustic data of music 1, data frames 2 and 3 are reproduced for music 2, and data frames 2 and 3 are reproduced for music 3. By storing the sampling rate and bit rate of the header 1 in the memory 60 of the acoustic data processing unit, music 2 can be reproduced immediately from data frame 2 without re-reading the header. Reproduction of music 3 can also be started immediately from data frame 2. In this case, it is assumed that the acoustic data of each musical piece is quantized using the same code book, the same sampling rate, and the same bit rate.
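The medley case of FIG. 8 can be sketched as follows, with an explicit check of the stated assumption that every piece shares the same decoding parameters. The data layout, the parameter values, and the function name are hypothetical.

```python
def medley_plan(songs):
    """Build the frame playback order for a medley.

    Each song is a dict with 'name', 'params' (sampling rate, bit rate)
    and 'frames' (the frame numbers to reproduce). The parameters of the
    first song are loaded once and reused for all later songs.
    """
    if not songs:
        raise ValueError("a medley needs at least one song")
    shared = songs[0]["params"]
    plan = []
    for song in songs:
        # The stored header is only valid if the parameters really match.
        if song["params"] != shared:
            raise ValueError(f"{song['name']} uses different decoding parameters")
        plan.extend((song["name"], frame) for frame in song["frames"])
    return shared, plan


params = (44100, 16000)  # (sampling rate in Hz, bit rate in bit/s), example values
songs = [
    {"name": "music 1", "params": params, "frames": [1, 2]},
    {"name": "music 2", "params": params, "frames": [2, 3]},
    {"name": "music 3", "params": params, "frames": [2, 3]},
]
shared, plan = medley_plan(songs)
print(plan)
```

The check is included because reusing a stored header for a piece encoded with different parameters would produce incorrect output. The source states the matching parameters as a precondition rather than as something the apparatus verifies.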
[0036]
[Effects of the Invention]
According to the first aspect of the present invention, since the decoding parameters are received first, the compression-encoded data can be reproduced from an arbitrary frame, which allows the compression-encoded data to be used in a variety of applications.
[0037]
Further, by adopting this compression-encoded data decoding apparatus in a karaoke apparatus, acoustic sequence data such as voice and musical instrument sounds that cannot be handled by MIDI can be freely reproduced from an intermediate position. For example, in a karaoke apparatus that provides a fast-forward function and a rewind function, the acoustic sequence data can be reproduced immediately from the fast-forward position or the rewind position, and in a karaoke apparatus that provides a medley performance function, the acoustic sequence data of the second and subsequent medley pieces can be reproduced immediately from an intermediate frame.
[Brief description of the drawings]
FIG. 1 is a functional block diagram of a karaoke apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram of music data.
FIG. 3 is a diagram showing a storage operation of a sequencer.
FIG. 4 is a functional block diagram of the TwinVQ creation unit.
FIG. 5 is a functional block diagram of an acoustic data processing unit.
FIG. 6 is a diagram showing a configuration of TwinVQ data.
FIG. 7 is a flowchart showing a partial operation of the karaoke apparatus.
FIG. 8 is a diagram showing a TwinVQ data processing method during medley performance.
FIG. 9 is a diagram showing a configuration example of conventional compression-encoded data.

Claims

1. A decoding device for compression-encoded data, comprising: vector quantized data storage means for storing a header including a decode parameter for decoding compression-encoded data vector-quantized using a codebook, together with the compression-encoded data in a state of being divided into predetermined frames; and decoding means for reproducing the compression-encoded data stored in the vector quantized data storage means,
wherein the decoding means includes:
decode parameter storage means for receiving the decode parameter included in the header from the vector quantized data storage means and temporarily storing it when reproducing the compression-encoded data stored in the vector quantized data storage means; and a reproduction unit that, in a state where the decode parameter is stored in the decode parameter storage means, receives the compression-encoded data stored in the vector quantized data storage means from an intermediate frame and reproduces the received data based on the decode parameter stored in the decode parameter storage means.

2. The apparatus for decoding compression-encoded data according to claim 1, wherein the decoding parameters are a sampling rate and a bit rate.

3. A karaoke device that uses the decoding device for compression-encoded data according to claim 1 or 2 and reproduces sequence data such as a musical sound track by a sequencer unit,
wherein the vector quantized data storage means stores a header including decoding parameters, and acoustic sequence data that has been vector-quantized and divided into frames, and
the decoding means acquires the decoding parameters of the header before karaoke reproduction and stores them in the decode parameter storage means, and, at the time of karaoke reproduction, reproduces the acoustic sequence data, based on the decoding parameters, from an intermediate frame designated by the sequencer unit.

4. The karaoke apparatus according to claim 3, further comprising means for designating fast forward and rewind, wherein the sequencer unit designates a halfway frame in the acoustic sequence data based on an amount of fast forward and rewind.

5. The karaoke apparatus according to claim 3, further comprising means for designating a medley performance based on a plurality of music data, wherein the sequencer unit designates an intermediate frame in the second and subsequent sound sequence data during the medley performance.