JP4294179B2

JP4294179B2 - Waveform playback device

Info

Publication number: JP4294179B2
Application number: JP28789899A
Authority: JP
Inventors: 忠男菊本; 敬之富沢; 智日下部
Original assignee: Roland Corp
Current assignee: Roland Corp
Priority date: 1999-10-08
Filing date: 1999-10-08
Publication date: 2009-07-08
Anticipated expiration: 2019-10-08
Also published as: JP2001109477A

Description

【０００１】
【発明の属する技術分野】
本発明は、いわゆる位相ボコーダを使用して時間軸の圧縮伸長を行う波形再生装置に関する。
【０００２】
【従来の技術】
一般に、位相ボコーダでは、オーディオ波形信号を、この信号の概略基本周期を帯域幅に持つ複数の周波数帯域に分割し、各帯域の信号をそれぞれの帯域の中心の複素周波数で乗算して、振幅値と瞬間周波数とに分析し、これをメモリに記憶させる。各帯域ごとに、分析された瞬間周波数と、対応する帯域の中心周波数との和の周波数で発振した正弦または余弦発振波を、分析された振幅値で振幅変調する。このようにして各帯域ごとに発生した信号を全部混合することによって元のオーディオ波形信号を再現する。
【０００３】
このような位相ボコーダを使用して、元のオーディオ波形信号の時間軸を圧縮及び伸長することが提案されている。例えば、上記振幅値情報と瞬間周波数情報を時間軸上で圧縮または伸長し、この圧縮または伸長された振幅値情報と瞬間周波数情報とを用いて、元のオーディオ波形信号を再合成することによって、元のオーディオ波形信号を時間軸上で圧縮または伸長することが提案されている。
【０００４】
図１５には、公知の位相ボコーダの一例のブロック図が示されている。また、図１６には、図１５の位相ボコーダにおけるバンドｋ（図１６の例では、ｋは０〜９９の整数である）に関する分析部（バンドｋ分析部）の詳細な構成が示されている。以下、図１５及び図１６を参照しながら、位相ボコーダについて詳細に説明する。
【０００５】
位相ボコーダは、音声信号、即ち、波形を当該音声信号の概略基本周波数帯を帯域幅に持つ複数の周波数帯域（Ｂａｎｄ；バンド）に分割する（図１５のボコーダでは、図１７に示すようにバンド０〜バンド９９の１００の周波数帯域に分割している）。分割された各周波数帯域の音声信号は、各周波数帯域の分析部に供給され、各分割周波数帯域の中心の複素周波数で乗算され、振幅値と瞬間周波数に分析展開される。
【０００６】
図１６に示す分析部においてｗ（ｎ）は、分析フィルタのインパルス応答である。各分析部の作用は、周知のｗ（ｎ）の窓で切り出す短区間フーリエ変換と同等のものである。
【０００７】
各分析部で得られた振幅値と瞬間周波数とはメモリに記憶される。
【０００８】
メモリに記憶された各周波数帯域の音声信号は、合成部（図１５に示す時間周波数変換処理部、加算器及び余弦発振器）において合成される。各合成部では、分析された振幅値と瞬間周波数とによって、当該分析された各周波数帯域の中心周波数の正弦波を変調し、当該各周波数帯域の音声信号を生成し、生成した各周波数帯域の音声信号を混合すると、元の音声信号を復元できる。
【０００９】
ここで、音声信号の再生時間を圧縮伸張する場合には、時間周波数変換処理部において、振幅値の補間値と、瞬間周波数の補間値とを求める時間周波数変換処理が行われる。
【００１０】
図１８（ａ）には、バンドｋに関する時間周波数変換処理を実行するためのバンドｋの時間周波数変換処理部の構成が示されている。この図１８（ａ）を参照しながら、音声信号の再生時間を圧縮伸張する場合について説明する。
【００１１】
先ず、音声信号の再生時間を伸張する場合には、時間周波数変換処理部において、各サンプル点での振幅値を補間して、振幅値のエンベロープを時間伸縮情報に基づいて引き延ばし、また、瞬間周波数も時間伸縮情報に基づいてサンプル点の補間値を求める（図１８（ｂ）参照）。このように補間によって得られた振幅値と瞬間周波数とから、上述したのと同様に、分割された各周波数帯域の音声信号を求め、混合する。
【００１２】
一方、音声信号の再生時間を圧縮する場合には、振幅値と瞬間周波数とを時間伸縮情報に基づく補間によって間引いて、エンベロープを縮める（図１８（ｃ）参照）。このように補間によって得られた振幅値と瞬間周波数とから、上述したのと同様に、分割された各周波数帯域の音声信号を求め、混合する。
【００１３】
なお、音声信号のピッチを変化させる場合には、分割された各周波数帯域の中心周波数と瞬間周波数との和を周波数変換情報（変化比率）で乗算し、上述した補間演算を実行すればよい。
【００１４】
上述した処理は、公知の手法により実行されるものであるので、フローチャート並びにその小差な説明は省略する。
【００１５】
このような圧縮及び伸長では、オーディオ波形信号全体を時間軸上で一様に圧縮または伸長している。従って、元のオーディオ波形信号が、例えば楽音信号であるとすると、それのアタック部も圧縮伸長されることになる。この場合、楽音としては非常に不自然なものとなることがあった。
【００１６】
これを防止するために、各帯域の振幅値情報と瞬間周波数情報自体の圧縮伸長は行わずに、通常通りに振幅値情報と瞬間周波数情報に基づいて波形再生をおこない、時間軸圧縮する場合には、オーディオ波形信号が音声の場合、或る音節に対して、設定された圧縮時間が経過した段階で、次の音節の波形再合成に移行し、伸長する場合には、各帯域ごとに振幅値情報と瞬間周波数情報との繰り返し期間を予め定めておき、振幅値情報と瞬間周波数情報との或る音節の繰り返し期間の読み出しが終了した後、その繰り返し期間の振幅値と瞬間周波数情報とを繰り返し使用して、波形再合成を行い、設定された伸長時間が経過するまで或る音節の波形再合成を行うことが考えられる。
【００１７】
【発明が解決しようとする課題】
しかし、このような繰り返し再生を行う場合、各帯域での繰り返し期間の周期や、繰り返し期間の位置が一致するとは限らず、次の音節に進むタイミングが各帯域において一致せず、再合成音が不自然になる。また、瞬間周波数情報を使用する位相ボコーダの構成上、圧縮も伸張も行わずに元のオーディオ波形信号を再合成しようとしても、位相情報を有していないので、元のオーディオ信号を再合成することができなかった。特に立ち上がりが急なオーディオ信号では、その違いが明確に現れることがあった。
【００１８】
本発明は、位相ボコーダを利用してオーディオ波形信号の時間軸圧縮伸長を行う場合で、かつ各帯域の波形データの再生位置に誤差が生じるような再生を行う場合に、波形データの所定位置での誤差を補正するようにしたことや、圧縮も伸張も行わない場合に、元のオーディオ波形信号を再合成できるようにしたことによって、合成波形の音質を改善する波形再生装置を提供することを目的とする。
【００１９】
【課題を解決するための手段】
本発明の波形再生装置は、複数の区間からなる元のオーディオ波形信号の前記区間の境界となる時間軸上の位置を示す１つ以上のマーク情報と、前記元のオーディオ波形信号を複数の周波数帯域に分割し、これら分割帯域ごとの波形信号の位相情報及び振幅値情報とを、記憶する波形データ記憶手段を有している。各帯域の位相情報及び振幅値情報は、元のオーディオ波形信号を適切なサンプリング周波数でサンプリングして得た各サンプリング値ごとに取得されている。即ち、元のオーディオ波形信号の時間軸上の位置に対応して取得されている。前記元のオーディオ波形信号の時間軸上の位置、例えば上記の各サンプリング値の位置を表わし、所望の速度で時間的に変化する第１の再生位置情報を発生する第１の再生位置情報発生手段が設けられている。この速度は、時間圧縮する場合には元のオーディオ波形信号を時間軸の圧縮伸長を行わない場合よりも速くなり、時間伸長する場合には遅くなる。前記各帯域の位相情報及び振幅値情報の時間軸上の位置を表わし、前記所望の速度と異なる速度で時間的に変化する第２の再生位置情報を発生する第２の再生位置情報発生手段も設けられている。第２の再生位置情報の速度は、一定の速度とすることができ、例えば、元のオーディオ波形信号の時間軸の圧縮伸長を行わない場合の速度とすることができる。第２の再生位置情報に従って、前記波形データ記憶手段から前記位相情報及び振幅値情報を読み出し、その読み出された位相情報及び振幅値情報に従って再生オーディオ波形信号を合成する波形信号合成手段が設けられている。第２の再生位置情報は、各帯域毎にそれぞれ独立して設けるようにしている。前記第１の再生位置情報が表わす位置が、前記波形データ記憶手段の前記マーク情報の表わす位置に達する前に、前記第２の再生位置情報が前記ループ区間の終端に達した分割帯域についてそのループ区間を繰り返し読み出すように前記第２の再生位置情報を制御し、前記第１の再生位置情報が表わす位置が、前記波形データ記憶手段の前記マーク情報の表わす位置に達したとき、このマーク情報の表わす位置に対応する第２の再生位置情報に変更するように前記第２の再生位置情報を制御する制御手段が、設けられている。
【００２０】
また、波形信号合成手段は、前記波形データ記憶手段から読み出された位相情報を周波数情報に変換する変換手段と、該変換された周波数情報を入力して、該周波数情報に対応した周波数の周期信号を発生すると共に、前記読み出された位相情報に従って発生される周期信号の位相が、前記第２の再生位置情報の変更に対応して変更制御される周期信号発生手段と、該周期信号発生手段から発生される周期信号の振幅を前記読み出された振幅情報に対応して制御する振幅制御手段とを、具備するものとできる。
【００２１】
本発明の波形再生装置では、第１の再生位置情報は、時間軸の圧縮を行う場合には、元の波形信号をそのまま再生する場合よりも速い速度で変化し、伸長を行う場合には、遅い速度で変化する。一方、第２の再生位置情報は、例えば第１の再生位置情報と異なる速度で各帯域の振幅値情報及び位相情報を読み出しているので、第１の再生位置情報が指定している時間軸上の位置と、第２の再生位置情報が指定している振幅値情報及び位相情報に対応する元のオーディオ波形信号の時間軸上の位置とには、ずれ（偏差）が生じる。そして、第１の再生位置情報がマーク情報が表わす位置に到達したとき、即ち、所定の時間軸圧縮または伸長が行われた時点で、第１の再生位置情報と第２の再生位置情報との偏差が零になるように、制御手段が第２の再生位置情報を制御しているので、第２の再生位置情報は、マーク情報に対応する各帯域の位相情報と振幅値情報とに対応する位置に修正され、以後の波形再生は、マーク情報に対応する位置以降の各帯域の位相情報と振幅値情報とに基づいて行われる。
【００２２】
なお、時間軸の伸長を行う場合、第１の再生位置情報が、まだマーク情報の位置に到達する前に、第２の位置情報によって、マーク情報の位置に対応する位相情報及び振幅値情報まで読み出してしまうことがある。そこで、マーク情報の位置に対応する位相情報及び振幅値情報の記憶されている位置よりも前に第２の再生位置情報に対して繰り返し期間を定めておき、第１の再生位置情報がマーク情報の位置に到達する前には、この繰り返し期間の位相情報と振幅値情報とを繰り返し読み出すことを行うことができる。この場合、第１の再生位置情報がマーク情報の位置に到達すると、即座に第２の再生位置情報がマーク情報に対応する位置に修正される。また、時間軸の圧縮を行う場合、第１の再生位置情報がマーク情報の位置に到達したとき、第２の再生位置情報がマーク情報の位置に対応する位相情報及び振幅値情報の記憶位置を指定していないことがある。この場合にも、直ちに第２の再生位置情報は、マーク情報の位置に対応する位相情報と振幅値情報の記憶位置に修正される。
【００２３】
【発明の実施の形態】
本発明の１実施の形態の波形再生装置は、図１に示すように、波形メモリ２、ＤＳＰ４とを有し、これらが位相ボコーダとして機能する。この位相ボコーダは、波形メモリ２に記憶されている波形データに基づいて再生オーディオ波形信号を合成して、Ｄ／Ａ変換器５に供給する。ＤＳＰ４は、ＲＯＭ８に記憶されたプログラムに従って動作する。このプログラムは、ＣＰＵ６を介してＤＳＰ４に転送される。
【００２４】
ＣＰＵ６は、ＲＯＭ８に記憶されているプログラムに従って、操作子１０の操作状態を検出し、検出結果に従ってＤＳＰ４を制御する。検出した操作子１０の操作状態や、操作子１０の操作状態に基づくＤＳＰ４の制御状態を、表示装置１２にＣＰＵ６は表示する。操作子１０には、再生モードを設定する再生モードスイッチ、圧縮または伸長とその程度を表すパラメータを設定する圧縮伸長操作子、複数の鍵を備えた鍵盤等が、設けられている。また、ＲＡＭ１４は、ＣＰＵ６のワーキングメモリとして使用され、後述する各種レジスタやフラグが設定される。
【００２５】
波形メモリ２には、前述の位相ボコーダとは異なり、振幅値情報と位相情報とが、記憶されている。これら情報は、図１１及び、図１１に示した各分析部の詳細を示した図１２から明らかなように、或る波形データ信号、例えば楽音の１フレーズを所定のサンプリング周波数でサンプリングした各サンプリング値ｘ（ｎ）を、この波形データ信号の概略基本周期を帯域幅に持つ複数、例えばｍ個の周波数帯域に分割し、各帯域の信号をそれぞれの帯域の中心の複素周波数で乗算して、振幅情報と位相情報に分析展開したものである。
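[0026]
As shown in FIG. 2, the DSP 4 functions as waveform signal synthesizing means comprising, for each band, conversion means (for example, time-frequency conversion processing units 4a-1 to 4a-m), periodic signal generating means (for example, cosine oscillators 4b-1 to 4b-m), and amplitude control means (for example, multipliers 4c-1 to 4c-m). The DSP 4 also functions as an adder 4d that sums the outputs of the multipliers 4c-1 to 4c-m and as a multiplier 4e that multiplies the output of the adder 4d by a gate signal. The adder 4d serves as a mixer and the multiplier 4e as a gate.
[0027]
Each of the time-frequency conversion processing units 4a-1 to 4a-m reads the amplitude value information and phase information from the waveform memory 2 according to the time expansion/contraction information addr of its band, which is supplied from control means described later (for example, a control unit 4f). The phase information is converted into instantaneous frequency information by differentiating means, the instantaneous frequency and the amplitude value are each interpolated, and the corresponding cosine oscillator 4b-1 to 4b-m is made to oscillate at the sum of the interpolated instantaneous frequency and the center frequency ωk of the band. When the pitch is to be changed, this sum is multiplied by a conversion ratio, which is the frequency conversion information. In addition, to set the phase of the cosine oscillators 4b-1 to 4b-m, the phase information read from the waveform memory 2 is supplied to the corresponding oscillator, and the phase of each oscillator is reset to the supplied phase information by a phase reset signal from the control unit 4f. The time-frequency conversion processing units 4a-1 to 4a-m can be realized with the configuration shown in FIG. 13.
[0028]
In FIG. 13, 100 denotes reading means for reading the amplitude information and phase information from the waveform memory 2 according to the time expansion/contraction information addr; 102 denotes differentiating means for differentiating the read phase information; 104 denotes interpolating means for interpolating the instantaneous frequency information obtained from the differentiating means according to addr; 106 denotes interpolating means for interpolating the read amplitude value information according to addr; 108 denotes adding means for adding the interpolated instantaneous frequency information and the center frequency ωk; and 110 denotes multiplying means for multiplying the output of the adding means 108 by the frequency conversion information.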
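The signal path of FIG. 13 (read 100, differentiate 102, interpolate 104 and 106, add 108, multiply 110) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name read_band, the use of linear interpolation, phase unwrapping before differentiation, and frequencies expressed in radians per sample are all choices made here, not details given in the patent.

```python
import numpy as np

def read_band(amp, phase, addr, omega_k, ratio=1.0):
    """Return (amplitude, oscillator frequency) for one band at a fractional address.

    amp, phase : stored amplitude and phase values of the band, one per sample
    addr       : time expansion/contraction information (may be fractional)
    omega_k    : centre frequency of the band in rad/sample (assumed unit)
    ratio      : frequency conversion information (1.0 = no pitch change)
    """
    amp = np.asarray(amp, dtype=float)
    phase = np.asarray(phase, dtype=float)
    n = np.arange(len(amp))
    # 102: differentiate the phase; the first difference of the unwrapped
    # phase approximates the instantaneous frequency deviation.
    inst = np.diff(np.unwrap(phase), prepend=phase[0])
    # 104 / 106: interpolate frequency and amplitude at the requested address.
    f = np.interp(addr, n, inst)
    a = np.interp(addr, n, amp)
    # 108: add the band centre frequency; 110: apply the pitch ratio.
    return a, (f + omega_k) * ratio
```

For example, read_band([0, 1, 2, 3], [0.0, 0.1, 0.2, 0.3], 1.5, 1.0, ratio=2.0) yields an amplitude of 1.5 and a frequency of about 2.2 rad/sample.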
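[0029]
The cosine waves generated by the cosine oscillators 4b-1 to 4b-m in this way are amplitude-modulated in the corresponding multipliers 4c-1 to 4c-m by the amplitude information from the corresponding time-frequency conversion processing units 4a-1 to 4a-m. The original audio waveform signal can be reproduced by summing the signals thus generated for each band in the adder 4d. The gate signal that controls the open/closed state of the gate 4e is supplied from a gate signal generator 4g included in the control unit 4f. The gate signal takes a value between 0 and 1, for example; at a rise it goes from 0 to 1 immediately, whereas at a fall it decreases gradually from 1 toward 0.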
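The asymmetric behaviour of the gate signal (instant rise, gradual fall) might be modelled as below. The patent does not state the shape or duration of the fall, so the linear decrement fall_step and the class name Gate are assumptions made purely for illustration.

```python
class Gate:
    """Toy model of gate signal generator 4g: jumps to 1 on rise, ramps to 0 on fall."""

    def __init__(self, fall_step=0.25):
        self.value = 0.0
        self.falling = False
        self.fall_step = fall_step  # assumed linear decrement per clock tick

    def rise(self):
        # A rise takes effect immediately: 0 -> 1 in one step.
        self.value = 1.0
        self.falling = False

    def fall(self):
        # A fall only starts the ramp; the value decreases on later ticks.
        self.falling = True

    def tick(self):
        # Called once per sample clock; returns the current gate value.
        if self.falling:
            self.value = max(0.0, self.value - self.fall_step)
        return self.value
```

Multiplying the mixed band output by tick() on every sample then gives the behaviour attributed to multiplier 4e.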
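[0030]
The waveform memory 2 stores various parameters in addition to the amplitude value information and phase information described above. As shown in FIG. 4(a), the waveform memory 2 has a waveform information area, a band waveform information area, and a band waveform data area.
[0031]
As shown in FIG. 4(b), the waveform information area stores WaveStart, WaveEnd, Mark(1), Mark(2), Mark(3), and Mark(4). WaveStart is the address that would be the start address if the original audio waveform signal shown in the upper part of FIG. 3 were sampled at a predetermined frequency and stored in the waveform memory 2 (it is not actually stored there), and WaveEnd is likewise the address that would be the end address.
[0032]
Mark(1) to Mark(4) are, under the same assumption that the sampled values of the original audio waveform signal were stored, the addresses of the boundaries between sections: for a musical tone, sections such as the attack, steady-state, decay, and silent portions; for a speech signal, sections such as individual syllables. Mark(1) to Mark(4) correspond to the mark information. In this embodiment the original audio waveform signal consists of three sections, so there are four pieces of mark information; in practice the number of marks increases or decreases with the number of sections.
[0033]
As shown in FIG. 4(c), the band waveform data area is provided for each of m frequency bands whose bandwidth is the approximate fundamental period of the original audio waveform signal, and, as shown in FIG. 4(d), the amplitude value information and phase information described above are stored for each band. As shown in FIG. 4(e), these are stored for every sampled value of the original audio waveform signal.
[0034]
The band waveform information area is likewise provided for the m bands, as shown in FIG. 4(f). For each band it stores, as shown in FIG. 4(g), wave_start, wave_end, mark(1), altS(1), altE(1), mark(2), altS(2), altE(2), mark(3), altS(3), altE(3), and mark(4).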
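[0035]
wave_start is the start address, and wave_end the end address, of the addresses at which the amplitude value information and frequency information of each band are stored for each sampled value. mark(1) to mark(4) correspond to the section boundaries Mark(1) to Mark(4) and are the addresses at which the amplitude value information and frequency information of the sampled values at those positions are stored.
[0036]
altS(1) to altS(3) are the addresses at which the amplitude value information and phase information at the start point of the loop period in each section are stored, and altE(1) to altE(3) are the corresponding addresses for the end point of the loop period. The loop period is used when the time axis is expanded, as described later: if expansion has not yet finished even after the amplitude value information and phase information have been read up to the final loop address altE(1) to altE(3) and the waveform has been synthesized, synthesis continues using the amplitude value information and phase information of the loop period.
[0037]
Because the number of sections is three in this embodiment, there are four mark values, three altS values, and three altE values; if the number of sections differs from three, these counts change accordingly. As is clear from FIG. 3, the position and length (period) of the loop period differ from band to band.
[0038]
Next, the general operation of the waveform reproducing apparatus is described with reference to FIG. 3. In this embodiment the band waveform data area of the waveform memory 2 stores amplitude information and phase information, but FIG. 3 shows frequency information in place of phase information so that the phenomenon in question (the change in pitch) is easier to see. What is actually stored for each band is phase information P(m) and amplitude information A(m), as shown for a band Band(m) in FIG. 14. The apparatus uses, in the DSP 4, an address counter SPHASE that generates the address of each sampling point of the original audio waveform signal (the first reproduction position information) and m address counters addr that generate the addresses at which the phase information and amplitude value information of each band are stored (the second reproduction position information).
[0039]
The value of SPHASE increases from WaveStart. Its increment tcomp is set according to the degree of compression or expansion selected with the compression/expansion operation element included in the operation element 10. For compression, tcomp is set larger than the increment used when no compression or expansion is performed (for example, 1); for expansion, it is set smaller than that increment. Each addr increases from mark(1), but its increment rate differs from tcomp and is, for example, 1, the increment for normal reproduction without compression or expansion. The increment rate may also be made freely settable with an operation element.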
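The relationship between the two counters can be illustrated with a small simulation. It is a simplified sketch: both counters are placed on the same address scale and start from the same value, looping is ignored, and the function name positions_at_mark is hypothetical.

```python
def positions_at_mark(wave_start, mark, tcomp, rate=1.0):
    """Count clock ticks until SPHASE reaches `mark` and report where addr is then.

    SPHASE advances by tcomp per tick (first reproduction position information);
    addr advances by rate per tick (second reproduction position information).
    """
    sphase = float(wave_start)
    addr = float(wave_start)
    ticks = 0
    while sphase < mark:
        sphase += tcomp
        addr += rate
        ticks += 1
    return ticks, addr
```

With a mark 100 samples after the start, tcomp = 2.0 (compression) reaches the mark after 50 ticks while addr has only advanced to 50, and tcomp = 0.5 (expansion) needs 200 ticks, by which time an unlooped addr would have run to 200. The difference between addr and the mark in each case is the deviation that the mark correction removes.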
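[0040]
According to the value of each addr, the phase information and amplitude value information of each band are read from the waveform memory 2 and supplied to the corresponding time-frequency conversion processing units 4a-1 to 4a-m, and the waveform is resynthesized.
[0041]
Suppose SPHASE and each addr generate an address once per sampling period of the original audio waveform signal. If tcomp is greater than 1, that is, for time compression, each addr has not yet reached the position mark(2) corresponding to Mark(2) when SPHASE designates Mark(2). At this moment there is a deviation between each addr and SPHASE. To reduce this deviation to zero, each addr is forcibly corrected to mark(2) and, at the same time, the control unit 4f supplies a phase reset signal to the cosine oscillators 4b-1 to 4b-m so that the phase information at each mark(2) is set in the corresponding oscillator. As a result, the section starting at Mark(2) is resynthesized as a time-compressed audio waveform signal from the phase information and amplitude value information of the corresponding position in each band, and the resynthesized sound does not become unnatural. The sections after Mark(2) are resynthesized with a compressed time axis in the same way.
[0042]
Likewise, if tcomp is less than 1, that is, for time expansion, SPHASE has not reached Mark(2) even when each addr reaches mark(2). A loop period is therefore provided for each band. When the value of an addr reaches the end address alt_E of its loop period (alt_E denotes the loop end point of any band), the phase information and amplitude value information are read in the reverse direction from there back to alt_S (alt_S denotes the loop start point of any band), and the waveform is reproduced. If SPHASE still has not reached Mark(2) when addr returns to alt_S, the information is read in the forward direction toward alt_E and the waveform is resynthesized.
[0043]
If SPHASE reaches Mark(2) while the loop period is being read in this way, there is again a deviation between SPHASE and each addr. To reduce this deviation to zero, each addr is forcibly corrected to the mark(2) corresponding to Mark(2) and, at the same time, a phase reset signal is supplied to the cosine oscillators 4b-1 to 4b-m so that the phase information at each mark(2) is set in the corresponding oscillator. Resynthesis of the time-expanded waveform of the section starting at Mark(2) thus begins from the phase information and amplitude value information at mark(2) of each band. The section starting at Mark(2) therefore begins at the same time in every band, and the resynthesized sound does not become unnatural. The time axis is expanded in the same way for the sections after Mark(2).
[0044]
In this kind of compression and expansion the increment rate of each addr is 1, so the audio waveform signal is resynthesized by reading the data for each sampled value at the sampling frequency. Consequently, when the original audio waveform signal is a musical tone signal, for example, the attack portion is neither compressed nor expanded and can be resynthesized without loss of the sound quality of the original attack, so the musical tone does not sound unnatural.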
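[0045]
The operation of the DSP 4 and the CPU 6 is now described in detail. FIG. 5 shows the main routine of the CPU 6. The CPU 6 mainly detects the state of the operation element 10, makes various settings according to the detected state, and displays the detected state and the settings on the display device 12.
[0046]
First, it is determined whether the reproduction mode switch included in the operation element 10 has changed (step S2). There are two reproduction modes, mode 1 and mode 2. Mode 2 is the mode in which, as described above, the timing of the frequency information and amplitude value information of all bands is aligned when resynthesis of a new section of the original audio waveform signal begins; mode 1 is a mode in which no such timing correction is made.
[0047]
If the reproduction mode switch has changed, it is determined whether it now indicates mode 1 or mode 2 (step S4). If it indicates mode 1, the reproduction mode is set to 1 (step S6); if it indicates mode 2, the reproduction mode is set to 2 (step S8). The reproduction mode shown on the display device 12 is then updated to the mode that was set (step S10).
[0048]
Following step S10, or when step S2 finds no change in the reproduction mode switch, it is determined whether the compression/expansion operation element included in the operation element 10 has changed (step S12). If it has, the compression/expansion parameter tcomp is set according to the amount by which the element was operated (step S14), and the value of tcomp shown on the display device 12 is updated (step S16).
[0049]
Following step S16, or when step S12 finds no change in the compression/expansion operation element, it is determined whether a key of the keyboard included in the operation element 10 has been operated (step S18). If a key has changed, it is determined whether the change is from off to on (start of sounding), from on to off (stop of sounding), or from on to on (legato playing) (step S20). For an off-to-on change, frequency conversion information is calculated from the note number, the MIDI data corresponding to the pressed key, and transferred to the DSP 4 (step S21); the DSP 4 is then instructed to execute the DSP sounding start process (step S22). For an on-to-on change in step S20, the keys have been played legato, so new frequency conversion information is calculated from the note number of the newly pressed key and transferred to the DSP 4 (step S23). For an on-to-off change, the DSP 4 is instructed to execute the DSP sounding stop process (step S24). The main routine then ends. The main routine is executed repeatedly at a predetermined period.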
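The patent does not say how the frequency conversion information is derived from the MIDI note number in steps S21 and S23. One plausible mapping, shown here strictly as an assumption, is the equal-tempered pitch ratio relative to a reference key at which the recorded waveform plays back unchanged. The function name and the reference note 60 are hypothetical.

```python
def frequency_conversion_ratio(note_number, reference_note=60):
    """Equal-tempered pitch ratio of `note_number` relative to `reference_note`.

    Assumption: the stored waveform sounds at its original pitch when the
    reference key is pressed, and each semitone scales frequency by 2**(1/12).
    """
    return 2.0 ** ((note_number - reference_note) / 12.0)
```

Under this assumption a key one octave above the reference gives a ratio of 2.0 and a key one octave below gives 0.5; the ratio would then be the multiplier applied by the multiplying means 110.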
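[0050]
In the DSP sounding start process, initialization is performed (step S26). As shown in FIG. 6, the address counter SPHASE, which designates the address of each sample value of the original audio waveform signal, is set to the start address WaveStart. Next, the register mk, which stores the mark information for the head of the section to which the address currently indicated by SPHASE belongs, is set to Mark(1). The register mk_old, which after initialization stores the mark information for the head of the section immediately preceding the one to which the current SPHASE address belongs, is also set to Mark(1). The counter n, which designates each piece of mark information, is set to 1. The flag key_on, which indicates whether sounding has started, is set to 1 to show that it has. Finally, the flag dir, which specifies the direction in which the amplitude value information and frequency information of the loop period are read, is set to forward.
[0051]
Interrupts of the DSP 4 are then enabled (step S28), which ends the DSP sounding start routine.
[0052]
In the DSP sounding stop routine, as shown in FIG. 7, key_on is set to 0 to stop sounding (step S30). The gate signal generator 4g is then instructed to lower the gate signal (step S32). The output of the gate 4e thereby decreases gradually and the sound is muted.
[0053]
The DSP interrupt routine is executed each time a clock generator (not shown) generates a clock signal whose frequency equals the sampling frequency of the original audio waveform signal. As shown in FIG. 8, the routine first executes the control main process (step S34), which is described later.
[0054]
Next, the counter j, which designates a band, is set to 1 (step S36), and the time-axis conversion process is performed for the band designated by j (step S38); this process is also described later. It is determined whether j is greater than m (step S40). If it is not, j is incremented by 1 and step S38 is executed again; if it is, the per-band processing is finished. It is then determined whether the flag syll_flag, described later, is 1 (step S39). If syll_flag is 1, a phase reset signal is supplied to the cosine oscillators 4b-1 to 4b-m to reset their phases (step S41), and the DSP 4 interrupt routine ends. If step S39 finds that syll_flag is not 1, the DSP 4 interrupt routine ends without a phase reset.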
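[0055]
The DSP control main process mainly advances SPHASE by tcomp each time a clock signal occurs. As shown in FIG. 9, it first determines whether the flag syll_flag is nonzero (step S42); this flag is set to 1 only when the address counter SPHASE reaches one of Mark(1) to Mark(4), that is, the head of a section. If it is nonzero, it is cleared to 0 (step S44). After this, or when step S42 finds syll_flag to be 0, it is determined whether SPHASE is greater than or equal to mk (step S46). Because the DSP sounding start process sets SPHASE to WaveStart and mk to Mark(1), SPHASE is found to be greater than or equal to mk the first time step S46 is executed.
[0056]
If SPHASE is greater than or equal to mk, it is determined whether SPHASE is greater than or equal to WaveEnd, that is, whether the final address has been designated (step S48). If not, syll_flag is set to 1 and the counter n is incremented by 1 (step S50). This indicates that SPHASE is at the head of a section.
[0057]
Next, mk is updated to Mark(n) (step S52), so that mk now represents the head of the next section. The gate signal generator 4g is instructed to raise the gate signal (step S54), which allows the reproduced waveform to be output from the gate 4e.
[0058]
It is then determined whether key_on is 0 (step S56). key_on becomes 0 only when the DSP sounding stop process described above has been executed. If key_on is not 0, SPHASE is increased by tcomp (step S58) and the control main process ends.
[0059]
If key_on is 0, it is determined whether the gate signal has reached 0 (step S60). Because the gate signal falls gradually after a fall is instructed, this checks whether the gate signal has actually become 0 so that the gate 4e no longer produces output. If the gate signal is not 0, step S58 is executed. If it is 0, DSP interrupt processing is stopped (step S62) and the process returns. From then on, interrupt processing is not executed even when clock signals occur, until interrupts are enabled again.
[0060]
If step S48 finds that SPHASE is greater than or equal to WaveEnd, step S62 is executed immediately.
[0061]
If step S46 finds that SPHASE is not greater than or equal to mk, that is, the head of the next section has not been reached, it is determined whether SPHASE is greater than or equal to mk - α, in other words whether SPHASE has reached the address a predetermined distance α before the head of the next section (step S64). α determines the position, just before the end of each section, at which muting of the reproduced sound begins. If SPHASE is not greater than or equal to mk - α, steps S56, S60, S58 or steps S56, S60, S62 are executed.
[0062]
If SPHASE is greater than or equal to mk - α, it is determined whether mk differs from mk_old (step S66). When SPHASE designated the head address of a section, step S52 updated mk to the head address of the next section, so mk differs from mk_old the first time step S66 is executed. The gate signal generator 4g is therefore instructed to lower the gate signal (step S68), mk_old is updated to the value of mk (step S70), and step S56 and the following steps are executed. On the second and later executions of step S66, processing proceeds directly from step S66 to step S56 and onward.
[0063]
In this way the DSP control main routine increases SPHASE by tcomp at every clock signal, sets syll_flag to 1 and enables output from the gate 4e only when SPHASE designates the head address of a section, and begins lowering the gate signal when SPHASE first designates an address greater than or equal to mk - α.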
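[0064]
As shown in FIG. 10, the DSP time-axis conversion process determines whether the reproduction mode is 1 or 2 (step S72). In reproduction mode 1, the addr of the band designated by the counter j is increased by tcomp (step S74), and addr is transferred to the time-frequency conversion processing unit as the time expansion/contraction information (step S76).
[0065]
In reproduction mode 2, it is determined whether syll_flag is 1, that is, whether the head address of a section has been designated (step S78). If so, addr is updated to mark(n-1) in order to set it to the head address of that section (step S80).
[0066]
The index is n-1 because step S50 of the DSP control main process advances n by 1 when it sets syll_flag to 1 (when SPHASE reaches the head of a section of the original audio waveform signal). Setting addr to mark(n) would therefore designate the address storing the amplitude value information and phase information at the head of the section after the one designated by SPHASE. mark(n-1) designates the address storing the amplitude value information and phase information at the head of the section that SPHASE currently designates.
[0067]
Thus, when SPHASE designates the head of a section, addr is also forcibly corrected to the address storing the amplitude value information and phase information at the head of that section. Whether the time axis is being compressed or expanded, resynthesis of a new section therefore starts from the phase information and amplitude value information of each band for the first sampled value of that section.
[0068]
Next, alt_start, which stores the start address of the loop period, is updated to altS(n-1) of the band designated by the counter j, and alt_end, which stores the end address of the loop period, is updated to altE(n-1) of that band (step S82). This too is done so that the start and end of the loop period of the section containing the address designated by addr are held. dir is then set to forward (step S84) and step S76 is executed.
[0069]
If step S78 finds that syll_flag is not 1, SPHASE is not designating the head of a section, so it is determined whether dir is forward (step S86). If it is, addr is increased by the predetermined rate (step S88). As noted above, rate is 1, for example. The amplitude information and phase information are therefore read in the time-frequency conversion processing unit at normal speed according to the addr transferred in step S76.
[0070]
It is determined whether addr is greater than or equal to alt_end (step S90); if not, step S76 is executed. If addr is greater than or equal to alt_end, it designates an address beyond the end of the loop period, so addr is corrected to alt_end minus the deviation between the current addr and alt_end (addr - alt_end) (step S92). dir is then set to reverse (step S94) and step S76 is executed.
[0071]
The calculation in step S92 is needed because step S94 reverses the reading direction of addr: reading of the frequency information and amplitude value information in the reverse direction starts from the address set back from alt_end by the amount addr overshot alt_end in the forward direction.
[0072]
If step S86 finds the direction to be reverse, addr is decreased by rate (step S96). It is determined whether addr is less than or equal to alt_start (step S98); if not, step S76 is executed. If addr is less than or equal to alt_start, it designates an address beyond the start of the loop period, so addr is corrected to alt_start plus the deviation between alt_start and the current addr (alt_start - addr) (step S100). dir is then set to forward (step S102) and step S76 is executed.
[0073]
The calculation in step S100 is needed because step S102 changes the reading direction of addr to forward: reading of the frequency information and amplitude value information in the forward direction starts from the address advanced from alt_start by the amount addr overshot alt_start in the reverse direction.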
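The back-and-forth loop reading of steps S86 to S102 amounts to reflecting addr at the loop boundaries. The sketch below is one reading of the flowchart, not the patent's own code: the function name step_addr and the integer direction constants are assumptions, and the reflection at the loop start is written as alt_start + (alt_start - addr) so that the overshoot is mirrored back into the loop.

```python
FORWARD, BACKWARD = 1, -1

def step_addr(addr, direction, alt_start, alt_end, rate=1.0):
    """Advance one band's addr by one clock tick inside its loop period."""
    if direction == FORWARD:
        addr += rate                                # S88
        if addr >= alt_end:                         # S90
            addr = alt_end - (addr - alt_end)       # S92: mirror the overshoot
            direction = BACKWARD                    # S94
    else:
        addr -= rate                                # S96
        if addr <= alt_start:                       # S98
            addr = alt_start + (alt_start - addr)   # S100: mirror the overshoot
            direction = FORWARD                     # S102
    return addr, direction
```

Starting from addr = 8.5 in the forward direction with a loop from 5 to 10, successive calls give 9.5 (forward), 9.5 (now backward, after reflecting the overshoot at 10.5), 8.5, and so on down to the loop start, where the direction flips again.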
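[0074]
Because the waveform is reproduced in this way, when the time axis of the original audio waveform signal is compressed or expanded, the head of each section is resynthesized from the phase information and amplitude value information of the head of that section, which improves the sound quality of the synthesized waveform.
[0075]
The embodiment above provides reproduction mode 1 and reproduction mode 2, but reproduction mode 1 may be unnecessary in some cases. If reproduction mode 1 is not provided, the waveform memory 2 need not store amplitude value information and phase information for the period α in each section.
[0076]
[Effects of the Invention]
As described above, according to the present invention, the second reproduction position information is corrected to the first reproduction position information when the first reproduction position information reaches the position of the mark information. Time-axis compression or expansion of the audio waveform signal at the mark is therefore performed with the phase information and amplitude value information corresponding to the mark information, and compression or expansion of the subsequent audio waveform signal is likewise performed with the corresponding phase information and amplitude value information, so sound quality is improved.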
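[Brief Description of the Drawings]
[FIG. 1] A block diagram of the waveform reproducing apparatus according to a first embodiment of the present invention.
[FIG. 2] A block diagram of the phase vocoder realized by the DSP in the waveform reproducing apparatus of FIG. 1.
[FIG. 3] A diagram showing the frequency information and amplitude value information of each band stored in the waveform memory 2 of the waveform reproducing apparatus of FIG. 1, together with the audio waveform signal from which they are derived.
[FIG. 4] A diagram showing the various data stored in the waveform memory 2.
[FIG. 5] A flowchart of the main routine executed by the CPU of FIG. 1.
[FIG. 6] A flowchart of the DSP sounding start routine executed by the DSP.
[FIG. 7] A flowchart of the DSP sounding stop routine executed by the DSP.
[FIG. 8] A flowchart of the interrupt routine executed by the DSP.
[FIG. 9] A flowchart of the control main process in the interrupt routine of FIG. 8.
[FIG. 10] A flowchart of the time-axis conversion process in the interrupt routine of FIG. 8.
[FIG. 11] A block diagram of the configuration used to store amplitude value information and phase information in the waveform memory of the waveform reproducing apparatus of FIG. 1.
[FIG. 12] A detailed block diagram of the analysis unit in FIG. 11.
[FIG. 13] A detailed block diagram of the time-frequency conversion processing unit in the waveform reproducing apparatus of FIG. 1.
[FIG. 14] A diagram showing the phase information and amplitude value information stored as band waveform data in the waveform memory of the waveform reproducing apparatus of FIG. 1.
[FIG. 15] A block diagram of a conventional phase vocoder.
[FIG. 16] A detailed block diagram of the analysis unit of FIG. 15.
[FIG. 17] A diagram showing the waveform analysis performed by the phase vocoder of FIG. 15.
[FIG. 18] A detailed block diagram of the time-frequency conversion processing unit of FIG. 15, with diagrams explaining time expansion and compression in that unit.
[Explanation of Reference Signs]
2: waveform memory (waveform data storage means)
4: DSP (first reproduction position information generating means, second reproduction position information generating means, waveform signal synthesizing means, control means)
[0001]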
BACKGROUND OF THE INVENTION
The present invention relates to a waveform reproducing apparatus that performs compression and expansion on a time axis using a so-called phase vocoder.
[0002]
[Prior art]
In general, a phase vocoder divides an audio waveform signal into a plurality of frequency bands whose bandwidth is the approximate fundamental period of the signal, multiplies the signal of each band by the complex frequency at the center of that band to analyze it into an amplitude value and an instantaneous frequency, and stores these in a memory. For each band, a sine or cosine wave oscillating at the sum of the analyzed instantaneous frequency and the center frequency of the band is amplitude-modulated by the analyzed amplitude value. The original audio waveform signal is reproduced by mixing all of the signals generated for the bands in this way.
[0003]
It has been proposed to compress and expand the time axis of the original audio waveform signal using such a phase vocoder. For example, it has been proposed to compress or expand the amplitude value information and instantaneous frequency information on the time axis and to resynthesize the audio waveform signal from the compressed or expanded information, thereby compressing or expanding the original audio waveform signal on the time axis.
[0004]
FIG. 15 shows a block diagram of an example of a known phase vocoder. FIG. 16 shows the detailed configuration of the analysis unit for band k (the band k analysis unit) in the phase vocoder of FIG. 15; in the example of FIG. 16, k is an integer from 0 to 99. The phase vocoder is described in detail below with reference to FIGS. 15 and 16.
[0005]
The phase vocoder divides an audio signal, that is, a waveform, into a plurality of frequency bands whose bandwidth is the approximate fundamental frequency band of the audio signal (the vocoder of FIG. 15 divides it into 100 frequency bands, band 0 to band 99, as shown in FIG. 17). The audio signal of each frequency band is supplied to the analysis unit of that band, multiplied by the complex frequency at the center of the band, and analyzed into an amplitude value and an instantaneous frequency.
[0006]
In the analysis unit shown in FIG. 16, w(n) is the impulse response of the analysis filter. The operation of each analysis unit is equivalent to the well-known short-time Fourier transform in which the signal is windowed by w(n).
[0007]
The amplitude value and the instantaneous frequency obtained by each analysis unit are stored in the memory.
[0008]
The audio signals of the frequency bands stored in the memory are synthesized in a synthesis unit (the time-frequency conversion processing unit, adder, and cosine oscillator shown in FIG. 15). Each synthesis unit modulates a sine wave at the center frequency of its frequency band with the analyzed amplitude value and instantaneous frequency to generate the audio signal of that band. Mixing the generated audio signals of all bands restores the original audio signal.
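The conventional synthesis described here, a bank of amplitude-modulated cosine oscillators whose outputs are mixed, can be sketched as follows. The function name, the array layout (bands by samples), and the use of Hz with an explicit sampling rate fs are assumptions made for illustration.

```python
import numpy as np

def resynthesize(amps, inst_freqs, center_freqs, fs):
    """Mix amplitude-modulated cosines, one per band.

    amps, inst_freqs : arrays of shape (bands, samples); inst_freqs in Hz
                       is the analysed deviation from the band centre
    center_freqs     : array of shape (bands,), in Hz
    fs               : sampling frequency in Hz
    """
    # Oscillator frequency per band = band centre + instantaneous deviation.
    total = center_freqs[:, None] + inst_freqs
    # Integrate frequency into phase (running sum of 2*pi*f/fs).
    phase = 2.0 * np.pi * np.cumsum(total, axis=1) / fs
    # Amplitude-modulate each cosine and mix all bands.
    return np.sum(amps * np.cos(phase), axis=0)
```

Note that the oscillator phase in this sketch is obtained only by integrating frequency, so the starting phase of each band is arbitrary. This is the limitation the description goes on to point out for vocoders that store instantaneous frequency rather than phase.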
[0009]
When the reproduction time of the audio signal is to be compressed or expanded, the time-frequency conversion processing unit performs time-frequency conversion processing that obtains interpolated values of the amplitude value and of the instantaneous frequency.
[0010]
FIG. 18(a) shows the configuration of the time-frequency conversion processing unit for band k, which performs the time-frequency conversion processing for band k. Compression and expansion of the reproduction time of the audio signal are described with reference to FIG. 18(a).
[0011]
First, when the reproduction time of the audio signal is expanded, the time-frequency conversion processing unit interpolates the amplitude value at each sample point to stretch the amplitude envelope according to the time expansion/contraction information, and likewise obtains interpolated values of the instantaneous frequency at the sample points according to the time expansion/contraction information (see FIG. 18(b)). From the amplitude values and instantaneous frequencies obtained by this interpolation, the audio signals of the divided frequency bands are generated and mixed in the same manner as described above.
[0012]
On the other hand, when the reproduction time of the audio signal is compressed, the amplitude value and the instantaneous frequency are thinned out by interpolation based on the time expansion / contraction information, and the envelope is reduced (see FIG. 18C). From the amplitude value and the instantaneous frequency obtained by the interpolation in this way, the audio signals of the divided frequency bands are obtained and mixed in the same manner as described above.
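The stretching and thinning of an envelope by interpolation, as described for expansion and compression above, might look like this in outline. Linear interpolation and the function name stretch_envelope are assumptions; the description only says that interpolated values are obtained from the time expansion/contraction information.

```python
import numpy as np

def stretch_envelope(values, factor):
    """Resample an envelope so that it lasts `factor` times as long.

    factor > 1 expands (new points are interpolated between samples);
    factor < 1 compresses (samples are thinned out).
    """
    values = np.asarray(values, dtype=float)
    n_out = int(round((len(values) - 1) * factor)) + 1
    # Position in the original envelope that each output sample reads from.
    src = np.arange(n_out) / factor
    return np.interp(src, np.arange(len(values)), values)
```

Applied to [0, 1, 2] with factor 2.0 this returns [0, 0.5, 1, 1.5, 2]; with factor 0.5 it returns [0, 2]. The same operation would be applied to both the amplitude values and the instantaneous frequencies of every band.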
[0013]
To change the pitch of the audio signal, the sum of the center frequency and the instantaneous frequency of each divided frequency band is multiplied by the frequency conversion information (a change ratio), and the interpolation described above is then carried out.
[0014]
Since the processing described above is performed by known methods, flowcharts and a detailed description are omitted.
[0015]
In such compression and expansion, the entire audio waveform signal is uniformly compressed or expanded on the time axis. Therefore, if the original audio waveform signal is, for example, a musical tone signal, the attack portion thereof is also compressed and expanded. In this case, the musical tone may be very unnatural.
[0016]
To prevent this, the following approach is conceivable. The amplitude value information and instantaneous frequency information of each band are not themselves compressed or expanded, and the waveform is reproduced from them as usual. For time compression, when the audio waveform signal is speech, resynthesis moves on to the next syllable once the set compression time has elapsed for a given syllable. For expansion, a repetition period of the amplitude value information and instantaneous frequency information is determined in advance for each band; after the repetition period of a syllable has been read out, the amplitude values and instantaneous frequency information of that period are used repeatedly to resynthesize the waveform of the syllable until the set expansion time has elapsed.
[0017]
[Problems to be solved by the invention]
However, with such repeated reproduction the length and position of the repetition period are not necessarily the same in every band, so the timing at which each band advances to the next syllable does not match and the resynthesized sound becomes unnatural. Moreover, because a phase vocoder that uses instantaneous frequency information holds no phase information, it cannot resynthesize the original audio signal even when neither compression nor expansion is performed. The difference can appear clearly, particularly for audio signals with a steep rise.
[0018]
An object of the present invention is to provide a waveform reproducing apparatus that improves the sound quality of the synthesized waveform when the time axis of an audio waveform signal is compressed or expanded using a phase vocoder. It does so by correcting, at predetermined positions in the waveform data, the error that arises when reproduction causes the reproduction positions of the waveform data of the bands to drift apart, and by making it possible to resynthesize the original audio waveform signal when neither compression nor expansion is performed.
[0019]
[Means for Solving the Problems]
The waveform reproducing apparatus of the present invention has waveform data storage means that stores one or more pieces of mark information, each indicating a position on the time axis that is a boundary between sections of an original audio waveform signal consisting of a plurality of sections, together with phase information and amplitude value information of the waveform signal in each of a plurality of frequency bands into which the original audio waveform signal is divided. The phase information and amplitude value information of each band are obtained for every sampled value produced by sampling the original audio waveform signal at a suitable sampling frequency; that is, they are obtained in correspondence with positions on the time axis of the original audio waveform signal. First reproduction position information generating means is provided, which generates first reproduction position information that represents a position on the time axis of the original audio waveform signal (for example, the position of each sampled value) and changes over time at a desired speed. In the case of time compression this speed is faster than the speed used when the time axis is neither compressed nor expanded, and in the case of time expansion it is slower. Second reproduction position information generating means is also provided, which generates second reproduction position information that represents a position on the time axis of the phase information and amplitude value information of each band and changes over time at a speed different from the desired speed. The speed of the second reproduction position information can be constant, for example the speed used when the time axis of the original audio waveform signal is neither compressed nor expanded. Waveform signal synthesizing means is provided, which reads the phase information and amplitude value information from the waveform data storage means according to the second reproduction position information and synthesizes a reproduced audio waveform signal from them. The second reproduction position information is provided independently for each band. Control means is provided, which controls the second reproduction position information so that, before the position represented by the first reproduction position information reaches the position represented by the mark information in the waveform data storage means, the loop section is read repeatedly for any divided band whose second reproduction position information has reached the end of the loop section, and so that, when the position represented by the first reproduction position information reaches the position represented by the mark information, the second reproduction position information is changed to the value corresponding to that position.
[0020]
The waveform signal synthesizing means may comprise: conversion means for converting the phase information read from the waveform data storage means into frequency information; periodic signal generating means that receives the converted frequency information and generates a periodic signal at the corresponding frequency, the phase of the generated periodic signal being changed according to the read phase information in response to a change of the second reproduction position information; and amplitude control means for controlling the amplitude of the periodic signal generated by the periodic signal generating means according to the read amplitude information.
[0021]
In the waveform reproducing apparatus of the present invention, the first reproduction position information changes faster than when the original waveform signal is reproduced as is if the time axis is compressed, and more slowly if it is expanded. The second reproduction position information, on the other hand, reads the amplitude value information and phase information of each band at, for example, a speed different from that of the first reproduction position information. A deviation therefore arises between the position on the time axis designated by the first reproduction position information and the position on the time axis of the original audio waveform signal that corresponds to the amplitude value information and phase information designated by the second reproduction position information. When the first reproduction position information reaches the position represented by the mark information, that is, when the predetermined time-axis compression or expansion has been carried out, the control means controls the second reproduction position information so that this deviation becomes zero. The second reproduction position information is thus corrected to the position of the phase information and amplitude value information of each band corresponding to the mark information, and subsequent waveform reproduction is based on the phase information and amplitude value information of each band from that position onward.
[0022]
When the time axis is expanded, the second reproduction position information may read as far as the phase information and amplitude value information corresponding to the position of the mark information before the first reproduction position information has reached that position. A repetition period can therefore be defined for the second reproduction position information ahead of the position at which the phase information and amplitude value information corresponding to the mark are stored, and the phase information and amplitude value information of this period can be read repeatedly until the first reproduction position information reaches the position of the mark information. When the first reproduction position information reaches the position of the mark information, the second reproduction position information is immediately corrected to the position corresponding to the mark information. When the time axis is compressed, the second reproduction position information may not yet designate the storage position of the phase information and amplitude value information corresponding to the mark when the first reproduction position information reaches the position of the mark information. In this case too, the second reproduction position information is immediately corrected to that storage position.
[0023]
DETAILED DESCRIPTION OF THE INVENTION
As shown in FIG. 1, the waveform reproducing apparatus according to one embodiment of the present invention has a waveform memory 2 and a DSP 4, which function as a phase vocoder. The phase vocoder synthesizes a reproduced audio waveform signal based on the waveform data stored in the waveform memory 2 and supplies the synthesized audio waveform signal to the D / A converter 5. The DSP 4 operates according to a program stored in the ROM 8. This program is transferred to the DSP 4 via the CPU 6.
[0024]
The CPU 6 detects the operation state of the operation element 10 according to the program stored in the ROM 8, and controls the DSP 4 according to the detection result. The CPU 6 displays the detected operation state of the operation element 10 and the control state of the DSP 4 based on the operation state of the operation element 10 on the display device 12. The operation element 10 is provided with a reproduction mode switch for setting a reproduction mode, a compression / decompression operation element for setting a parameter representing compression or expansion and its degree, a keyboard having a plurality of keys, and the like. The RAM 14 is used as a working memory for the CPU 6, and various registers and flags to be described later are set.
[0025]
Unlike the phase vocoder described above, the waveform memory 2 stores amplitude value information and phase information. As is apparent from FIG. 11 and from FIG. 12, which shows the details of each analysis unit of FIG. 11, these pieces of information are obtained as follows. The sampling values x (n) of a waveform data signal, for example one phrase of a musical sound sampled at a predetermined sampling frequency, are divided into a plurality of frequency bands, for example m bands, each having the approximate fundamental period of the waveform data signal as its bandwidth. The signal in each band is multiplied by the complex frequency at the center of that band and analyzed into amplitude value information and phase information.
[0026]
As shown in FIG. 2, the DSP 4 functions as waveform signal synthesis means comprising, for each band, conversion means, for example time frequency conversion processing units 4a-1 to 4a-m; periodic signal generation means, for example cosine oscillators 4b-1 to 4b-m; and amplitude control means, for example multipliers 4c-1 to 4c-m. The DSP 4 also functions as an adder 4d that adds the outputs of the multipliers 4c-1 to 4c-m, and as a multiplier 4e that multiplies the output of the adder 4d by a gate signal. The adder 4d functions as a combiner, and the multiplier 4e functions as a gate.
[0027]
Each of the time-frequency conversion processing units 4a-1 to 4a-m reads amplitude value information and phase information from the waveform memory 2 in accordance with the time expansion / contraction information addr of its band, which is supplied from control means described later, for example the control unit 4f. The phase information is converted into instantaneous frequency information by differentiating means and subjected to interpolation, and the corresponding cosine oscillator 4b-1 to 4b-m is made to oscillate at the frequency given by the sum of this instantaneous frequency and the center frequency ωk of the band. When the pitch is to be changed, this frequency is multiplied by a conversion ratio, which is the frequency conversion information. Further, in order to set the phase of the cosine oscillators 4b-1 to 4b-m, the phase information read from the waveform memory 2 is supplied to the corresponding cosine oscillators 4b-1 to 4b-m, and the phase of each oscillator is reset to the supplied phase information by a phase reset signal from the control unit 4f. In order to implement these processes, the time-frequency conversion processing units 4a-1 to 4a-m can be realized with a configuration as shown in FIG.
[0028]
In FIG. 13, 100 is reading means for reading the amplitude value information and phase information from the waveform memory 2 according to the time expansion / contraction information addr (described later); 102 is differentiating means for differentiating the read phase information; 104 is interpolation means for interpolating the instantaneous frequency information obtained from the differentiating means according to addr; 106 is interpolation means for interpolating the read amplitude value information according to addr; 108 is adding means for adding the interpolated instantaneous frequency information and the center frequency ωk; and 110 is multiplying means for multiplying the output of the adding means 108 by the frequency conversion information.
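The signal path through the means 100 to 110 can be illustrated with a short Python sketch. It is not the patented implementation: the function names, the first-difference differentiator, the wrapping of phase differences into [-π, π), and the use of linear interpolation are all assumptions made for illustration, since the text only names the means and their order.

```python
import math

def wrapped_diff(phase, i):
    """Differentiating means 102 (assumed form): first difference of the stored
    phase, wrapped into [-pi, pi), giving an instantaneous frequency offset."""
    d = phase[i] - phase[i - 1]
    return (d + math.pi) % (2.0 * math.pi) - math.pi

def lerp(a, b, t):
    """Linear interpolation, an assumed choice for interpolation means 104 and 106."""
    return (1.0 - t) * a + t * b

def convert_band(amplitude, phase, addr, center_freq, ratio=1.0):
    """Return (amplitude, oscillator frequency in rad/sample) at fractional addr.

    amplitude and phase are per-sample lists for one band (at least 3 entries).
    """
    i = min(max(int(addr), 1), len(phase) - 2)      # reading means 100
    t = min(max(addr - i, 0.0), 1.0)                # fractional part of addr
    inst = lerp(wrapped_diff(phase, i), wrapped_diff(phase, i + 1), t)  # 102, 104
    amp = lerp(amplitude[i], amplitude[i + 1], t)                        # 106
    return amp, (inst + center_freq) * ratio                             # 108, 110
```

With `ratio=1.0` the oscillator runs at the analyzed frequency; a ratio of 2.0 would raise the pitch of the band by an octave without changing how fast `addr` moves.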
[0029]
The cosine waves from the cosine oscillators 4b-1 to 4b-m, oscillated as described above, are amplitude-modulated in the corresponding multipliers 4c-1 to 4c-m according to the amplitude information from the corresponding time frequency conversion processing units 4a-1 to 4a-m. The original audio waveform signal can thus be reproduced by combining the signals generated for each band in the adder 4d. A gate signal for controlling the open / closed state of the gate 4e is supplied from a gate signal generation unit 4g included in the control unit 4f. The gate signal takes a value between 0 and 1, for example; it rises from 0 to 1 immediately, but falls from 1 toward 0 gradually.
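As a rough illustration of the oscillator, multiplier, adder, and gate chain described above, the following Python sketch computes one output sample and models the gate envelope. The function names, the linear shape of the gate fall, and the `release` step are assumptions; the text only states that the gate rises immediately and falls gradually.

```python
import math

def synth_sample(amps, freqs, osc_phase, gate=1.0):
    """Return one gated output sample and advance every band oscillator.

    amps[k] and freqs[k] are the amplitude and frequency (rad/sample) of band k;
    osc_phase is a mutable list holding the running phase of each oscillator.
    """
    total = 0.0
    for k in range(len(amps)):
        total += amps[k] * math.cos(osc_phase[k])                    # 4b-k, 4c-k
        osc_phase[k] = (osc_phase[k] + freqs[k]) % (2.0 * math.pi)
    return gate * total                                              # 4d, 4e

def next_gate(gate, rising, release=0.01):
    """Gate envelope: jump to 1 on a rise, step down linearly (assumed) on a fall."""
    if rising:
        return 1.0
    return max(0.0, gate - release)
```

A phase reset, as issued by the control unit 4f, would correspond to overwriting the entries of `osc_phase` with the stored phase information of each band.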
[0030]
The waveform memory 2 stores various parameters including the above-described amplitude value information and phase information. That is, as shown in FIG. 4 (a), the waveform memory 2 has a waveform information area, a band waveform information area, and a band waveform data area.
[0031]
As shown in FIG. 5B, WaveStart, WaveEnd, Mark (1), Mark (2), Mark (3), and Mark (4) are stored in the waveform information area. WaveStart is the start address that the original audio waveform signal shown in the upper part of FIG. 3 would occupy if it were sampled at a predetermined frequency and stored in the waveform memory 2 (the samples are not actually stored in the waveform memory 2). WaveEnd is likewise the end address.
[0032]
Mark (1) to Mark (4) are the addresses, under the same assumption that the sampling values of the original audio waveform signal were stored, of the boundaries between sections: for a musical sound, sections such as the attack part, steady part, decay part, and silent part; for a voice signal, sections such as individual syllables. Mark (1) to Mark (4) correspond to the mark information. Since this embodiment shows an example in which the original audio waveform signal consists of three sections, there are four pieces of mark information; in practice, the number of pieces of mark information increases or decreases with the number of sections of the original audio waveform signal.
[0033]
As shown in FIG. 4 (c), the band waveform data area is provided for each of the m frequency bands whose bandwidth is the approximate fundamental period of the original audio waveform signal. As shown in FIG. 4 (d), the amplitude value information and the phase information described above are stored for each band. The amplitude value information and the phase information are stored for each sampling value of the original audio waveform signal, as shown in FIG.
[0034]
The band waveform information area is also provided for each of the m bands, as shown in FIG. 8F, and stores wave_start, wave_end, mark (1), altS (1), altE (1), mark (2), altS (2), altE (2), mark (3), altS (3), altE (3), and mark (4), as shown in FIG.
[0035]
wave_start represents the start address, and wave_end the end address, of the addresses at which the amplitude value information and frequency information of each sampling value in the band are stored. mark (1) to mark (4) correspond to the section boundaries Mark (1) to Mark (4) described above and represent the addresses at which the amplitude value information and frequency information of the sampling values at those positions are stored.
[0036]
altS (1) to altS (3) represent the addresses where the amplitude value information and phase information at the start point of the loop period in each section are stored, and altE (1) to altE (3) represent the addresses where the amplitude value information and phase information at the end point of the loop period are stored. The loop period is used when the time axis is expanded, as described later: if the expansion is not yet complete when, for example, the amplitude value information and phase information at the final address altE (1) to altE (3) of the loop period have been read and used for waveform synthesis, waveform synthesis continues based on the amplitude value information and phase information within the loop period.
[0037]
In this embodiment, since the number of sections is 3, there are four marks, three altS values, and three altE values. If the number of sections differs from 3, these numbers change accordingly. Further, as is apparent from FIG. 3, the position and length (period) of the loop period differ from band to band.
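The contents of the waveform information area and of one band waveform information area can be pictured as two small records. The following Python sketch is illustrative only: the class and field names are hypothetical, and the consistency check (in particular the requirement that each loop period lies inside its own section) is an assumption drawn from the phrase "the loop period in each section", not a rule stated by the text.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class WaveformInfo:
    wave_start: int
    wave_end: int
    marks: List[int]      # Mark(1)..Mark(N+1) for N sections

@dataclass
class BandInfo:
    wave_start: int
    wave_end: int
    marks: List[int]      # mark(1)..mark(N+1), addresses in this band's data
    alt_s: List[int]      # altS(1)..altS(N), loop start per section
    alt_e: List[int]      # altE(1)..altE(N), loop end per section

def is_consistent(info: WaveformInfo, band: BandInfo) -> bool:
    """Check the counts described in the text and the assumed loop placement."""
    sections = len(info.marks) - 1
    if len(band.marks) != sections + 1:
        return False
    if len(band.alt_s) != sections or len(band.alt_e) != sections:
        return False
    for i in range(sections):
        # Assumed: each loop period lies within its own section.
        if not (band.marks[i] <= band.alt_s[i] < band.alt_e[i] <= band.marks[i + 1]):
            return False
    return True
```

Because the loop position and length differ from band to band, each band would carry its own `BandInfo` record while sharing a single `WaveformInfo`.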
[0038]
Next, an outline of the operation of the waveform reproducing apparatus will be described with reference to FIG. In the embodiment of the present invention, amplitude information and phase information are stored in the band waveform data area of the waveform memory 2; in FIG. 3, however, the phase information is shown converted to frequency information so that a specific phenomenon (change in pitch) is easier to understand. In practice, each band stores phase information P (m) and amplitude information A (m), as shown for a certain band Band (m) in FIG. This waveform reproducing apparatus uses, within the DSP 4, an address counter SPHASE that generates the address of each sampling point of the original audio waveform signal (first reproduction position information), and m address counters addr that generate the addresses at which the phase information and amplitude value information of each band are stored (second reproduction position information).
[0039]
The value of SPHASE increases from WaveStart. Its increment tcomp is set according to the degree of compression or expansion set with the compression / decompression operation element included in the operation element 10. For example, for compression, tcomp is set to a value larger than 1, which is the increment when neither compression nor expansion is performed; for expansion, it is set to a value smaller than 1. Each addr, on the other hand, increases from mark (1), but its increment rate differs from the SPHASE increment tcomp and is, for example, 1, the increment for normal reproduction without compression or expansion. The increment rate may also be set arbitrarily by the operator.
[0040]
According to the value of each addr, the phase information and amplitude value information of each band are read from the waveform memory 2 and supplied to the corresponding time frequency conversion processing units 4a-1 to 4a-m, and the waveform is resynthesized.
[0041]
Suppose that SPHASE and each addr generate addresses at the same period as the sampling period of the original audio waveform signal. When tcomp is greater than 1, that is, in the case of time compression, each addr has not yet reached the position mark (2) corresponding to Mark (2) at the moment SPHASE designates Mark (2). A deviation therefore exists between the value of each addr and that of SPHASE. To bring this deviation to 0, the value of each addr is forcibly corrected to mark (2), and at the same time a phase reset signal is supplied from the control unit 4f to each of the cosine oscillators 4b-1 to 4b-m, so that the phase information at mark (2) of each band is set in the corresponding cosine oscillator. As a result, the time-compressed audio waveform signal is resynthesized based on the phase information and amplitude value information of each band for the section starting from Mark (2), and the resynthesized sound does not become unnatural. The sections after Mark (2) are time-compressed and resynthesized in the same way.
[0042]
Conversely, when tcomp is smaller than 1, that is, when the time axis is expanded, SPHASE has not reached Mark (2) even when each addr reaches mark (2). A loop period is therefore provided for each band. When the value of each addr reaches the address alt_E at the end of its loop period (the end points of the loop periods of the bands are collectively referred to as alt_E), the phase information and amplitude value information are read in the reverse direction toward alt_S (the start points of the loop periods of the bands are collectively referred to as alt_S), and the waveform is reproduced. If SPHASE has still not reached Mark (2) when addr returns to alt_S, the phase information and amplitude value information are again read in the forward direction toward alt_E, and the waveform is resynthesized.
[0043]
When SPHASE reaches Mark (2) while the loop period is being read in this way, a deviation again exists between the value of SPHASE and the value of each addr. The value of each addr is therefore forcibly corrected to mark (2), corresponding to Mark (2), so that the deviation becomes 0, and at the same time a phase reset signal is supplied to each of the cosine oscillators 4b-1 to 4b-m to set the phase information at mark (2) of each band in the corresponding oscillator. As a result, resynthesis of the time-expanded waveform in the section starting from Mark (2) begins from the phase information and amplitude value information at mark (2) of each band. The start timing of the section starting from Mark (2) is therefore the same for every band, and the resynthesized sound does not become unnatural. The time axis is expanded in the same way in the sections after Mark (2).
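The size and sign of the deviation that is corrected at Mark (2) can be worked out numerically. The Python sketch below is a simplified model under stated assumptions: the function names are hypothetical, both counters advance once per clock tick, and the loop period is deliberately ignored so that the raw drift of addr is visible.

```python
def ticks_to_reach(start, target, increment):
    """Count clock ticks until a counter starting at start reaches or passes target."""
    ticks, pos = 0, start
    while pos < target:
        pos += increment
        ticks += 1
    return ticks

def deviation_at_mark(wave_start, mark2, band_mark1, band_mark2, tcomp, rate=1.0):
    """Return band_mark2 - addr at the tick where SPHASE first reaches Mark(2).

    Positive: addr is still short of mark(2) (compression).
    Negative: addr would have run past mark(2) if no loop period existed (expansion).
    """
    ticks = ticks_to_reach(wave_start, mark2, tcomp)
    addr = band_mark1 + rate * ticks
    return band_mark2 - addr
```

For a section of 1000 samples, `tcomp = 2.0` leaves addr 500 addresses short of mark (2), while `tcomp = 0.5` would carry addr 1000 addresses past it, which is the situation the loop period is there to absorb.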
[0044]
In such compression or expansion, since the increment rate of each addr is 1, the audio waveform signal is resynthesized by reading each sampling value at the sampling frequency. Therefore, when the original audio waveform signal is a musical sound signal, for example, the attack portion is neither compressed nor expanded and can be resynthesized without impairing the sound quality of the attack portion of the original waveform signal, so the musical sound does not become unnatural.
[0045]
Hereinafter, operations of the DSP 4 and the CPU 6 will be described in detail. FIG. 5 shows the main routine of the CPU 6. The CPU 6 mainly detects the state of the operation element 10, performs various settings according to the detected state, and displays the detection state and the setting state on the display device 12.
[0046]
First, it is determined whether or not the playback mode switch included in the operation element 10 has changed (step S2). There are two playback modes, mode 1 and mode 2. Mode 2 is the mode in which, as described above, the timing of the frequency information and amplitude value information of each band is aligned when resynthesis of a new section of the original audio waveform signal starts. Mode 1 is a mode in which no such timing correction is performed.
[0047]
If it is determined that there is a change in the playback mode switch, it is determined whether the changed playback mode switch indicates mode 1 or mode 2 (step S4). If the playback mode switch indicates mode 1, the playback mode is set to 1 (step S6). If it indicates mode 2, the playback mode is set to 2 (step S8). The display is then changed to the set mode (step S10).
[0048]
Subsequent to step S10 or when it is determined in step S2 that there is no change in the playback mode switch, it is determined whether the compression / decompression operator included in the operator 10 has changed (step S12). If it is determined that there is a change in the compression / decompression operator, the compression / decompression parameter value tcomp is set corresponding to the operation amount of the compression / decompression operator (step S14). Next, the display of the value of tcomp on the display device 12 is updated (step S16).
[0049]
Subsequent to step S16, or when it is determined in step S12 that there is no change in the compression / decompression operator, it is determined whether or not a key of the keyboard included in the operator 10 has been operated (step S18). If it is determined that the key state has changed, it is determined whether the change is from off to on (sounding starts), from on to off (sounding stops), or from on to on (legato performance) (step S20). If the change is from off to on, the frequency conversion information is calculated from the note number, which is the MIDI data corresponding to the pressed key, and is transferred to the DSP 4 (step S21). Next, an instruction is sent to the DSP 4 to execute the DSP sound generation start process (step S22). If it is determined in step S20 that the change is from on to on, the key has been played legato, so new frequency conversion information is calculated from the note number corresponding to the newly pressed key and is transferred to the DSP 4 (step S23). If it is determined that the change is from on to off, an instruction is sent to the DSP 4 to execute the DSP sound generation stop process (step S24). This completes the main routine. The main routine is executed repeatedly at a predetermined period.
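The key handling of steps S18 to S24 amounts to a three-way dispatch, sketched below in Python. The text does not say how the frequency conversion information is derived from the note number; the equal-tempered ratio relative to a hypothetical `root_note` (the pitch at which the stored phrase is assumed to have been recorded) is an assumption, as are the function names and the message tuples.

```python
def frequency_conversion(note_number, root_note=60):
    """Assumed mapping: equal-tempered pitch ratio relative to root_note."""
    return 2.0 ** ((note_number - root_note) / 12.0)

def handle_key_change(was_on, is_on, note_number, root_note=60):
    """Return the messages the CPU would send to the DSP for one key change."""
    if not was_on and is_on:      # off -> on: steps S21 and S22
        return [("set_ratio", frequency_conversion(note_number, root_note)),
                ("start", None)]
    if was_on and is_on:          # on -> on (legato): step S23
        return [("set_ratio", frequency_conversion(note_number, root_note))]
    if was_on and not is_on:      # on -> off: step S24
        return [("stop", None)]
    return []                     # no change: nothing to send
```

Note that a legato change only updates the ratio; the sound generation start process is not re-run, so SPHASE keeps its current position.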
[0050]
In the DSP sound generation start process, initial setting is performed (step S26). That is, as shown in FIG. 6, the value of the address counter SPHASE, which specifies the address of each sample value of the original audio waveform signal, is set to the start address WaveStart. Next, the register mk, which stores the mark information of the head of the section to which the address currently represented by SPHASE belongs, is set to Mark (1). The register mk_old, which after the initial setting stores the mark information indicating the head of the section immediately before the section to which the address currently generated by SPHASE belongs, is also set to Mark (1). Next, the value of the counter n for designating each piece of mark information is set to 1. The flag key_on, which indicates whether sound generation has started, is set to 1 to indicate that it has. Finally, the flag dir, which designates the direction in which the amplitude value information and frequency information in the loop period are read, is set to the forward direction.
[0051]
Then, the interrupt of the DSP 4 is permitted (step S28). This completes the DSP sound generation start processing routine.
[0052]
As shown in FIG. 7, the DSP sound generation stop processing routine sets key_on to 0 in order to stop sound generation (step S30). Next, the gate signal generation unit 4g is instructed to make the gate signal fall (step S32). As a result, the output of the gate 4e is gradually reduced and muted.
[0053]
The DSP interrupt processing routine is executed each time a clock signal having a frequency equal to the sampling frequency of the original audio waveform signal is generated by a clock generator (not shown). In this routine, as shown in FIG. 8, first, a control main process is executed (step S34). The control main process will be described later.
[0054]
Next, the value of the counter j for designating each band is set to 1 (step S36), and the time axis conversion processing of the band designated by the counter j is performed (step S38). This time axis conversion processing will also be described later. It is determined whether the value of the counter j is greater than m (step S40). If it is not greater than m, the value of the counter j is incremented by one and step S38 is executed again. If it is greater than m, the processing of the bands is finished, and it is determined whether a flag syl_flag, described later, is 1 (step S39). When syl_flag is 1, a phase reset signal is supplied to the cosine oscillators 4b-1 to 4b-m to instruct them to reset their phase (step S41), and the interrupt processing routine of the DSP 4 is ended. If it is determined in step S39 that syl_flag is not 1, the interrupt processing routine of the DSP 4 is ended without a phase reset.
[0055]
The DSP control main process mainly advances the value of SPHASE by tcomp each time a clock signal is generated. As shown in FIG. 9, the DSP control main process first determines whether syl_flag is other than 0 (step S42); syl_flag is a flag that is set to 1 only when the address counter SPHASE reaches any one of Mark (1) to Mark (4), that is, the head of a section. If it is other than 0, it is set to 0 (step S44). Following this, or when it is determined in step S42 that syl_flag is 0, it is determined whether the value of the address counter SPHASE is greater than or equal to mk (step S46). In the DSP sound generation start processing, SPHASE is initially set to WaveStart and mk to Mark (1). Therefore, when step S46 is executed for the first time, it is determined that the value of the address counter SPHASE is greater than or equal to mk.
[0056]
If the value of the address counter SPHASE is greater than or equal to mk, it is determined whether SPHASE is greater than or equal to WaveEnd, that is, whether the final address has been designated (step S48). If the final address has not been designated, syl_flag is set to 1 and the value of the counter n is incremented by 1 (step S50). That is, SPHASE is located at the head of a section.
[0057]
Next, mk is updated to Mark (n) (step S52). The value of mk thereby represents the head of the next section. The gate signal generation unit 4g is instructed to raise the gate signal (step S54). As a result, a reproduced waveform can be output from the gate 4e.
[0058]
Then, it is determined whether key_on is 0 (step S56). The key_on becomes 0 only when the DSP sound generation stop process described above is executed. If key_on is not 0, the value of the address counter SPHASE is increased by tcomp (step S58), and this control main process is terminated.
[0059]
If key_on is 0, it is determined whether the gate signal is 0 (step S60). Since the gate signal is gradually lowered after being instructed to fall, it is determined whether the gate signal is actually 0 and no output is generated from the gate 4e. If it is determined that the gate signal is not 0, step S58 is executed. If the gate signal is 0, the DSP interrupt process is stopped (step S62), and the process returns. Therefore, the interrupt process is not executed even if the clock signal is generated until the interrupt is permitted again.
[0060]
If it is determined in step S48 that SPHASE is equal to or greater than WaveEnd, step S62 is immediately executed.
[0061]
If it is determined in step S46 that SPHASE is less than mk, that is, that the head of the next section has not been reached, it is determined whether the value of SPHASE is greater than or equal to mk−α, that is, whether SPHASE has reached the address that precedes the head of the next section by a predetermined address distance α (step S64). α determines the position at which the reproduced synthesized sound starts to be muted immediately before the end of each section. If it is determined that the SPHASE value is less than mk−α, steps S56, S60, S58 or steps S56, S60, S62 are executed.
[0062]
If the SPHASE value is greater than or equal to mk−α, it is determined whether the value of mk differs from mk_old (step S66). When SPHASE designated the head address of a section, mk was updated to the head address of the next section in step S52. Therefore, when step S66 is executed for the first time in a section, mk does not match mk_old. Accordingly, the gate signal generation unit 4g is instructed to make the gate signal fall (step S68). Then mk_old is updated to the value of mk (step S70), and step S56 and subsequent steps are executed. On the second and later executions of step S66, therefore, step S56 and subsequent steps are executed immediately after step S66.
[0063]
In this manner, in the DSP control main routine, SPHASE is advanced by tcomp every time a clock signal is generated. Only when SPHASE designates the head address of a section is syl_flag set to 1 and output from the gate 4e enabled, and when SPHASE first designates the address mk−α, the gate signal starts to fall.
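Steps S26 to S32 and S42 to S70 can be gathered into a small state machine. The Python sketch below is one possible reading, not the actual DSP program: the class and function names, the linear gate fall with a hypothetical `release` step, the `running` flag standing in for the interrupt enable, and the fallback to WaveEnd when no further mark exists are all assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Control:
    marks: List[float]       # Mark(1)..Mark(N); marks[0] is Mark(1)
    wave_start: float
    wave_end: float
    tcomp: float
    alpha: float
    release: float = 0.25    # hypothetical gate fall per tick
    sphase: float = 0.0
    mk: float = 0.0
    mk_old: float = 0.0
    n: int = 1
    syl_flag: int = 0
    key_on: int = 0
    gate: float = 0.0
    gate_falling: bool = False
    running: bool = False    # stands in for "interrupt permitted"

def start(c):
    """Sound generation start (steps S26, S28)."""
    c.sphase = c.wave_start
    c.mk = c.mk_old = c.marks[0]
    c.n, c.key_on, c.running = 1, 1, True

def stop(c):
    """Sound generation stop (steps S30, S32)."""
    c.key_on, c.gate_falling = 0, True

def control_main(c):
    """One clock tick of the control main process."""
    if c.gate_falling:                       # unit 4g, assumed linear fall
        c.gate = max(0.0, c.gate - c.release)
    if c.syl_flag != 0:                      # S42
        c.syl_flag = 0                       # S44
    if c.sphase >= c.mk:                     # S46
        if c.sphase >= c.wave_end:           # S48
            c.running = False                # S62
            return
        c.syl_flag = 1                       # S50
        c.n += 1
        # S52; falling back to wave_end past the last mark is an assumption
        c.mk = c.marks[c.n - 1] if c.n <= len(c.marks) else c.wave_end
        c.gate, c.gate_falling = 1.0, False  # S54
    elif c.sphase >= c.mk - c.alpha:         # S64
        if c.mk != c.mk_old:                 # S66
            c.gate_falling = True            # S68
            c.mk_old = c.mk                  # S70
    if c.key_on == 0 and c.gate == 0.0:      # S56, S60
        c.running = False                    # S62
        return
    c.sphase += c.tcomp                      # S58
```

In this model syl_flag is 1 for exactly one tick per section head, which is the tick on which the per-band addr values would be corrected and the oscillator phases reset.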
[0064]
As shown in FIG. 10, in the DSP time axis conversion process, it is determined whether the set reproduction mode is 1 or 2 (step S72). In the reproduction mode 1, the addr of the band designated by the j counter is increased by tcomp (step S74), and the addr is transferred as time expansion / contraction information to each time frequency conversion processing unit (step S76).
[0065]
If the playback mode 2 is set, it is determined whether syl_flag is 1, that is, whether the head address of any section is specified (step S78). If the head address of any section is specified, addr is updated to mark (n−1) in order to set the head address of the section to addr (step S80).
[0066]
The reason is as follows. When syl_flag is set to 1 in step S50 of the DSP control main process (when SPHASE reaches the head of a section of the original audio waveform signal), n is advanced by 1. If addr were set to mark (n), it would designate the address storing the amplitude value information and phase information of the head of the section after the one designated by SPHASE. The address storing the amplitude value information and phase information of the head of the section designated by SPHASE is therefore mark (n−1).
[0067]
Thus, when SPHASE designates the head of a section, addr is also forcibly corrected to the address where the amplitude value information and phase information of the head of that section are stored. Regardless of whether compression or expansion is performed, reproduction and synthesis of the new section therefore start from the phase information and amplitude value information of each band for the head sampling value of the section.
[0068]
Next, alt_start, which stores the start address of the loop period, is updated to altS (n-1) of the band designated by the j counter, and alt_end, which stores the end address of the loop period, is updated to altE (n-1) of that band (step S82). This stores the beginning and end of the loop period of the section to which the address designated by addr belongs. Next, dir is set to the forward direction (step S84), and step S76 is executed.
[0069]
If it is determined in step S78 that syl_flag is not 1, SPHASE does not designate the head of a section, so it is determined whether dir is the forward direction (step S86). If it is, addr is increased by a predetermined rate (step S88). As the rate, for example, 1 is used as described above. Therefore, according to the addr transferred in step S76, the amplitude information and the phase information are read out by the time-frequency conversion processing unit at normal speed.
[0070]
It is then determined whether addr is greater than or equal to alt_end (step S90). If it is not, step S76 is executed. If addr is greater than or equal to alt_end, an address at or beyond the end of the loop period has been designated, so addr is corrected to the value obtained by subtracting from alt_end the difference between the current addr and alt_end (addr − alt_end) (step S92). Next, dir is set to the reverse direction (step S94), and step S76 is executed.
[0071]
The calculation in step S92 is performed because the read direction of addr is reversed in step S94: the frequency information and amplitude value information are read from the address that is folded back from alt_end in the reverse direction by the amount addr overshot alt_end in the forward direction.
[0072]
If it is determined in step S86 that dir is the reverse direction, addr is decreased by rate (step S96). It is then determined whether addr is less than or equal to alt_start (step S98). If it is not, step S76 is executed. If addr is less than or equal to alt_start, an address at or before the beginning of the loop period has been designated, so addr is corrected to the value obtained by adding to alt_start the difference between alt_start and the current addr (alt_start − addr) (step S100). Next, dir is set to the forward direction (step S102), and step S76 is executed.
[0073]
The calculation in step S100 is performed because the read direction of addr is changed to the forward direction in step S102: the frequency information and amplitude value information are read from the address that is folded back from alt_start in the forward direction by the amount addr overshot alt_start in the reverse direction.
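The time axis conversion of steps S72 to S102 for a single band can be sketched in Python as follows. This is an illustrative reading rather than the DSP code: the class and function names are hypothetical, the lists are 0-based so that mark (n−1) becomes index n − 2, and the fold-back at the loop start uses alt_start + (alt_start − addr), which is what reflecting the overshoot requires.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Band:
    mark: List[float]        # mark(1)..; mark[0] is mark(1)
    alt_s: List[float]       # altS(1)..
    alt_e: List[float]       # altE(1)..
    addr: float = 0.0
    alt_start: float = 0.0
    alt_end: float = 0.0
    forward: bool = True     # the flag dir

def time_axis_step(b, mode, syl_flag, n, tcomp, rate=1.0):
    """Advance addr for one band on one clock tick and return it (step S76)."""
    if mode == 1:                                   # S72
        b.addr += tcomp                             # S74
        return b.addr
    if syl_flag == 1:                               # S78
        b.addr = b.mark[n - 2]                      # S80: mark(n-1)
        b.alt_start = b.alt_s[n - 2]                # S82
        b.alt_end = b.alt_e[n - 2]
        b.forward = True                            # S84
        return b.addr
    if b.forward:                                   # S86
        b.addr += rate                              # S88
        if b.addr >= b.alt_end:                     # S90
            b.addr = b.alt_end - (b.addr - b.alt_end)      # S92
            b.forward = False                       # S94
    else:
        b.addr -= rate                              # S96
        if b.addr <= b.alt_start:                   # S98
            b.addr = b.alt_start + (b.alt_start - b.addr)  # S100
            b.forward = True                        # S102
    return b.addr
```

In mode 2 the value of tcomp never touches addr directly; it only decides, through SPHASE and syl_flag, when addr is snapped to the head of the next section.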
[0074]
Because the waveform is reproduced in this way, when the time axis of the original audio waveform signal is compressed or expanded, the head of each section of the original audio waveform signal is resynthesized from the phase information and amplitude value information of that head, so the sound quality of the synthesized waveform is improved.
[0075]
In the above embodiment, the playback mode 1 and the playback mode 2 are provided. However, the playback mode 1 is not necessary depending on circumstances. When the reproduction mode 1 is not provided, the waveform memory 2 does not have to store amplitude value information and phase information for the period α in each section.
[0076]
[Effect of the invention]
As described above, according to the present invention, when the first reproduction position information reaches the position of the mark information, the second reproduction position information is corrected to match the first reproduction position information. Time axis compression or expansion of the audio waveform signal is therefore performed based on the phase information and amplitude value information corresponding to the mark information, and subsequent time axis compression or expansion is likewise performed based on the corresponding phase information and amplitude value information, so the sound quality is improved.
[Brief description of the drawings]
FIG. 1 is a block diagram of a waveform reproducing device according to a first embodiment of the present invention.
FIG. 2 is a block diagram of a phase vocoder realized by a DSP in the waveform reproduction apparatus of FIG.
FIG. 3 is a diagram showing frequency information and amplitude value information of each band stored in the waveform memory 2 of the waveform reproducing device of FIG. 1 and an audio waveform signal from which these are generated.
FIG. 4 is a diagram showing various data stored in a waveform memory 2.
FIG. 5 is a flowchart of a main routine executed by the CPU of FIG.
FIG. 6 is a flowchart of a DSP sound generation start processing routine executed by the DSP.
FIG. 7 is a flowchart of a DSP sound generation stop processing routine executed by the DSP.
FIG. 8 is a flowchart of an interrupt processing routine executed by the DSP.
FIG. 9 is a flowchart of a control main process in the interrupt process routine of FIG.
10 is a flowchart of time axis conversion processing in the interrupt processing routine of FIG.
FIG. 11 is a block diagram of a configuration for storing amplitude value information and phase information in the waveform memory of the waveform reproducing device of FIG. 1.
FIG. 12 is a detailed block diagram of the analysis unit in FIG. 11.
FIG. 13 is a detailed block diagram of the time-frequency conversion processing unit in the waveform reproducing device of FIG. 1.
FIG. 14 is a diagram showing phase information and amplitude value information stored as band waveform data in the waveform memory of the waveform reproducing device of FIG. 1.
FIG. 15 is a block diagram of a conventional phase vocoder.
FIG. 16 is a detailed block diagram of the analysis unit of FIG. 15.
FIG. 17 is a diagram illustrating waveform analysis performed by the phase vocoder of FIG. 15.
FIG. 18 is a detailed block diagram of the time-frequency conversion processing unit in FIG. 15 and an explanatory diagram of time expansion and compression in that unit.
[Explanation of symbols]
2 Waveform memory (waveform data storage means)
4 DSP (first reproduction position information generation means, second reproduction position information generation means, waveform signal synthesis means, control means).

Claims

1. A waveform reproducing apparatus comprising:
waveform data storage means for storing one or more pieces of mark information, each indicating a position on the time axis that is a boundary between sections of an original audio waveform signal composed of a plurality of sections, and for storing phase information and amplitude value information of the waveform signal of each divided band, obtained by dividing the original audio waveform signal into a plurality of frequency bands, in correspondence with positions on the time axis of the original audio waveform signal, the waveform signals of the divided bands being waveforms for which mutually different loop sections are set;
first reproduction position information generating means for generating first reproduction position information that represents a position on the time axis of the original audio waveform signal and changes over time at a desired speed;
second reproduction position information generating means for generating, for the waveform signal of each divided band, second reproduction position information that represents a position on the time axis of the phase information and amplitude value information and changes over time at a speed different from the desired speed;
waveform signal synthesizing means for reading the phase information and amplitude value information from the waveform data storage means according to the second reproduction position information, and synthesizing a reproduced audio waveform signal from the read phase information and amplitude value information; and
control means for controlling the second reproduction position information so that, before the position represented by the first reproduction position information reaches the position represented by the mark information in the waveform data storage means, the loop section is read repeatedly for each divided band whose second reproduction position information has reached the end of its loop section, and so that, when the position represented by the first reproduction position information reaches the position represented by the mark information in the waveform data storage means, the second reproduction position information is changed to the second reproduction position information corresponding to the position represented by the mark information.

2. The waveform reproducing apparatus according to claim 1, wherein the waveform signal synthesizing means comprises:
conversion means for converting the phase information read from the waveform data storage means into frequency information;
periodic signal generating means for receiving the converted frequency information and generating a periodic signal having a frequency corresponding to the frequency information, the phase of the generated periodic signal being changed in accordance with the read phase information in response to a change in the second reproduction position information; and
amplitude control means for controlling the amplitude of the periodic signal generated by the periodic signal generating means in accordance with the read amplitude value information.
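As an illustration only, the synthesis path recited in claim 2 (phase to frequency conversion, periodic signal generation, amplitude control) might be sketched for a single band as follows. The function names and the `hold` parameter are hypothetical, the stored phase is assumed to be the band's wrapped phase at successive analysis points, and the phase reload that accompanies a jump of the second reproduction position is omitted.

```python
import math


def phase_to_frequency(phase_now, phase_prev):
    """Convert two successive stored phases into an angular frequency
    (radians per sample), wrapped into [-pi, pi)."""
    delta = phase_now - phase_prev
    return (delta + math.pi) % (2.0 * math.pi) - math.pi


def synthesize_band(phases, amps, hold=1):
    """Synthesize one band from stored phase and amplitude values.

    Each stored frame is held for `hold` output samples, which is a
    crude stand-in (assumption) for time expansion by interpolation.
    """
    out = []
    osc_phase = phases[0]  # the oscillator starts at the stored phase
    for n in range(1, len(phases)):
        omega = phase_to_frequency(phases[n], phases[n - 1])
        for _ in range(hold):
            # amplitude control applied to the cosine oscillator output
            out.append(amps[n - 1] * math.cos(osc_phase))
            osc_phase += omega
    return out
```

Because the oscillator accumulates frequency rather than replaying the stored phase directly, holding each frame for two samples (`hold=2`) doubles the duration while the pitch of the band is unchanged. Summing such outputs over all bands would give the reproduced signal.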