JP3562341B2

JP3562341B2 - Editing method and apparatus for musical sound data

Info

Publication number: JP3562341B2
Application number: JP27583898A
Authority: JP
Inventors: 秀雄鈴木; 真雄坂間
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 1997-09-30
Filing date: 1998-09-29
Publication date: 2004-09-08
Anticipated expiration: 2018-09-29
Also published as: JP2000122663A

Description

【０００１】
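【Technical Field of the Invention】
This invention relates to a technique for synthesizing high-quality tone waveforms that carry articulation, and in particular to a method and apparatus for editing the tone data used for that purpose. It can be applied widely as a tone data editing apparatus and/or method in tone or sound generating equipment of many kinds: not only electronic musical instruments but also game machines, personal computers, and other multimedia equipment.
In this specification, "tone" is not limited to musical sounds. It is used in a broad sense that covers sound in general, including human voices, sound effects, and sounds found in nature.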
【０００２】
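【Prior Art】
In tone generators of the waveform-memory readout type (PCM: pulse code modulation) used in electronic musical instruments and the like, waveform data for one or more periods corresponding to a given timbre is stored in memory, and a sustained tone waveform is generated by reading this data out repeatedly at a readout rate corresponding to the desired pitch of the tone. Alternatively, the data for the whole waveform from the start to the end of a tone is stored in memory and read out at a rate corresponding to the desired pitch, so that one note is produced.
When a PCM tone generator of this kind is to do more than read the stored waveform out as it is, that is, when the generated tone is to be given some expressiveness by modifying it, control has conventionally been applied to three categories of tone element: pitch, volume, and timbre. For pitch, the readout rate is modulated according to an arbitrary pitch envelope to add pitch-modulation effects such as vibrato or attack pitch. For volume, an amplitude envelope following a required envelope waveform is applied to the waveform data read out, or the amplitude of the data is modulated periodically to add effects such as tremolo. For timbre, the waveform data read out is filtered to give appropriate timbre control.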
【０００３】
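Multitrack sequencers are also known in which a continuous live-performed sound (a phrase) is sampled as a whole and pasted (recorded) onto one recording track, and the phrase waveforms pasted onto several tracks in this way are reproduced together with automatic performance tones based on separately recorded sequence performance data.
Recording the complete tone waveform data of one live-performed piece as PCM data and simply reproducing it is well known as the music recording method of the CD (compact disc).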
【０００４】
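【Problems to Be Solved by the Invention】
When a skilled player of a natural instrument such as a piano, violin, or saxophone performs a series of musical phrases on that instrument, the performed sound is not uniform even though it all comes from the same instrument. Each note, each connection between notes, and each part of a note such as its rising part, sustained part, or falling part is played with a subtly different "articulation", depending on the mood of the piece or the player's sensibility. The presence of such "articulation" is what gives the listener the impression of a truly good sound.
A method that records a skilled player's entire performance as PCM waveform data, as in CD music recording, allows realistic, high-quality reproduction of the live performance, and so reproduces the "articulation" exactly as the player performed it. However, it can only serve as a playback device for a fixed piece (the piece as recorded). It therefore cannot be used as an interactive tone-making technique that lets users of electronic musical instruments or multimedia equipment create and edit sounds freely.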
【０００５】
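By contrast, the PCM tone generator technique known from electronic musical instruments does, as described above, let the user create sounds and gives the generated tone a certain degree of expressiveness. In both sound quality and expressiveness, however, it has been insufficient for realizing natural "articulation". For example, the waveform data stored in memory in this kind of PCM tone generator is generally just a sampled single note played on a natural instrument, which limits the quality of the generated tone. In particular, the articulation or playing style of the connection between notes during a performance could not be expressed with high quality. In a slur, for instance, where one note changes smoothly into the next, conventional electronic instruments rely merely on smoothly changing the rate at which waveform data is read from memory or on controlling the volume envelope applied to the generated sound, and could not achieve an articulation or playing style with a quality comparable to a live performance on a natural instrument. Moreover, even a note of the same pitch on the same instrument can show a different articulation in its rising part and elsewhere, depending on the phrase or, for the same phrase, on the occasion of the performance. Expressing such subtle differences in articulation was also beyond the known PCM tone generator technique.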
【０００６】
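Control of the generated tone in response to performance expression has also been comparatively monotonous in conventional electronic instruments, and could not be called sufficient. Tone control responding to the touch on a key or the like is known, for example, but it goes no further than controlling the volume change characteristic or the timbre filter characteristic according to touch. It was not possible, for instance, to control tone characteristics freely for each partial section of the whole sounding period from the rise to the fall of a tone. As for timbre control, once a timbre was selected before the performance, the waveform data for that timbre was read from memory, and during sounding that waveform data was merely varied by a filter or the like in response to performance expression, so the timbre change in response to expression was insufficient. In addition, control envelope waveforms for pitch, volume, and so on were set and controlled in shape with the whole envelope from rise to fall as one unit, and operations such as swapping part of an envelope could not be performed freely.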
【０００７】
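In a system such as the multitrack sequencer mentioned above, live-performed phrase waveform data was merely pasted in, so partial editing of a phrase waveform (partial replacement, characteristic control, and so on) was not possible at all. This, too, could not be used as an interactive tone-making technique that lets users of electronic musical instruments or multimedia equipment create sounds freely.
Furthermore, not only musical performance sounds but also ordinary sounds found in nature contain abundant, delicate "articulation" as they evolve over time, yet conventional techniques could not reproduce the "articulation" of natural sounds skillfully and controllably.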
【０００８】
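This invention was made in view of the above points. Its object is, for the case where tones (which, as stated, include ordinary sounds as well as musical ones) are generated with an electronic musical instrument or other electronic apparatus, to realize realistic reproduction of "articulation" and make its control easy, to provide an interactive high-quality tone-making technique that lets users of electronic musical instruments, multimedia equipment, and the like create and edit sounds freely, and to provide a novel tone data editing method and apparatus based on that technique.
In this specification the word "articulation" is used in its commonly known meaning, as a broad concept that includes all of the following: "syllable", "connection between notes", "a group of several notes (phrase)", "partial characteristic of a sound", "manner of sound production", "playing style", and "performance expression".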
【０００９】
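【Means for Solving the Problems】
The data search method according to claim 1 is a data search method for a tone database that comprises a first database section and a second database section. For each of a plurality of performance phrases accompanied by musical articulation, the one or more notes making up the phrase are divided into a plurality of partial time sections, and the first database section stores an articulation element sequence that designates, in order, the articulation element for each partial time section. The second database section stores template data representing the partial sound waveforms that correspond to various articulation elements. The first database section stores, attached to each articulation element sequence, attribute information indicating its characteristics. The method is characterized by comprising a first step of designating attribute information according to a desired playing style, and a second step of searching the first database section, using the designated attribute information, for an articulation element sequence that matches the desired playing style.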
【００１０】
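With this arrangement, when editing tone data the user can designate a desired playing style and search for whether an articulation element sequence matching it exists. This has the excellent effect that the user can find out whether the desired playing style is available in the tone database. If the desired style is available, the user can read it out, check its content, and, if it differs from what is wanted, modify it as appropriate to create the desired tone data. If the desired style is not available, the user can instead create new tone data for that style and register it in the tone database. Editing can thus proceed in a form that matches the feel of performing, namely by designating a desired playing style, which makes the method easy for the user to use.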
【００１２】
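The method may further comprise a fifth step of setting, in the articulation element sequence edited by the fourth step, the manner in which the template data of the edited articulation element and of each adjacent articulation element are connected. An editing operation changes the content of the template data that represents the partial sound waveform of the edited articulation element, so the continuity of template data with the adjacent articulation elements may be impaired. By setting (that is, redefining) the manner of connection between adjacent template data in the fifth step, adjacent template data can be joined smoothly. There are several ways of connecting template data smoothly, and an appropriate one can be selected and set from among them.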
【００１３】
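The tone data creation and tone synthesis technique of this invention analyzes the articulation of sound and performs tone editing and synthesis in units of articulation elements, thereby modeling the articulation of sound for tone synthesis. This technique is therefore called SAEM (Sound Articulation Element Modeling).
The invention can be configured and carried out not only as a method but also as an apparatus. It can also be carried out in the form of a computer program, or of a recording medium storing such a program. It can further be carried out in the form of a recording medium storing waveform or tone data having the novel data structure.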
【００１４】
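【Embodiments of the Invention】
Embodiments of the invention are described in detail below with reference to the accompanying drawings.
〔Example of Creating the Tone Database〕
As stated earlier, when a skilled player of a natural instrument such as a piano, violin, or saxophone performs a series of musical phrases on that instrument, the performed sound is not uniform even though it all comes from the same instrument. Each note, each connection between notes, and each part of a note such as its rising part, sustained part, or falling part is played with a subtly different "articulation", depending on the mood of the piece or the player's sensibility. The presence of such "articulation" is what gives the listener the impression of a truly good sound.
In instrumental performance, "articulation" generally appears as a reflection of the player's "playing style" or "performance expression". Note in advance, therefore, that in the description below "playing style" or "performance expression" and "articulation" are sometimes used with substantially the same meaning. Playing styles include staccato, tenuto, slur, vibrato, tremolo, crescendo, decrescendo, and many others. When a player performs a series of phrases on an instrument, various playing styles are used at each stage of the performance, following the score or the player's own sensibility, and each produces its own "articulation".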
【００１５】
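Fig. 1 shows an example of the procedure for creating a tone database according to the invention.
The first step S1 samples a series of performed sounds consisting of one or more tones. Here, for example, a skilled player of a particular natural instrument performs a predetermined series of musical phrases on it. The performed sound is picked up with a microphone and sampled at a predetermined sampling frequency to obtain PCM-coded waveform data for the whole performance phrase. This waveform data is musically excellent, high-quality data. For explanation, Fig. 2(a) shows an example score of a series of phrases performed for the sampling in step S1. The "playing style marks" added above the score in Fig. 2(a) illustrate the styles in which the phrase is played. A score carrying such marks is not essential for the sampling in step S1. The player may perform the phrase from an ordinary score, and the playing style at each stage over time may then be judged by analyzing the sampled waveform data, from which a score with playing style marks is produced. As explained later, such a score is less an aid to the sampling in step S1 than a considerable help to ordinary users when they draw desired data from the database built from the sampled data and connect them to create a desired performance sound. Nevertheless, to illustrate how the phrase in the score of Fig. 2(a) was played, the meaning of the playing style marks shown there is explained here.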
【００１６】
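The black-dot marks drawn for the three notes in the first measure indicate "staccato", and the size of each dot indicates the volume level.
The mark drawn with the text "Atack-Mid, No-Vib" for the next note describes the style "medium attack, no vibrato".
The mark drawn with the text "Atk-Fast, Vib-Soon-Fast, Release-Smoothly" for the slurred notes in the second half of the second measure describes the style "attack rises quickly, vibrato speeds up soon, release is smooth".
The elliptical black marks in the third measure indicate "tenuto". The third measure also carries a mark indicating a gradual decrease in volume and a mark instructing vibrato at the end of the note.
As this shows, even a phrase only about three measures long uses a variety of playing styles or performance expressions, that is, articulations.
The playing style marks need not be notated in this way; any form that can express a playing style will do. Marks expressing playing styles to some extent are used in conventional score notation as well, but in carrying out this invention it is desirable to adopt playing style marks more precise than any used before.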
【００１７】
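In Fig. 1, the next step S2 divides the sampled series of performed sounds into a plurality of time sections of variable length according to the characteristics of the performance expression (that is, the articulation). This is entirely different from dividing and analyzing waveform data in regular, fixed time frames as known from Fourier analysis, for example. The articulations present in the sampled performance are diverse, so the time range of the sound corresponding to each articulation is not of uniform length but of arbitrary, variable length. Consequently, when the sampled performance is divided into time sections according to its articulation, the resulting sections have variable lengths.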
【００１８】
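Fig. 2(b), (c), and (d) illustrate such a division into time sections hierarchically. Fig. 2(b) shows a division into comparatively large blocks of articulation (called "larger articulation units" for convenience and denoted AL#1, AL#2, AL#3, AL#4). A larger articulation unit may, for example, be delimited as a small unit of phrasing in which the broad performance expression is common. Fig. 2(c) shows one larger articulation unit (AL#3 in the figure) divided further into intermediate articulation units (denoted AM#1, AM#2). The intermediate units AM#1 and AM#2 are delimited, for example, roughly by single note. Fig. 2(d) shows the intermediate units (AM#1, AM#2 in the figure) divided further into smallest articulation units (denoted AS#1 to AS#8). The smallest units AS#1 to AS#8 correspond to parts of a note where the performance expression differs, typically the attack part, the body part (a comparatively stable part showing the steady characteristics of the note), the release part, and the part connecting one note to the next.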
【００１９】
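For example, AS#1, AS#2, and AS#3 correspond respectively to the attack part, first body part, and second body part of the note forming intermediate unit AM#1 (the preceding note of the slur), and AS#5, AS#6, AS#7, and AS#8 correspond respectively to the first body part, second body part, third body part, and release part of the note forming the next intermediate unit AM#2 (the following note of the slur). There are several body parts, such as a first and a second, because the articulation can differ even within the body of one note (for example, the vibrato speed may change), and the division accommodates such cases. AS#4 corresponds to the note-to-note connecting part produced by the slur. Depending on how the two intermediate units AM#1 and AM#2 are cut, AS#4 may be taken from either of them (the end of AM#1 or the beginning of AM#2). Alternatively, the slur connecting part AS#4 may be extracted as an intermediate unit from the outset. In that case the larger unit AL#3 is divided into three intermediate units, and the middle one, the note-to-note connecting part, corresponds directly to the smallest unit AS#4. When the slur connecting part AS#4 is extracted on its own in this way, it can also be used to connect other pairs of notes, so that those notes are joined by a slur.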
【００２０】
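The smallest articulation units AS#1 to AS#8 shown in Fig. 2(d) are the time sections produced by the division in step S2. Below, such a smallest articulation unit is also called an articulation element. The way smallest units are divided is not limited to the above example, so a smallest articulation unit, that is, an articulation element, does not necessarily correspond only to a part of a note.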
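The hierarchy of Fig. 2(b) to (d) can be pictured as a small tree whose leaves are the articulation elements. The following is only an illustrative sketch: the `Segment` class and all boundary times are assumptions made for the example, not values from the specification. The point it shows is that the leaves have unequal, articulation-dependent lengths.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    label: str      # e.g. "AL#3", "AM#1", "AS#4"
    start: float    # seconds from the start of the sampled phrase (hypothetical)
    end: float
    children: list = field(default_factory=list)

    def elements(self):
        """Yield the leaf segments (articulation elements) in time order."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.elements()


# Hypothetical boundaries for larger unit AL#3 of Fig. 2.
al3 = Segment("AL#3", 0.0, 2.0, [
    Segment("AM#1", 0.0, 1.0, [
        Segment("AS#1", 0.0, 0.12), Segment("AS#2", 0.12, 0.55),
        Segment("AS#3", 0.55, 0.9), Segment("AS#4", 0.9, 1.0)]),
    Segment("AM#2", 1.0, 2.0, [
        Segment("AS#5", 1.0, 1.3), Segment("AS#6", 1.3, 1.6),
        Segment("AS#7", 1.6, 1.85), Segment("AS#8", 1.85, 2.0)]),
])
leaf_labels = [s.label for s in al3.elements()]
leaf_lengths = [round(s.end - s.start, 2) for s in al3.elements()]
```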
【００２１】
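In Fig. 1, the next step S3 analyzes the waveform data of each divided time section (smallest articulation units AS#1 to AS#8, that is, the articulation elements) with respect to a predetermined set of tone elements and generates data indicating the characteristics of each analyzed element. The tone elements analyzed include, for example, waveform (timbre), amplitude (volume), pitch, and time. These tone elements are constituent elements of the waveform data in the time section and, at the same time, constituent elements of the articulation in that section.
In the next step S4, the generated data indicating the characteristics of each element is accumulated in a database. The database makes the accumulated data available as template data for tone synthesis.
An example of how these tone elements are analyzed is given below, and Fig. 3 shows examples of the data indicating the characteristics of each tone element (the template data). Fig. 2(e) also illustrates the kinds of tone element analyzed from one smallest articulation unit.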
【００２２】
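(1) For the waveform (timbre) element, the original PCM waveform data of the time section (articulation element) is extracted as it is and stored in the database as a waveform template (Timbre template). The symbol "Timbre" denotes this element.
(2) For the amplitude (volume) element, the volume envelope (the change in amplitude over time) of the original PCM waveform data of the time section is extracted to obtain amplitude envelope data, which is stored in the database as an amplitude template (Amp template). The symbol "Amp" (short for Amplitude) denotes this element.
(3) For the pitch element, the pitch envelope (the change in pitch over time) of the original PCM waveform data of the time section is extracted to obtain pitch envelope data, which is stored in the database as a pitch template (Pitch template). The symbol "Pitch" denotes this element.
【００２３】
(4) For the time element, the time length of the original PCM waveform data of the time section is used as it is. If the original time length of the section (a variable value) is expressed as the ratio "1", there is no need to analyze or measure it when the database is created. In that case the data for the time element, the time template (TSC template), has the same value "1" for every section (articulation element), so it need not be stored in the template database. Of course, a variation is also possible in which the actual time length is analyzed and measured and stored in the database as time template data.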
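The specification does not say how the amplitude envelope is extracted. As one possible illustration, the sketch below takes the peak absolute sample value in consecutive frames; the frame-peak method, the function name `amplitude_envelope`, and the test signal are all assumptions made for this example.

```python
import math


def amplitude_envelope(samples, frame):
    """Return the peak absolute value of each consecutive frame of samples."""
    return [max(abs(x) for x in samples[i:i + frame])
            for i in range(0, len(samples), frame)]


# Hypothetical input: one second of a decaying 440 Hz sine at 8 kHz.
sr = 8000
pcm = [math.exp(-3.0 * n / sr) * math.sin(2 * math.pi * 440 * n / sr)
       for n in range(sr)]

# 400-sample frames (50 ms) give a 20-point Amp template candidate.
amp_template = amplitude_envelope(pcm, frame=400)
```

A pitch envelope could be tabulated frame by frame in the same way once a pitch estimator has been chosen; that choice is likewise left open by the text.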
【００２４】
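As a technique for varying the original time length of waveform data, the present inventors have already proposed, though it is not yet published, a control that stretches or compresses waveform data along the time axis without affecting its pitch, called "Time Stretch & Compress" control ("TSC control" for short). This embodiment uses such TSC control, and the symbol TSC used for the time element is that abbreviation. During tone synthesis, the time length of the reproduced waveform signal can be varied by setting the TSC value to some other appropriate value instead of fixing it at "1". The TSC value may also be given as a time-varying value (for example an envelope or other suitable function of time). TSC control is useful, for instance, for freely varying the time length of a portion of the original waveform to which a special playing style such as vibrato or slur has been applied.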
【００２５】
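The processing described above is carried out for various natural instruments and in various playing styles (for various phrases), templates for each tone element are created for a large number of articulation elements per instrument, and these are accumulated in the database. The sampling and articulation analysis may also be applied not only to natural instruments but to various sounds found in nature, such as the human voice or thunder, and the resulting templates for each element accumulated in the database. The phrase performed live for sampling is of course not limited to one of several measures as in the above example. It may be shorter if need be (for example a single small phrasing unit as in Fig. 2(b)), or, conversely, a whole piece.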
【００２６】
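As shown in Fig. 4, for example, the database DB is broadly divided into a template database TDB and an articulation database ADB. As hardware for the database DB, a readable and writable storage medium such as a hard disk drive or magneto-optical disk drive (preferably a large-capacity medium) is used, as is well known.
The template database TDB accumulates the many template data created as described above. Not all template data in the TDB need be based on sampling and analysis of performed or natural sounds. It suffices that they are prepared in advance as templates (ready-made data), and they may be created artificially by data editing. For example, a TSC template for the time element is normally "1" as described above so long as it is based on sampled performance sound, but it can be created with any variation pattern (envelope). Various TSC values, or envelope waveforms of their variation over time, may therefore be created as TSC template data and stored in the database. The kinds of template stored in the TDB are also not limited to those corresponding to specific elements analyzed from original waveforms, and other kinds may be added for convenience in tone synthesis. For example, when a filter is used for timbre control during synthesis, many sets of filter coefficients (including time-varying sets) may be prepared as template data and stored in the TDB. Such coefficient sets may be created from analysis of original waveforms or by any other suitable means.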
【００２７】
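Each template stored in the template database TDB consists of data representing the template content itself, as illustrated in Fig. 3. A waveform (Timbre) template, for example, is the PCM waveform data itself. Envelope waveforms such as the amplitude (Amp) envelope, pitch (Pitch) envelope, and TSC envelope may likewise be stored as PCM-coded envelope shapes. To compress the storage of envelope-shaped templates in the TDB, however, they may instead be stored as parameter data for a polyline approximation of the envelope waveform (as is well known, a set of data indicating the slope rate of each line segment and its target level, time, or the like).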
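To make the storage saving concrete, here is a minimal sketch of expanding polyline parameters back into an envelope. It assumes one particular parameterization, a list of (rate, target level) pairs where the rate is the absolute level change per step; the specification only says that the data set indicates slope rate and target level or time, so the function `render_polyline` and its conventions are illustrative.

```python
def render_polyline(start_level, segments):
    """Expand (rate, target) pairs into per-step envelope values.

    Each segment moves the level toward `target` by `rate` per step and
    ends exactly on the target.
    """
    out = [start_level]
    level = start_level
    for rate, target in segments:
        if rate <= 0:
            raise ValueError("rate must be positive")
        while level != target:
            remaining = target - level
            if abs(remaining) <= rate:
                level = target          # land exactly on the target level
            else:
                level += rate if remaining > 0 else -rate
            out.append(level)
    return out


# Three parameter pairs stand in for an 11-point envelope.
env = render_polyline(0.0, [(0.25, 1.0), (0.125, 0.5), (0.25, 0.0)])
```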
【００２８】
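The waveform (Timbre) template may also be stored in a suitable compressed form other than PCM waveform data, or in some other suitable data format. That is, Timbre template data may be waveform data in a compressed coding format other than PCM, such as DPCM or ADPCM, or it may consist of waveform-forming data that does not directly give waveform sample values, namely parameters for waveform synthesis. Various parameter-based synthesis methods are known, including Fourier synthesis, FM (frequency modulation) synthesis, AM (amplitude modulation) synthesis, physical-model tone generation, and SMS waveform synthesis (which synthesizes a waveform from deterministic and stochastic components). Any of these may be adopted and its synthesis parameters stored in the database as Timbre template data. In that case the waveform is naturally formed from the Timbre template data, that is, the synthesis parameters, by a corresponding arithmetic unit or program. Several sets of synthesis parameters for forming waveforms of desired shapes may also be stored for one articulation element (one time section), and the set used for synthesis switched as time passes, so as to realize time variation of the waveform shape within the articulation element.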
【００２９】
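Even when the Timbre template is stored as PCM waveform data, if the known loop readout technique can be used (for example for waveform data of a part such as the body, where the timbre waveform is stable and changes little over time), only part of the waveform of the section need be stored rather than all of it. Also, when the template data obtained by sampling and analysis for different time sections (articulation elements) are identical or similar, the storage of the TDB can be saved by storing just one of them and sharing it during tone synthesis instead of storing each. The TDB may further comprise a preset area created in advance by the supplier of the basic database (for example an electronic instrument manufacturer) and a user area in which the user can freely add templates.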
【００３０】
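To construct performances containing one or more articulations, the articulation database ADB stores data describing articulation (that is, data describing a series of performance by a combination of one or more articulation elements, and data describing each articulation element) for a variety of performance cases and playing styles.
The block in Fig. 4 illustrates the database structure for one instrument sound named "Instrument 1". An articulation element sequence AESEQ describes a performance phrase containing one or more articulations (an articulation performance phrase) in the form of sequence data that designates one or more articulation elements in order. This sequence corresponds, for example, to the time-series order of the smallest articulation units (articulation elements) shown in Fig. 2(d) as analyzed in the sampling and analysis process. Many sequences AESEQ are stored so as to cover the various playing styles possible when playing that instrument sound. One sequence AESEQ may be one of the "small units of phrasing" shown in Fig. 2(b) (larger articulation units AL#1, AL#2, AL#3, AL#4), or several of them, or one of the "intermediate articulation units" shown in Fig. 2(c) (AM#1, AM#2), or may correspond to several of those.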
【００３１】
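The articulation element vector AEVQ stores, for that instrument sound (Instrument 1), the indices of the template data for each tone element of every articulation element prepared (accumulated) in the template database TDB, in the form of vector data designating the individual templates (for example as address data for retrieving the required template from the TDB). As in the example of Fig. 2(d) and (e), for an articulation element AS#1 it stores vector data (called an element vector) that specifically designates the four templates Timbre, Amp, Pitch, and TSC for the elements (waveform, amplitude, pitch, time) that make up the partial tone corresponding to that articulation element.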
【００３２】
In one articulation element sequence (playing style sequence) AESEQ, the indices of several articulation elements are written in performance order, and the set of templates making up each element written there can be retrieved by referring to the articulation element vector AEVQ.
Fig. 5(a) shows examples of articulation element sequences AESEQ#1 to AESEQ#7. To explain how the figure is read: AESEQ#1 = (ATT-Nor, BOD-Vib-nor, BOD-Vib-dep1, BOD-Vib-dep2, REL-Nor) means that the sequence with sequence number 1 consists of the five articulation elements ATT-Nor, BOD-Vib-nor, BOD-Vib-dep1, BOD-Vib-dep2, and REL-Nor. The index symbols of these elements have the following meanings.
【００３３】
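ATT-Nor denotes "normal attack" (a style in which the attack part rises in the standard way).
BOD-Vib-nor denotes "body normal vibrato" (a style in which standard vibrato is applied to the body part).
BOD-Vib-dep1 denotes "body vibrato depth 1" (a style in which vibrato one step deeper than standard is applied to the body part).
BOD-Vib-dep2 denotes "body vibrato depth 2" (a style in which vibrato two steps deeper than standard is applied to the body part).
REL-Nor denotes "normal release" (a style in which the release part falls in the standard way).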
【００３４】
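Sequence AESEQ#1 therefore consists of an articulation that begins with a normal attack, applies normal vibrato at first in the body, then deepens the vibrato slightly, then deepens it further, and finally shows a standard fall of the sound in the release.
The articulations of the other illustrated sequences AESEQ#2 to AESEQ#7 can likewise be understood from the symbolic notation of their articulation elements in Fig. 5(a).
For reference, the meanings of some of the other articulation element symbols shown in Fig. 5(a) are as follows.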
【００３５】
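BOD-Vib-spd1 denotes "body vibrato speed 1" (a style in which vibrato one step faster than standard is applied to the body part).
BOD-Vib-spd2 denotes "body vibrato speed 2" (a style in which vibrato two steps faster than standard is applied to the body part).
BOD-Vib-d&s1 denotes "body vibrato depth & speed 1" (a style in which the depth and the speed of the vibrato applied to the body part are each raised one step above standard).
BOD-Vib-bri denotes "body vibrato brilliant" (a style in which vibrato is applied to the body part and its timbre is made brilliant).
BOD-Vib-mld1 denotes "body vibrato mild 1" (a style in which vibrato is applied to the body part and its timbre is made slightly mild).
BOD-Cre-nor denotes "body normal crescendo" (a style in which a standard crescendo is applied to the body part).
BOD-Cre-vol1 denotes "body crescendo volume 1" (a style in which the volume of the crescendo applied to the body part is raised one step).
ATT-Bup-nor denotes "attack bend-up normal" (a style in which the pitch of the attack part is bent up with standard depth and speed).
REL-Bdw-nor denotes "release bend-down normal" (a style in which the pitch of the release part is bent down with standard depth and speed).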
【００３６】
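Sequence AESEQ#2 therefore corresponds to an articulation (playing style) that begins with a normal attack, applies normal vibrato at first in the body, then makes the vibrato slightly faster, then faster still, and finally shows a standard fall of the sound in the release.
Sequence AESEQ#3 corresponds to an articulation in which the vibrato gradually deepens and its speed gradually increases.
Sequence AESEQ#4 corresponds to an articulation in which the sound quality (timbre) of the waveform changes during vibrato.
Sequence AESEQ#5 corresponds to an articulation with a crescendo.
Sequence AESEQ#6 corresponds to an articulation in which the pitch of the attack part bends up (the pitch rises gradually).
Sequence AESEQ#7 corresponds to an articulation in which the pitch of the release part bends down (the pitch falls gradually).
Many more kinds of articulation element sequence (playing style sequence) are possible beyond the above, but they are not illustrated in detail.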
【００３７】
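Fig. 5(b) shows example structures of the articulation element vector AEVQ for several articulation elements. To explain how the figure is read: the vector data designating the template for each element is written inside the parentheses, and the leading symbol of each vector indicates the kind of template. Timb indicates a waveform (Timbre) template, Amp an amplitude (Amp) template, Pit a pitch (Pitch) template, and TSC a time (TSC) template.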
【００３８】
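For example, ATT-Nor = (Timb-A-nor, Amp-A-nor, Pit-A-nor, TSC-A-nor) means that the articulation element ATT-Nor, which has the meaning "normal attack", is synthesized from four templates: Timb-A-nor (the standard waveform template for the attack part), Amp-A-nor (the standard amplitude template for the attack part), Pit-A-nor (the standard pitch template for the attack part), and TSC-A-nor (the standard TSC template for the attack part).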
【００３９】
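As another example, the articulation element BOD-Vib-dep1, meaning "body vibrato depth 1", is synthesized from four templates: Timb-B-vib (the waveform template for vibrato in the body part), Amp-B-dp3 (the amplitude template for vibrato depth 3 in the body part), Pit-B-dp3 (the pitch template for vibrato depth 3 in the body part), and TSC-B-vib (the TSC template for vibrato in the body part).
As a further example, the articulation element REL-Bdw-nor, meaning "release bend-down normal", is synthesized from four templates: Timb-R-bdw (the waveform template for bend-down in the release part), Amp-R-bdw (the amplitude template for bend-down in the release part), Pit-R-bdw (the pitch template for bend-down in the release part), and TSC-R-bdw (the TSC template for bend-down in the release part).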
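The two-level lookup described here, from a sequence AESEQ to element indices and from each index to its element vector in AEVQ, can be sketched with plain dictionaries. The dictionary layout and the function `resolve` are assumptions for illustration; the identifiers are the ones quoted in the text, and elements whose vectors the text does not list are simply left out of the table.

```python
# Sequence table: sequence number -> element indices in performance order.
AESEQ = {
    "AESEQ#1": ["ATT-Nor", "BOD-Vib-nor", "BOD-Vib-dep1", "BOD-Vib-dep2", "REL-Nor"],
}

# Element vector table: element index -> template index per tone element.
AEVQ = {
    "ATT-Nor": {"Timbre": "Timb-A-nor", "Amp": "Amp-A-nor",
                "Pitch": "Pit-A-nor", "TSC": "TSC-A-nor"},
    "BOD-Vib-dep1": {"Timbre": "Timb-B-vib", "Amp": "Amp-B-dp3",
                     "Pitch": "Pit-B-dp3", "TSC": "TSC-B-vib"},
    "REL-Bdw-nor": {"Timbre": "Timb-R-bdw", "Amp": "Amp-R-bdw",
                    "Pitch": "Pit-R-bdw", "TSC": "TSC-R-bdw"},
}


def resolve(seq_id):
    """Return (element index, element vector or None) pairs in performance order."""
    return [(ae, AEVQ.get(ae)) for ae in AESEQ[seq_id]]


resolved = resolve("AESEQ#1")
```

Each template index in a resolved vector would then be used as the key (or address) for fetching the actual template data from the TDB.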
【００４０】
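To make articulation editing easier, attribute information ATR that outlines the characteristics of each articulation element sequence should be stored attached to each sequence AESEQ. Likewise, attribute information ATR that outlines the characteristics of each articulation element should be stored attached to each articulation element vector AEVQ.
In short, such attribute information ATR describes the characteristics of each articulation element (a smallest articulation unit as shown in Fig. 2(d)). Taking the articulation elements related to the attack part as an example, Fig. 6 shows the symbols (indices) of those elements, the content of their attribute information ATR, and the vector data designating the template for each tone element.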
【００４１】
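In the example of Fig. 6 the attribute information ATR is also managed hierarchically. All articulation elements related to the attack part are given the common attribute "attack". Of these, the standard element is additionally given the attribute "normal", elements to which the bend-up style applies are given the attribute "bend-up", and elements to which the bend-down style applies are given the attribute "bend-down". Among the bend-up elements, the standard one is given the attribute "normal", one with a bend shallower than standard is given "depth: shallow", one with a deeper bend "depth: deep", one with a slower bend "speed: slow", and one with a faster bend "speed: fast". Although not shown, the bend-down elements are likewise given further subdivided attributes.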
【００４２】
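Fig. 6 also shows that some template data are shared between different articulation elements. In Fig. 6 the vector data of the four kinds of template (in other words, the template indices) written in the column for each playing style index (articulation element index) are the vector data designating the templates that form the partial sound of that articulation element, and they are read in the same way as Fig. 5(b). For elements with the bend-up attribute, an entry marked with "=" means that the same template as in the normal case is used. For example, the waveform (Timbre) templates for the bend-up style all use the same template as bend-up normal, Timb-A-bup. The amplitude (Amp) templates for the bend-up style likewise all use the bend-up normal template Amp-A-bup. This is because, even when the bend-up style varies slightly, using a common waveform and amplitude envelope without change does no harm to the sound quality. The pitch (Pitch) template, by contrast, must differ according to the depth of the bend-up. For example, the articulation element ATT-Bup-dp1, which has the attribute "depth: shallow", uses the vector data Pit-A-dp1 to designate the corresponding pitch template (the pitch envelope template for a shallow bend-up characteristic).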
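The "=" convention of Fig. 6 amounts to falling back to the normal element's template. A minimal sketch follows, assuming a dictionary representation and a helper `expand` that are not part of the specification. Timb-A-bup, Amp-A-bup, and Pit-A-dp1 are quoted from the text; the Pitch and TSC names of the normal entry are placeholders invented for the example.

```python
# Bend-up normal entry. "Pit-A-bup" and "TSC-A-bup" are assumed names.
BEND_UP_NORMAL = {"Timbre": "Timb-A-bup", "Amp": "Amp-A-bup",
                  "Pitch": "Pit-A-bup", "TSC": "TSC-A-bup"}

# Variant entries as they might appear in the table, with "=" for shared templates.
VARIANTS = {
    "ATT-Bup-dp1": {"Timbre": "=", "Amp": "=", "Pitch": "Pit-A-dp1", "TSC": "="},
}


def expand(entry, normal=BEND_UP_NORMAL):
    """Replace every '=' with the template used by the normal element."""
    return {kind: normal[kind] if index == "=" else index
            for kind, index in entry.items()}


shallow = expand(VARIANTS["ATT-Bup-dp1"])
```

Only the pitch template differs for the shallow bend-up, so only one new template has to be stored for it.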
【００４３】
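Sharing template data in this way saves storage in the template database TDB. It also means that a live performance need not be recorded for every playing style when the database is created.
Referring to Fig. 6, it can be seen that the speed of the bend-up style is adjusted by using different time (TSC) templates. The speed of a pitch bend corresponds to the time taken to go from a given initial pitch to the target pitch. So if the original waveform data has a given pitch bend characteristic (bending from a given initial pitch to a target pitch within a certain time), varying the time length of that original waveform data by TSC control adjusts the time taken to go from the initial pitch to the target pitch, that is, the speed of the bend. Variable control of waveform time length by time (TSC) templates is suited to adjusting the speed of various playing styles, such as the speed of a tone's rise, of a slur, or of vibrato. The pitch change in a slur, for example, can also be realized with a pitch (Pitch) template, but TSC control using a time (TSC) template gives a more natural slur.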
【００４４】
The articulation element vector AEVQ in the articulation database ADB can of course be addressed by articulation element index, and it can also be addressed by attribute information ATR. By searching the ADB with desired attribute information ATR as a keyword, the user can find out which articulation elements have the matching attribute, which is convenient for data editing. Such attribute information ATR should also be attached to the articulation element sequences AESEQ. Searching the ADB with desired attribute information ATR as a keyword then finds the sequences AESEQ that contain an articulation element with the matching attribute.
The articulation element index used to address the vector AEVQ in the ADB is of course supplied as the sequence AESEQ is read out, but for editing work or for free real-time sound creation it is desirable that a desired articulation element index can also be entered on its own as an address.
【００４５】
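The articulation database ADB also has an area for storing user articulation element sequences URSEQ, so that the user can create desired articulation element sequences and save them. Articulation element vector data created by the user may also be stored in this user area.
The ADB stores partial vectors PVQ as vector data subordinate to the articulation element vector AEVQ. When the template data designated by the AEVQ is stored in the template database TDB not as data for the whole time section of the articulation element but as data for part of it, that partial template data is read out in a loop (repeatedly) to reproduce data for the whole time section of the element. The data needed for such loop readout is stored as the partial vector PVQ. In that case, for example, the AEVQ stores, in addition to the template data designations above, data designating a partial vector PVQ. The PVQ data is read out according to this designation, and the loop readout is controlled by it. The partial vector PVQ therefore contains data designating the loop start address, loop end address, and so on needed for loop readout control.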
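Loop readout from a partially stored template can be sketched as follows. This is a bare illustration under assumptions: the function `loop_read`, the use of list indices as addresses, and the convention that the loop end address is exclusive are all choices made for the example, not details given in the text.

```python
def loop_read(template, loop_start, loop_end, length):
    """Read `length` samples, jumping back to loop_start on reaching loop_end.

    loop_start and loop_end play the role of the loop start and end
    addresses held in a partial vector PVQ (loop_end is exclusive here).
    """
    out = []
    pos = 0
    for _ in range(length):
        out.append(template[pos])
        pos += 1
        if pos >= loop_end:
            pos = loop_start
    return out


# A 4-sample stored fragment stretched to cover 9 samples of the element.
stored = [10, 11, 12, 13]
full = loop_read(stored, loop_start=1, loop_end=4, length=9)
```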
【００４６】
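The articulation database ADB further stores rule data RULE describing the rules for connecting the waveform data of temporally adjacent articulation elements during tone synthesis. For example, it stores, for each sequence or for each articulation element within a sequence, rules such as whether temporally adjacent elements are to be connected smoothly by crossfade interpolation of their waveforms or connected directly without crossfade interpolation, and, when crossfade waveform interpolation is performed, which crossfade method is to be used. These connection rules can also be edited by the user.
An articulation database with the data structure described above by way of example is provided in the ADB for each instrument sound (natural instrument timbre), for each kind of human voice (young female voice, young male voice, baritone, soprano, and so on), for each kind of natural sound (thunder, waves, and so on), and so forth.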
【００４７】
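〔Outline of Tone Synthesis〕
Fig. 7 outlines the procedure for synthesizing tones using the database DB created as described above.
First, the required playing style sequence corresponding to the tone performance to be generated (a performance phrase of several notes, or even a single note) is designated (step S11). This designation may consist of selectively designating one of the articulation element sequences AESEQ or URSEQ stored in the articulation database ADB for the desired instrument sound (or human voice, natural sound, and so on).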
【００４８】
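The designation of such a playing style sequence (that is, an articulation element sequence) may be given on the basis of the user's real-time performance operation, or on the basis of automatic performance data. In the former case, for example, various playing style sequences are assigned in advance to the keyboard or other performance operators, and operating an operator generates the sequence designation data assigned to it. In the latter case, one approach, sketched in Fig. 8(a), is to embed playing style sequence designation data as event data within automatic performance sequence data, in MIDI or a similar format, for the desired piece, so that each designation is read out at its event time during automatic performance. In Fig. 8, DUR denotes duration data indicating the time interval to the next event, EVENT denotes event data, MIDI indicates that the performance data attached to the event is in MIDI format, and AESEQ indicates that the performance data attached to the event is playing style sequence designation data. In this case an ensemble is possible between automatic performance based on MIDI or similar data and automatic performance based on playing style sequences according to the invention. For example, the main solo or melody instrument part can be played by playing style sequences according to the invention, that is, by articulation element synthesis, while the other instrument parts are played automatically from MIDI data.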
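One way to picture the mixed event stream of Fig. 8(a) is as an alternation of DUR items and events, turned into absolute event times by accumulating the durations. The tuple representation, the tick values, the MIDI payload strings, and the function `schedule` below are all hypothetical; the figure itself is not reproduced in the text, so the exact layout is an assumption.

```python
def schedule(stream):
    """Convert a DUR/EVENT stream into (absolute time, kind, payload) tuples."""
    now = 0
    timeline = []
    for kind, value in stream:
        if kind == "DUR":
            now += value            # advance to the time of the next event
        else:
            timeline.append((now, kind, value))
    return timeline


# Hypothetical stream mixing MIDI events and sequence designations.
stream = [
    ("DUR", 0), ("MIDI", "note-on C4"),
    ("DUR", 0), ("AESEQ", "AESEQ#1"),
    ("DUR", 480), ("MIDI", "note-off C4"),
    ("DUR", 480), ("AESEQ", "AESEQ#2"),
]
timeline = schedule(stream)
aeseq_events = [item for item in timeline if item[1] == "AESEQ"]
```

Events of kind MIDI would go to a conventional tone generator and events of kind AESEQ to the articulation element synthesizer, which is what makes the ensemble described above possible.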
【００４９】
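As another approach for the latter case, sketched in Fig. 8(b), only playing style sequence designation data AESEQ may be stored in event data format for the desired piece and read out at each event time. This allows automatic performance of a piece by articulation sequence, which was not previously available.
As a further approach for the latter case, only automatic performance sequence data in MIDI or a similar format may be stored for the desired piece, and this data analyzed by a performance interpretation program so that the playing style, that is, the articulation, of each phrase or note is determined automatically and playing style sequence designation data is generated as the result.
As yet another way of designating a playing style sequence, the user may enter one or more desired items of attribute information, the articulation database ADB may be searched with these as keywords so that one or more articulation element sequences AESEQ are listed automatically, and the desired sequence selected from among them.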
【００５０】
In Fig. 7, articulation element (AE) indices are read out from the selected articulation element sequence AESEQ or URSEQ in the prescribed performance order (step S12).
The articulation element vector (AEVQ) corresponding to each AE index read out is then read (step S13).
The template data designated by the AEVQ read out are then read from the template database TDB (step S14).
【００５１】
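The waveform data (partial sound) of one articulation element (AE) is then synthesized according to the template data read out (step S15). Basically, this synthesis consists of reading the PCM waveform data corresponding to the waveform (Timbre) template from the TDB at a readout rate following the pitch (Pitch) template and over a time length following the time (TSC) template, and controlling the amplitude envelope of the PCM waveform data read out according to the amplitude (Amp) template. In this embodiment the Timbre template data stored in the TDB retains the pitch, amplitude envelope, and time length of the sampled original waveform. So when none of the Pitch, Amp, and TSC templates has been changed from those of the sampled original waveform, the PCM waveform data corresponding to the stored Timbre template, read out as it is, becomes the waveform data for the articulation element. When any of the Pitch, Amp, or TSC templates has been changed from that of the sampled original waveform, by the data editing described later or otherwise, then according to the amount of change the readout rate of the stored Timbre template data is varied (when the pitch template has changed), its readout time length is varied (when the time template has changed), or the amplitude envelope of the waveform read out is varied (when the amplitude template has changed).
When the partial vector PVQ mentioned earlier applies to the articulation element AE, the necessary loop readout control is also performed.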
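As a simplified illustration of step S15, the sketch below reads a stored waveform at a rate relative to the original (1.0 meaning the pitch template is unchanged) and scales it by a gain relative to the original amplitude (1.0 meaning the Amp template is unchanged). The function `synthesize_element`, linear interpolation, and the callable form of the two controls are assumptions for the example. TSC control is deliberately not implemented: plain rate changes alter the length, which is exactly what a separate time-stretch process would have to compensate.

```python
def synthesize_element(pcm, pitch_ratio, gain):
    """Read pcm at a variable rate and apply a gain, both relative to the original.

    pitch_ratio(n) and gain(n) are evaluated per output sample n and stand in
    for the change of the Pitch and Amp templates from the original waveform.
    pitch_ratio(n) must be positive.
    """
    last = len(pcm) - 1
    out, pos, n = [], 0.0, 0
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = pcm[min(i + 1, last)]
        sample = pcm[i] + (nxt - pcm[i]) * frac   # linear interpolation
        out.append(sample * gain(n))
        pos += pitch_ratio(n)
        n += 1
    return out


pcm = [0.0, 1.0, 0.0, -1.0]
unchanged = synthesize_element(pcm, lambda n: 1.0, lambda n: 1.0)
octave_down = synthesize_element(pcm, lambda n: 0.5, lambda n: 1.0)
quieter = synthesize_element(pcm, lambda n: 1.0, lambda n: 0.5)
```

With unchanged templates the original waveform comes back as it is, matching the statement in the text.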
【００５２】
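Next, the waveform data of the articulation elements synthesized in this way are connected in order, so that a series of performance sound consisting of a time-series combination of several articulation elements is generated (step S16). This connection is controlled according to the rule data RULE stored in the articulation database ADB. When RULE designates direct connection, for example, the waveform data of the elements synthesized in step S15 need only be switched and sounded one after another in the order of generation. When RULE designates a particular crossfade interpolation, the waveform data at the end of the preceding element and at the beginning of the following element are crossfade-interpolated in the designated form so that the waveforms join smoothly. When elements are connected exactly as in the sampled original waveform, for example, they are guaranteed to join smoothly in the first place, so RULE may designate direct connection. Otherwise a smooth join between elements is not guaranteed, and some interpolation should be performed. As described later, any of several crossfade interpolation forms can be selected by the rule data RULE.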
【００５３】
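The series of performance sound synthesis outlined in steps S11 to S16 is carried out in one tone synthesis channel for one instrument sound (or human voice or natural sound). To synthesize performance sounds for several instrument sounds (or voices or natural sounds) simultaneously, the series of steps S11 to S16 is carried out in several channels, either time-divisionally or in parallel. As described later, when tone waveforms are formed by crossfade synthesis, each tone synthesis channel uses two waveform generating channels (one generating the waveform that fades out and one generating the waveform that fades in).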
【００５４】
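Fig. 9 sketches example combinations of articulation elements for several playing style sequences. Playing style sequence #1 in (a) is the simplest combination: an attack articulation element A#1, a body element B#1, and a release element R#1 connected in order, with the joins between elements crossfade-interpolated. Sequence #2 in (b) is a combination in which a grace note is added before the main note: an attack element A#2 for the grace note, a body element B#2 for the grace note, an attack element A#3 for the main note, a body element B#3 for the main note, and a release element R#3 for the main note connected in order, with the joins crossfade-interpolated. Sequence #3 in (c) is a combination in which a preceding note and a following note are joined by a slur: an attack element A#4 for the preceding note, a body element B#4 for the preceding note, a body element B#5 for the slur partial sound, a body element B#6 for the following note, and a release element R#6 for the following note connected in order, with the joins crossfade-interpolated. In the figure the partial sound waveform of each articulation element is shown for convenience only by its envelope, but in fact it consists of waveform data synthesized from the waveform (Timbre), amplitude (Amp), pitch (Pitch), and time (TSC) template data as described above.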
【００５５】
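Fig. 10 is a time chart showing a concrete example of generating, in one tone synthesis channel, the partial sound waveforms of several articulation elements in sequence and crossfade-connecting them. To crossfade two element waveforms, each tone synthesis channel specifically uses two waveform generating channels. Fig. 10(a) shows waveform generation in the first waveform generating channel and (b) in the second. In (a) and (b), the "synthesized waveform data" in each upper row is the waveform data synthesized, as the partial sound waveform of the articulation element, from the Timbre, Amp, Pitch, TSC, and other template data as described above (for example the waveform data synthesized in step S15 of Fig. 7), and the "crossfade control waveform" in each lower row is the control waveform used to crossfade-connect the partial sound waveforms of the elements. In the flow of Fig. 7, for example, this crossfade control waveform is formed in the course of step S16. The amplitude of the element waveform data in the upper row of each channel is controlled by the crossfade control waveform in its lower row, and the crossfade-amplitude-controlled waveform data of the two channels (the first and second waveform generating channels) are added to complete the crossfade synthesis.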
【００５６】
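When a playing style sequence is started, a sequence start trigger SST is given, and in response synthesis of the partial sound waveform for the first articulation element of the sequence (say A#1) begins. That is, waveform data is synthesized from the Timbre, Amp, Pitch, TSC, and other template data for that articulation element. So although the "synthesized waveform data" is shown in the figure simply as a block, it actually has the waveform corresponding to the Timbre template data, the amplitude envelope corresponding to the Amp template data, the pitch and its variation over time corresponding to the Pitch template data, and the time length corresponding to the TSC template data.
For the first articulation element waveform of a sequence, the crossfade control waveform may rise immediately to full level as illustrated. If, however, it is to be crossfaded with the waveform at the tail of the performance sound of the preceding sequence, the rise of the first crossfade control waveform of the sequence can be given a fade-in characteristic of suitable slope. The slope of this fade-in is set by the fade-in rate FIR#1.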
【００５７】
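For the first articulation element A#1 of the sequence, the connection control information comprises the fade-in rate FIR#1 above, next channel start point information NCSP#1, fade-out start point information FOSP#1, and fade-out rate FOR#1. The next channel start point information NCSP#1 designates the point at which waveform generation for the next articulation element (say B#1) starts. The fade-out start point information FOSP#1 designates the point at which the element's own waveform starts to fade out. As illustrated, the crossfade control waveform stays flat at full level up to the fade-out start point, and after that point its level gradually falls with a slope following the set fade-out rate FOR#1. When the rule data RULE for element A#1 designates direct connection without crossfade, NCSP#1 and FOSP#1 may designate the tail of the synthesized articulation element waveform. When the corresponding RULE designates crossfade connection, NCSP#1 and FOSP#1 each designate, as illustrated, an appropriately set point before the tail of the articulation element waveform. The information NCSP#1, FOSP#1, FIR#1, and FOR#1 can therefore be regarded as included in the rule data RULE for element A#1. Such connection control information is provided separately for each articulation element.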
【００５８】
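When the generation of element waveform A#1 in the first waveform generating channel shown in Fig. 10(a) reaches the point designated by the next channel start point information NCSP#1, a next channel start trigger NCS#1 is given to the second waveform generating channel shown in Fig. 10(b), and generation of the partial sound waveform for the second articulation element B#1 starts in the second channel. The crossfade control waveform for element B#1 fades in (rises gradually) with the slope set by its fade-in rate FIR#2. The fade-out period of the preceding element A#1 and the fade-in period of the following element B#1 thus overlap, and adding the two completes the crossfade synthesis.
After the waveform data of the preceding element A#1 has faded out, only the following element B#1 remains. In this way the waveform is crossfaded from the preceding element A#1 to the following element B#1 and connected smoothly.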
【００５９】
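When the generation of element waveform B#1 in the second waveform generating channel shown in Fig. 10(b) reaches the point designated by the fade-out start point information FOSP#2, the level of the crossfade control waveform gradually falls, as illustrated, with a slope following the set fade-out rate FOR#2. When the generation of element waveform B#1 reaches the point designated by the next channel start point information NCSP#2, a next channel start trigger NCS#2 is given to the first waveform generating channel shown in Fig. 10(a), and generation of the partial sound waveform for the third articulation element R#1 starts in the first channel. The crossfade control waveform for element R#1 fades in (rises gradually) with the slope set by its fade-in rate FIR#3. The fade-out period of the preceding element B#1 and the fade-in period of the following element R#1 thus overlap, and adding the two completes the crossfade synthesis.
Thereafter the articulation elements are connected in the time-series order of the sequence in the same way, crossfading one after another.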
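The crossfade of Fig. 10 can be sketched with two small functions: one that builds a crossfade control waveform from a fade-in rate, a fade-out start point, and a fade-out rate, and one that sums the two amplitude-controlled channels with the second element starting at the next channel start point. The linear ramps, per-sample rates, and function names are assumptions for the example; the parameter names follow FIR, FOSP, FOR, and NCSP in the text.

```python
def control_wave(length, fir, fosp, fo_rate):
    """Rise by `fir` per sample up to 1.0, hold, then fall by `fo_rate` from index fosp."""
    out, level = [], 0.0
    for n in range(length):
        if n >= fosp:
            level = max(0.0, level - fo_rate)
        else:
            level = min(1.0, level + fir)
        out.append(level)
    return out


def crossfade(a, ctrl_a, b, ctrl_b, ncsp):
    """Add element a and element b (started at sample ncsp of a) after gain control."""
    out = [0.0] * max(len(a), ncsp + len(b))
    for n, x in enumerate(a):
        out[n] += x * ctrl_a[n]
    for n, x in enumerate(b):
        out[ncsp + n] += x * ctrl_b[n]
    return out


a = [1.0] * 8                      # stand-in for element waveform A#1
b = [1.0] * 8                      # stand-in for element waveform B#1
ctrl_a = control_wave(8, fir=1.0, fosp=4, fo_rate=0.25)   # full level at once
ctrl_b = control_wave(8, fir=0.25, fosp=8, fo_rate=0.25)  # no fade-out inside b
joined = crossfade(a, ctrl_a, b, ctrl_b, ncsp=4)
```

With complementary linear ramps the summed level stays constant across the overlap, which is the smooth join the text describes. Setting the start and fade-out points at the tail of the first element reduces this to direct connection.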
【００６０】
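In the above example, crossfade synthesis is applied to the element waveforms synthesized from the templates. Alternatively, crossfade processing may be applied to each kind of template data, and each element waveform synthesized from the crossfaded template data. In that case a different connection rule can be applied to each template even within the same element. That is, the connection control information above (fade-in rate FIR, next channel start point NCSP, fade-out start point FOSP, fade-out rate FOR) is prepared separately for the template of each tone element of the element, such as Timbre, Amp, Pitch, and TSC. Crossfade connection can then follow the connection rule best suited to each template, which is effective.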
【００６１】
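〔Editing〕
Fig. 11 schematically shows an example of data editing. It shows editing based on the data of an articulation element sequence AESEQ#x consisting of an articulation element A#1 with the attack attribute, an element B#1 with the body attribute, and an element R#1 with the release attribute. The data editing described here is of course carried out by suitable means, such as a computer running the required editing program while the user performs the desired operations with a keyboard or mouse, watching the state of the various data shown on a display.
The base sequence AESEQ#x can be selected from the many sequences AESEQ stored in the articulation database ADB (see Fig. 5(a), for example). Editing of articulation data broadly comprises replacing, adding, or deleting articulation elements within a sequence, and replacing templates within an element or creating new templates by modifying the data values of existing ones.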
【００６２】
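The editing column of Fig. 11 shows an example in which the release articulation element R#1 of the base sequence AESEQ#x has an amplitude envelope that falls comparatively gently and is replaced by an element R#x whose amplitude envelope falls comparatively quickly. Besides replacement, a desired element can be added (for example an additional body element or an element for a grace note) or deleted (for example one of the body parts when there are several). The replacement element R#x can be selected from the many articulation element vectors AEVQ stored in the ADB (see Fig. 5(b), for example). In doing so, the attribute information ATR can be consulted so that the desired replacement element R#x is chosen from the group of elements with the same attribute.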
【００６３】
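Next, the template data for a desired tone element within a desired element (for example the replacement element R#x) is replaced by other template data for that tone element. In the example of Fig. 11, the pitch (Pitch) template of element R#x is replaced by another pitch template Pitch' (for example one with a pitch bend characteristic). The new release element R#x' created in this way has an amplitude envelope that falls comparatively quickly together with a pitch bend-down characteristic. When replacing a template, too, the attribute information ATR can be consulted so that the desired replacement template (vector data) is chosen from the templates (vector data) of the group of elements with the same attribute among the many articulation element vectors AEVQ (Fig. 5(b), for example).
The new element R#x' created by replacing some of its templates should be given a new index and the required attribute information and additionally registered in the articulation element vector AEVQ area of the ADB (see Fig. 4).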
【００６４】
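The concrete data content of a desired template can also be modified. In that case the concrete data content of the desired template of the element being edited is read from the template database TDB, shown on a display or the like, and changed as appropriate by keyboard or mouse operation. When the desired modification is finished, the modified template data should be given a new index and additionally registered in the TDB. New vector data should be assigned to the modified template data, and the new element containing this vector data (for example R#x') given a new index and the required attribute information and additionally registered in the articulation element vector AEVQ area of the ADB (see Fig. 4).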
【００６５】
以上のようにして、基となるシーケンスＡＥＳＥＱ＃ｘの内容を適宜変更して新たなシーケンスデータを作成するデータ編集処理を行うことができる。このようなデータ編集処理によって作成された新たなシーケンスデータは、ユーザーアーティキュレーションエレメントシーケンスＵＲＳＥＱとして新たなシーケンス番号（例えばＵＲＳＥＱ＃ｘ）と属性情報を付与し、アーティキュレーションデータベースＡＤＢに登録する。以後、楽音合成時には、そのシーケンス番号ＵＲＳＥＱ＃ｘを用いてアーティキュレーションデータベースＡＤＢからユーザーアーティキュレーションエレメントシーケンスＵＲＳＥＱのデータを読み出すことができる。
なお、データ編集の形態は図１１で例示したものに限らず、種々の形態があり得る。例えば、基となるシーケンスＡＥＳＥＱを呼び出すことなく、所望のエレメントをエレメントベクトルＡＥＶＱから順次選択し、これによってユーザーシーケンスＵＲＳＥＱを作り上げるようにしてもよい。
【００６６】
図１２は、上述したようなデータ編集処理を実行しうるコンピュータプログラムの概略を示すフロー図である。
ステップＳ２１では、所望の奏法を指定する。この指定は、コンピュータのキーボードやマウスを用いて、シーケンスＡＥＳＥＱ又はＵＲＳＥＱの番号を直接入力するようにしてもよいし、所望の楽器音色と属性情報を入力することによって行うようにしてもよい。
次のステップＳ２２では、指定された奏法に一致するシーケンスがアーティキュレーションデータベースＡＤＢ内のＡＥＳＥＱ又はＵＲＳＥＱに存在しているかどうかを検索し、該当するシーケンスＡＥＳＥＱ又はＵＲＳＥＱを選択する。この場合、シーケンスＡＥＳＥＱ又はＵＲＳＥＱの番号を直接入力した場合は、該当するものが直接引き出される。属性情報を入力した場合は、該属性情報に該当するシーケンスＡＥＳＥＱ及び／又はＵＲＳＥＱが検索される。属性情報は複数入力可能であり、複数入力した場合は、例えばＡＮＤ論理で検索することとすればよい。勿論、これに限らずＯＲ論理で検索してもよい。検索結果はコンピュータのディスプレイで表示し、複数のシーケンスＡＥＳＥＱ及び／又はＵＲＳＥＱが検索された場合は、そのうち所望のものを選択できるようにする。
【００６７】
ステップＳ２３では編集作業を続行するか否かをユーザーに問い合わせし、ＮＯ（続行しない）であれば、出口に行き、編集処理を終了する。ステップＳ２２で選択又は検索されたシーケンスの内容が望み通りのものであり、編集の必要がない場合は、編集処理を終了する。編集処理を続行したい場合は、ステップＳ２３でＹＥＳとし、ステップＳ２４に行く。また、ステップＳ２２で指定された奏法に該当するものが検索できなかった場合も、ステップＳ２３で続行ＹＥＳと判定し、ステップＳ２４に行く。
属性情報による検索の一例を図５及び図６のようなデータがアーティキュレーションデータベースＡＤＢに記憶されている場合を例にして説明する。例えば、アーティキュレーションシーケンスの検索条件の属性として、「アタック・ベンドアップ・ノーマル」と、「ボディ・ノーマル」と、「リリース・ノーマル」が入力されたとする。この場合、図５（ａ）に示された６番目のシーケンスＡＥＳＥＱ＃６の属性に一致するので、ステップＳ２２でシーケンスＡＥＳＥＱ＃６が検索され、選択される。これで満足であれば、ステップＳ２３でＮＯとして、編集処理を終了する。編集処理を続行したければ、ステップＳ２３でＹＥＳとして、ステップＳ２４に行く。
【００６８】
ステップＳ２４では、ステップＳ２１で指定した奏法に該当するシーケンスがまだ選択されていないならば、それに一番近いシーケンスを選択する。例えば、アーティキュレーションシーケンスの検索条件の属性として、前記ステップＳ２１で「アタック・ベンドアップ・ノーマル」と、「ビブラート・ノーマル」と、「リリース・ノーマル」が入力されたとする。シーケンスＡＥＳＥＱが図５（ａ）に示す７種類しかないとすると、これを満足するシーケンスは検索できず、ステップＳ２４でそれに一番近いシーケンスＡＥＳＥＱ＃６が選択される。
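The attribute search of step S22 and the fallback of step S24 can be sketched as below. The specification does not define how the "nearest" sequence is measured, so the overlap count used here is purely an assumption; the database contents, the attribute strings, and the function names are likewise hypothetical.

```python
def search(adb, wanted, mode="and"):
    """Step S22 sketch: return the names of sequences whose attributes match the query."""
    wanted = set(wanted)
    if mode == "and":
        return [name for name, attrs in adb.items() if wanted <= set(attrs)]
    if mode == "or":
        return [name for name, attrs in adb.items() if wanted & set(attrs)]
    raise ValueError("mode must be 'and' or 'or'")


def nearest(adb, wanted):
    """Step S24 sketch: pick the sequence sharing the most attributes (assumed metric)."""
    wanted = set(wanted)
    return max(adb, key=lambda name: len(wanted & set(adb[name])))


# Hypothetical excerpt of the articulation database.
adb = {
    "AESEQ#1": ["attack-normal", "body-normal", "release-normal"],
    "AESEQ#6": ["attack-bendup-normal", "body-normal", "release-normal"],
}

exact = search(adb, ["attack-bendup-normal", "body-normal", "release-normal"])
query = ["attack-bendup-normal", "vibrato-normal", "release-normal"]
missing = search(adb, query)     # no sequence satisfies every attribute
fallback = nearest(adb, query)   # closest available sequence
```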
ステップＳ２５では、選択されたシーケンスにおける所望のアーティキュレーションエレメント（ＡＥ）を指示するベクトルデータ（インデックス）を別のアーティキュレーションエレメントを指示するベクトルデータ（インデックス）に差し替える処理を行う。例えば、上記例の場合、ステップＳ２４で一番近いシーケンスとして選択されたシーケンスＡＥＳＥＱ＃６のエレメント構成は、ＡＴＴ−Ｎｏｒ，ＢＯＤ−Ｎｏｒ，ＲＥＬ−Ｎｏｒという３つのエレメントベクトルからなっているので（図５（ａ）参照）、ボディ部用のエレメントＢＯＤ−Ｎｏｒ（ノーマルボディ）をビブラート用のボディ部のエレメントに差し替えればよい。そのために、アーティキュレーションエレメントベクトルＡＥＶＱ（例えば図５（ｂ））を参照して、ＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）のエレメントベクトルデータ（インデックス）を引き出して、これをＢＯＤ−Ｎｏｒと差し替える。
【００６９】
必要に応じて、アーティキュレーションエレメントの追加及び削除もステップＳ２５で行う。望みのエレメントベクトルデータの差し替え及び／又は追加、削除を終えると、新規のアーティキュレーションエレメントシーケンスが作成されたことになる（ステップＳ２６）。
アーティキュレーションエレメントの差し替え及び／又は追加、削除によって、新規作成されたアーティキュレーションエレメントシーケンス内におけるエレメント間の波形のつながりが保証されないものとなったので、次のステップＳ２７において、接続ルールデータＲＵＬＥを設定する。次のステップＳ２８では、設定した接続ルールデータＲＵＬＥでよいかどうかを確認する。ＯＫでなければ、ステップＳ２７に戻り、接続ルールデータＲＵＬＥを設定し直す。設定した接続ルールデータＲＵＬＥでＯＫであれば、ステップＳ２９に行く。
ステップＳ２９では、編集処理を続行するかどうかを問い合わせる。編集処理を続行しない場合は、ステップＳ３０に行き、新規作成されたアーティキュレーションエレメントシーケンスをユーザーシーケンスＵＲＳＥＱとしてアーティキュレーションデータベースＡＤＢに登録する。編集処理を続行したければ、ステップＳ２９でＹＥＳとして、ステップＳ２４又はＳ３１に行く。この場合、アーティキュレーションエレメントの差し替え及び／又は追加、削除に戻りたい場合はステップＳ２４に戻るものとし、テンプレートデータの編集に移りたい場合はステップＳ３１に行く。
【００７０】
ステップＳ３１では、テンプレートデータを編集したいアーティキュレーションエレメント（ＡＥ）を選択する。次のステップＳ３２では、選択されたアーティキュレーションエレメント（ＡＥ）の中の所望の楽音要素に対応するテンプレートベクトルデータを該楽音要素に関する別のテンプレートベクトルデータに差し替える。
例えば、アーティキュレーションシーケンスの検索条件の属性として、「アタック・ベンドアップ・ノーマル」と、「少し遅いビブラート」と、「リリース・ノーマル」がステップＳ２１で指定入力され、図５（ａ）に示されたシーケンスＡＥＳＥＱのうち一番近いシーケンスとしてＡＥＳＥＱ＃６がステップＳ２４で選択されたとする。前述の通り、このシーケンスＡＥＳＥＱ＃６のボディ部用のエレメントはＢＯＤ−Ｎｏｒ（ノーマルボディ）であるから、これをステップＳ２５でビブラート用のボディ部のエレメント例えばＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）に差し替える。そして、ステップＳ３１で、このＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）のエレメントを選択し、これを編集の対象とする。そして、望みの「少し遅いビブラート」を実現するために、ステップＳ３２において、ＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）の各テンプレートベクトルのうち、時間テンプレートのベクトルＴＳＣ−Ｂ−ｖｉｂを、ビブラートスピードを少し遅くする時間テンプレートのベクトル（例えばＴＳＣ−Ｂ−ｓｐ２とする）に差し替える。
【００７１】
こうして、ＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）の各テンプレートのうち、時間テンプレートベクトルをＴＳＣ−Ｂ−ｖｉｂからＴＳＣ−Ｂ−ｓｐ２に差し替えた新たなアーティキュレーションエレメントが作成される（ステップＳ３３）。また、シーケンスＡＥＳＥＱ＃６のボディ部用のエレメントを、この新たに作成されたアーティキュレーションエレメントに差し替えてなる、新たなアーティキュレーションエレメントシーケンスが作成される（ステップＳ３３）。
続くステップＳ３４，Ｓ３５，Ｓ３６は前述のステップＳ２７，Ｓ２８，Ｓ２９と同様の処理からなる。すなわち、差し替えたテンプレートデータによって、新規作成されたアーティキュレーションエレメントシーケンス内におけるエレメント間の波形のつながりが保証されないものとなったので、前述と同様に接続ルールデータＲＵＬＥを設定し直す。
【００７２】
ステップＳ３６では、編集処理を続行するかどうかを問い合わせる。編集処理を続行しない場合は、ステップＳ３７に行き、新規作成されたアーティキュレーションエレメント（ＡＥ）をユーザーアーティキュレーションエレメントベクトル（ＡＥＶＱ）としてアーティキュレーションデータベースＡＤＢに登録する。編集処理を続行したければ、ステップＳ３６でＹＥＳとして、ステップＳ３１又はＳ３８に行く。この場合、テンプレートベクトルの差し替えに戻りたい場合はステップＳ３１に戻るものとし、テンプレートデータの具体的内容の編集に移りたい場合はステップＳ３８に行く。
ステップＳ３８では、データ内容を編集したい所要のアーティキュレーションエレメント（ＡＥ）内のテンプレートを選択する。次のステップＳ３９では、選択されたテンプレートのデータをテンプレートデータベースＴＤＢから読み出し、その具体的データ内容を適宜変更する。
【００７３】
例えば、アーティキュレーションシーケンスの検索条件の属性として、「アタック・ベンドアップ・ノーマル」と、「かなり遅いビブラート」と、「リリース・ノーマル」がステップＳ２１で指定入力され、図５（ａ）に示されたシーケンスＡＥＳＥＱのうち一番近いシーケンスとしてＡＥＳＥＱ＃６がステップＳ２４で選択されたとする。前述の通り、このシーケンスＡＥＳＥＱ＃６のボディ部用のエレメントはＢＯＤ−Ｎｏｒ（ノーマルボディ）であるから、これをステップＳ２５でビブラート用のボディ部のエレメント例えばＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）に差し替える。そして、ステップＳ３１で、このＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）のエレメントを選択し、これを編集の対象とする。そして、望みの「かなり遅いビブラート」を実現するために、ステップＳ３２において、ＢＯＤ−Ｖｉｂ−ｎｏｒ（ボディ・ノーマルビブラート）の各テンプレートベクトルのうち、時間テンプレートのベクトルＴＳＣ−Ｂ−ｖｉｂを、既存の時間テンプレートのうちビブラートスピードを最も遅くする時間テンプレートのベクトル（例えばＴＳＣ−Ｂ−ｓｐ１とする）に差し替える。
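If, however, the time template designated by this time template vector TSC-B-sp1 still cannot realize the desired "considerably slow vibrato", the time template vector TSC-B-sp1 is selected in step S38, and in step S39 its concrete data content is changed so as to realize an even slower vibrato. New vector data (for example, TSC-B-sp0) is then assigned to the new time template created by this change.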
【００７４】
こうして、新規の時間テンプレートデータとそのベクトルデータＴＳＣ−Ｂ−ｓｐ０が作成される（ステップＳ４０）。また、時間テンプレートベクトルを新規のベクトルに変更した新たなアーティキュレーションエレメント（ＡＥ）が作成され、また、シーケンスＡＥＳＥＱ＃６のボディ部用のエレメントを、この新たに作成されたアーティキュレーションエレメント（ＡＥ）に差し替えてなる、新たなアーティキュレーションエレメントシーケンスが作成される（ステップＳ４０）。
続くステップＳ４１，Ｓ４２，Ｓ４３は前述のステップＳ２７，Ｓ２８，Ｓ２９と同様の処理からなる。すなわち、データ修正したテンプレートデータによって、新規作成されたアーティキュレーションエレメントシーケンス内におけるエレメント間の波形のつながりが保証されないものとなったので、前述と同様に接続ルールデータＲＵＬＥを設定し直す。
【００７５】
ステップＳ４３では、編集処理を続行するかどうかを問い合わせる。編集処理を続行しない場合は、ステップＳ４４に行き、新規作成されたテンプレートデータをテンプレートデータベースＴＤＢに登録する。編集処理を続行したければ、ステップＳ４３でＹＥＳとして、ステップＳ３８に戻る。ステップＳ４４の後、ステップＳ３７に行き、新規作成されたアーティキュレーションエレメント（ＡＥ）をユーザーアーティキュレーションエレメントベクトル（ＡＥＶＱ）としてアーティキュレーションデータベースＡＤＢに登録する。更に、ステップＳ３０に行き、新規作成されたアーティキュレーションエレメントシーケンスをユーザーシーケンスＵＲＳＥＱとしてアーティキュレーションデータベースＡＤＢに登録する。
編集処理の手順は図１２に限定されるものではなく、適宜別の手順で処理してもよい。また、前述のように、基となるシーケンスＡＥＳＥＱを呼び出すことなく、所望のエレメントをエレメントベクトルＡＥＶＱから順次選択し、各エレメント内のテンプレートデータを適宜差し替えたりデータ修正したりして、これに基づきユーザーシーケンスＵＲＳＥＱを作り上げるようにしてもよい。また、特に、図示しなかったが、編集処理の適宜の段階において、編集中のアーティキュレーションエレメントの波形に対応する音を発音し、ユーザーが耳で確認できるようにするとよい。
【００７６】
〔パーシャルベクトルの説明〕
図１３は、パーシャルベクトルＰＶＱの考え方を概念的に示すものである。図１３（ａ）は、或る区間のアーティキュレーションエレメントについて、或る楽音要素（例えば波形）について分析された全区間のデータ（つまり通常のテンプレートデータ）を模式的に示したものである。図１３（ｂ）は、（ａ）に示す全区間のデータから分散的に取り出した部分的なテンプレートデータＰＴ１，ＰＴ２，ＰＴ３，ＰＴ４を模式的に示すものである。この部分的なテンプレートデータＰＴ１，ＰＴ２，ＰＴ３，ＰＴ４が、当該楽音要素のテンプレートデータとしてテンプレートデータベースＴＤＢに記憶される。このテンプレートデータについてのテンプレートベクトルは、通常と同様に（全区間のデータをそのままテンプレートデータとして記憶する場合と同様に）、１つ割り当てられる。例えば、このテンプレートデータについてのテンプレートベクトルが「Ｔｉｍｂ−Ｂ−ｎｏｒ」であるとすると、各部分的なデータＰＴ１，ＰＴ２，ＰＴ３，ＰＴ４のテンプレートベクトルは「Ｔｉｍｂ−Ｂ−ｎｏｒ」であり、共通している。なお、この場合、このテンプレートベクトル「Ｔｉｍｂ−Ｂ−ｎｏｒ」に付属するデータとして、パーシャルベクトルＰＶＱを有することを示す識別データを、登録しておくものとする。
【００７７】
パーシャルベクトルＰＶＱは、各部分的なテンプレートデータＰＴ１〜ＰＴ４毎に、該データのテンプレートデータベースＴＤＢでの記憶位置を示すデータ（例えばループスタートアドレスに相当）と、該データの幅Ｗを示すデータ（例えばループエンドアドレスに相当）と、該データを繰返す期間ＬＴを示すデータとを含んでいる。図では、便宜上、幅Ｗと期間ＬＴがどの部分的データＰＴ１〜ＰＴ４でも共通しているかのように図示しているが、これは各データＰＴ１〜ＰＴ４毎に任意である。また、部分的テンプレートデータＰＴ１〜ＰＴ４の数も、４個に限らず、任意である。
パーシャルベクトルＰＶＱに基づく各部分的テンプレートデータＰＴ１〜ＰＴ４をそれぞれその繰返し期間（ＬＴ）の分だけループ読み出しし、読み出された各ループを接続することにより（ａ）に示したような全区間のデータを再現することができる。この再現処理をデコード処理ということにする。このデコード処理法としては、一例として、それぞれの部分的テンプレートデータＰＴ１〜ＰＴ４をその繰返し期間ＬＴの分だけ単純にループ読出しするようにするだけでもよいし、別の例として、相前後する２つの波形をループ読出しながらクロスフェード合成するようにしてもよい。後者の方が各ループのつながりが良くなるので、好ましい。
【００７８】
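Figs. 13(c) and 13(d) show an example of decoding by such cross-fade synthesis. (c) shows an example cross-fade control waveform for the first cross-fade channel, and (d) shows one for the second cross-fade channel. That is, the first partial template data PT1 is faded out over the period LT by the fade-out control waveform CF11 shown in (c), while the next partial template data PT2 is simultaneously faded in over the same period LT by the fade-in control waveform CF21 shown in (d). Adding the faded-out data PT1 and the faded-in data PT2 yields a loop read-out that cross-fades from PT1 to PT2 within the period LT. Next, data PT1 is switched to data PT3 and, at the same time, its control waveform is switched to the fade-in waveform CF12, while the control waveform for data PT2 is switched to the fade-out waveform CF22, and cross-fade synthesis is performed again. Thereafter the data are switched in turn as illustrated and cross-fade synthesis continues. When performing cross-fade synthesis, the two loop-read waveforms are processed so that their phase and pitch are properly matched.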
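The cross-fade decode described for Figs. 13(c) and (d) can be sketched as follows. This is a minimal sketch under stated assumptions: linear fade curves, every partial template holding one cycle of identical length and phase, and a common period LT for all sections. The function names and the sine-wave test data are hypothetical.

```python
import math


def loop_read(segment, n):
    """Read n samples from a short segment, wrapping around (loop read-out)."""
    return [segment[i % len(segment)] for i in range(n)]


def decode_crossfade(partials, lt):
    """Rebuild a long signal: in each period of lt samples, partial k fades out
    while partial k+1 fades in, both being loop-read at the same time."""
    out = []
    for current, following in zip(partials, partials[1:]):
        a = loop_read(current, lt)
        b = loop_read(following, lt)
        for i in range(lt):
            gain_in = i / lt                      # linear fade-in (assumed curve)
            out.append((1.0 - gain_in) * a[i] + gain_in * b[i])
    return out


def one_cycle(amplitude, size=8):
    """Hypothetical partial template: one sine cycle of `size` samples."""
    return [amplitude * math.sin(2 * math.pi * i / size) for i in range(size)]


pt = [one_cycle(a) for a in (1.0, 0.5, 0.25, 0.125)]   # stand-ins for PT1..PT4
full = decode_crossfade(pt, lt=32)
```

Choosing LT as a whole multiple of the cycle length keeps the loop phase aligned at each switch-over, which is one simple way to meet the phase-matching requirement mentioned above.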
【００７９】
図１４は、パーシャルベクトルＰＶＱを考慮したテンプレート読出し処理の一例を示すフロー図である。ここに示されたステップＳ１３〜Ｓ１４ｃは、図７のステップＳ１３，Ｓ１４の部分の処理に対応している。ステップＳ１３では、アーティキュレーションエレメントベクトルＡＥＶＱのデータ群の中から指定されたエレメントに対応する各テンプレートのベクトルデータを読み出す。ステップＳ１４ａでは、パーシャルベクトルＰＶＱを有することを示す識別データに基づきパーシャルベクトルＰＶＱが有るか否かをチェックする。パーシャルベクトルＰＶＱがなければ、ステップＳ１４ｂに行き、テンプレートデータベースＴＤＢから各テンプレートデータを読み出す。パーシャルベクトルＰＶＱが有れば、ステップＳ１４ｃに行き、そのパーシャルベクトルＰＶＱに基づき上述の「デコード処理」を行う。これにより、該エレメントについての全区間のテンプレートデータを再現（デコード）する。
【００８０】
なお、或るアーティキュレーションエレメントにパーシャルベクトルＰＶＱを適用する場合、そのアーティキュレーションエレメントの全ての楽音要素についてのテンプレートを部分的テンプレートとする必要はなく、部分的テンプレートとしてループ読出しするのに適した種類の楽音要素に関してのみ部分的テンプレートとすればよい。
また、パーシャルベクトルＰＶＱに基づく、当該エレメントについての全区間のテンプレートデータの再生方法としては、上述のような単純なループ読出しに限らず、その他適宜の方法を用いてよい。例えば、該パーシャルベクトルＰＶＱに対応する所定長の部分的テンプレートを必要なだけ時間軸伸張する、あるいは限られた複数の部分的テンプレートをランダムに又は所定のシーケンスで組み合わせて当該エレメントについての全区間または必要な区間にわたって配置する、などの方法を用いてよい。
【００８１】
〔ビブラート合成の説明〕
ここでは、ビブラート合成の仕方についての新しいアイディアについていくつか説明する。
図１５は、ビブラート成分を持つボディ部の波形データをパーシャルベクトルＰＶＱの考え方を適用してデータ圧縮する例と、そのデコード例とを概略的に示す図である。（ａ）は、ビブラートを含むオリジナル波形Ａを例示する。このオリジナル波形においては、ビブラートの１周期において波形ピッチが変動しているのみならず、振幅も変動している。（ｂ）は（ａ）のオリジナル波形から分散的に複数の波形ａ１，ａ２，ａ３，ａ４を取り出した状態を例示する。これらの波形ａ１〜ａ４としては、波形形状（音色）がそれぞれ異なっているものを選び、また、１波長（波形１周期）を同じデータサイズ（アドレス数）としてそれぞれ１又は複数波で取り出す。これらの波形ａ１〜ａ４を部分的テンプレートデータ（つまりループ波形データ）としてテンプレートデータベースＴＤＢに記憶する。この読出し法は、各波形ａ１〜ａ４を順次ループ読出しすると共にクロスフェード合成することにより行う。
【００８２】
図１５（ｃ）はビブラート１周期の間にピッチが変動するピッチテンプレートを示している。なお、このピッチテンプレートのピッチ変化パターンは図示では高ピッチから始まって低ピッチに移行し、最後に高ピッチに戻るパターンであるが、これに限らず、他のパターン（例えば低ピッチから高ピッチに移行し、低ピッチに戻るパターンや、中間のピッチから始まって高ピッチ→低ピッチ→中間ピッチに戻るパターンなど）であってもよい。
【００８３】
図１５（ｄ）はループ読出した各波形ａ１〜ａ４に対するクロスフェード制御波形を例示している。（ｃ）のピッチテンプレートに従うピッチで最初は波形ａ１とａ２をそれぞれループ読出し（繰返し読出し）し、ループ読出した波形ａ１に対してはフェードアウト、ループ読出した波形ａ２に対してはフェードインの振幅制御をして両者を合成する。これにより、波形ａ１からａ２に向かってその波形形状がクロスフェードして順次変化していき、かつそのクロスフェード合成波形のピッチがピッチテンプレートに従うピッチで順次変化する。以下、同様に波形を順次切換えて、ａ２とａ３とで、次にａ３とａ４とで、次にａ４とａ１とで、クロスフェード合成をそれぞれ行う。
【００８４】
図１５（ｅ）は合成された波形データＡ’を示す。この波形データＡ’は、ビブラート１周期の間で、その波形形状が波形ａ１から順にａ４まで滑らかにクロスフェードされて変化していき、かつ、そのピッチはピッチテンプレートに従って変化していくことによりビブラートが付けられたものである。上記のようなビブラート１周期分の波形データＡ’の合成処理を繰り返すことにより、複数のビブラート周期にわたる波形データを合成することができる。その場合、（ｃ）に示すようなビブラート１周期分のピッチテンプレートを必要なビブラート周期数分だけループさせればよい。そのために、パーシャルベクトルＰＶＱの構造が階層的になっていてよい。すなわち、ビブラート１周期分の波形合成のために波形ａ１〜ａ４が上記のように個々にループ読出しされると共に、その全体（ビブラート１周期分）がピッチテンプレートのルーピングに従って更に繰り返されるような階層構造となっていてよい。
【００８５】
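Fig. 16 shows another example of vibrato synthesis. In this example, a plurality of waveforms a1 to a4, b1 to b4, c1 to c4 are extracted in a distributed manner from sections A, B, C spanning several vibrato periods of an original waveform containing vibrato. As before, waveforms with differing shapes (timbres) are chosen, and each is extracted as one or more cycles, with one wavelength (one waveform cycle) having the same data size (number of addresses). These waveforms a1 to a4, b1 to b4, c1 to c4 are stored in the template database TDB as partial template data. The read-out method is basically the same as in the previous example: the waveforms are loop-read in turn and cross-fade synthesized. The difference is that, in the example of Fig. 16, the temporal positions of the waveforms are interchanged so that the waveforms subject to cross-fade synthesis can be combined arbitrarily, which yields a wide variety of timbre-change variations in the vibrato.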
【００８６】
例えば、各波形ａ１〜ａ４，ｂ１〜ｂ４，ｃ１〜ｃ４の１ビブラート周期内における相対的時間位置は変えずに、これらの波形の位置の入れ替えを行うと、例えば、ａ１→ｂ２→ｃ３→ａ４→ｂ１→ｃ２→ａ３→ｂ４→ｃ１→ａ２→ｂ３→ｃ４というような波形位置の入れ替えパターンを得ることができる。このような波形位置の入れ替えパターンに従って上記図１５と同様のクロスフェード合成によるビブラート合成処理を行えば、オリジナルの波形位置パターンに従うクロスフェード合成によるビブラート合成処理によって得られるビブラートとは異なる音色変化からなるビブラートを得ることができる。なお、各波形ａ１〜ａ４，ｂ１〜ｂ４，ｃ１〜ｃ４の１ビブラート周期内における相対的時間位置は変えずに、これらの波形の位置の入れ替えを行うようにした理由は、入れ替えによる不自然さが生じないようにするためである。
このような波形位置の入れ替えパターンは、図１６に示した１２個の波形ａ１〜ａ４，ｂ１〜ｂ４，ｃ１〜ｃ４の場合、ビブラート１周期につき３の４乗＝８１通りの組合せがあり、ビブラート３周期では、８１の３乗の組合せがある。従って、ビブラートにおける波形音色変化のバリエーションが極めて多様なものとなる。どの組合せパターンを採用するかはランダム選択するようにすればよい。
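The count of 3 to the 4th power = 81 patterns per vibrato period, and the random choice of a pattern, can be checked with a short sketch. It assumes, as the text states, that each waveform keeps its relative position within the vibrato period and only the source section (a, b, or c) varies; the function names are hypothetical.

```python
import random
from itertools import product

SECTIONS = "abc"          # source sections A, B, C of the original waveform
POSITIONS = (1, 2, 3, 4)  # relative positions within one vibrato period


def patterns_per_period():
    """Enumerate every way of filling the four positions from the three sections."""
    return [tuple(f"{s}{p}" for s, p in zip(choice, POSITIONS))
            for choice in product(SECTIONS, repeat=len(POSITIONS))]


def random_pattern(periods, rng):
    """Pick a random section for each position over the given number of periods."""
    return [f"{rng.choice(SECTIONS)}{p}" for _ in range(periods) for p in POSITIONS]


per_period = patterns_per_period()
three_periods = len(per_period) ** 3
pattern = random_pattern(3, random.Random(0))
```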
【００８７】
図１５又は図１６に示すような手法で作成されたビブラート特性を持つ波形（例えば図１５（ｅ）のＡ’）あるいはその他の手法で作成されたビブラート特性を持つ波形に対しては、ピッチ（Ｐｉｔｃｈ）テンプレート、振幅（Ａｍｐ）テンプレート、時間（ＴＳＣ）テンプレートによって、そのビブラート特性を可変制御することができる。例えば、ピッチ（Ｐｉｔｃｈ）テンプレートによってビブラートの深さを制御することができ、振幅（Ａｍｐ）テンプレートによってビブラートと共に付加される振幅変調の深さを制御することができ、時間（ＴＳＣ）テンプレートによってビブラート１周期を構成する波形の時間長を伸縮制御することによりビブラートの速さを制御する（ビブラート周期を制御する）ことができる。
【００８８】
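In Fig. 15, for example, the time length of each cross-fade section shown in (d) can be subjected to time-axis stretch/compression control (TSC control) according to a desired time (TSC) template. When this TSC control is performed without changing the tone reproduction pitch (the rate of change of the waveform read address), the time length of one vibrato period is stretched or compressed, and the vibrato frequency is thereby controlled. If the TSC template is prepared for one vibrato period, like the pitch template shown in (c), that one-period TSC template need only be looped for the required number of vibrato periods. Furthermore, if the pitch (Pitch) template and the amplitude (Amp) template are also stretched or compressed along the time axis in step with the TSC-template-based time-axis control of the waveform, these tone factors can be time-axis controlled together.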
なお、ピッチテンプレートが示すピッチ変化エンベロープ特性を上下にシフトすることにより、ビブラート波形の楽音再生ピッチを可変制御することもできる。その場合、ＴＳＣテンプレートによる波形の時間軸制御は行わないようにすることにより、楽音再生ピッチにかかわらず、ビブラート１周期の時間長を一定に維持するよう制御することができる。
【００８９】
〔接続ルールＲＵＬＥの説明〕
次に、アーティキュレーションエレメント同士の接続の仕方を記述するルールデータＲＵＬＥの具体例について説明する。
各楽音要素別に、例えば、下記のような接続ルールがある。
（１）波形（Ｔｉｍｂｒｅ）テンプレートの接続ルール
ルール１：直接接続。プリセットされた奏法シーケンス（アーティキュレーションエレメントシーケンスＡＥＳＥＱ）のように、各アーティキュレーションエレメント同士の滑らかな接続が予め保証されている場合は、補間を行うことなく、直接的に接続することで問題ない。
ルール２：先行エレメントの波形Ａの終端部分を引き延ばした補間。この補間例は図１７（ａ）に示すような形態であり、先行エレメントの波形Ａの終端部分を引き延ばして接続用波形Ｃ１を合成する。後続エレメントの波形Ｂはそのまま使用し、先行エレメントの波形Ａの末尾に延びた接続用波形Ｃ１をフェードアウト、後続エレメントの波形Ｂの始まり部分をフェードインで、クロスフェード合成する。接続用波形Ｃ１は、先行エレメントの波形Ａの終端部分の１周期波形または複数周期波形を必要な長さだけ繰り返して形成する。
【００９０】
ルール３：後続エレメントの波形Ｂの先端部分を引き延ばした補間。この補間例は図１７（ｂ）に示すような形態であり、後続エレメントの波形Ｂの先端部分を引き延ばして接続用波形Ｃ２を合成する。先行エレメントの波形Ａはそのまま使用し、先行エレメントの波形Ａの終端部分をフェードアウト、接続用波形Ｃ２をフェードインで、クロスフェード合成する。この場合も、接続用波形Ｃ２は、後続エレメントの波形Ｂの先端部分の１周期波形または複数周期波形を必要な長さだけ繰り返して形成する。
ルール４：先行エレメントの波形Ａの終端部分と後続エレメントの波形Ｂの先端部分の双方を引き延ばした補間。この補間例は図１７（ｃ）に示すような形態であり、先行エレメントの波形Ａの終端部分を引き延ばして合成した接続用波形Ｃ１と、後続エレメントの波形Ｂの先端部分を引き延ばして合成した接続用波形Ｃ２とをクロスフェード合成する。なお、このルール４の場合は、Ｃ１とＣ２のクロスフェード合成期間の分だけ、合成された波形全体の時間が延びることになるので、ＴＳＣ制御によってその分だけ時間軸圧縮処理を施すものとする。
【００９１】
ルール５：図１７（ｄ）に示すように、先行エレメントの波形Ａと後続エレメントの波形Ｂとの間に、予め用意した接続用波形Ｃを挿入する。その際、先行エレメントの波形Ａの終端部分と後続エレメントの波形Ｂの先端部分は、接続用波形Ｃの分だけ一部除去する。あるいは、先行エレメントの波形Ａの終端部分と後続エレメントの波形Ｂの先端部分を削除することなく、接続用波形Ｃを挿入してもよいが、その場合は、合成された波形全体の時間が延びることになるので、ＴＳＣ制御によってその分だけ時間軸圧縮処理を施すものとする。
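Rule 6: As shown in Fig. 17(e), a connecting waveform C prepared in advance is inserted between waveform A of the preceding element and waveform B of the succeeding element; the ending portion of waveform A is cross-fade synthesized with the first half of the connecting waveform C, and the beginning portion of waveform B is cross-fade synthesized with the second half of C. Here too, if the overall duration of the synthesized waveform becomes longer or shorter, it is corrected by that amount through TSC control (time-axis compression or expansion).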
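As an illustration of the waveform connection rules, Rule 2 (extending the tail of the preceding waveform A into a connecting waveform C1 and cross-fading it with the start of B) might be sketched as below. The linear fade, the fixed cycle length, and the function name are assumptions and are not specified in the text.

```python
def connect_rule2(wave_a, wave_b, period, fade_len):
    """Repeat the last cycle of A to form C1, then fade C1 out while fading B in."""
    if fade_len > len(wave_b):
        raise ValueError("fade_len must not exceed the length of wave_b")
    last_cycle = wave_a[-period:]
    c1 = [last_cycle[i % period] for i in range(fade_len)]   # connecting waveform C1
    joined = list(wave_a)
    for i in range(fade_len):
        gain_in = i / fade_len                               # linear fade (assumed)
        joined.append((1.0 - gain_in) * c1[i] + gain_in * wave_b[i])
    joined.extend(wave_b[fade_len:])                         # rest of B is used as is
    return joined


# Hypothetical square-like test data with a two-sample cycle.
wave_a = [1.0, -1.0] * 4
wave_b = [0.5, -0.5] * 4
joined = connect_rule2(wave_a, wave_b, period=2, fade_len=4)
```

In this variant the total length equals len(A) + len(B), so no compensating TSC control is needed; Rules 4 to 6 lengthen or shorten the result and would need the time-axis correction described above.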
【００９２】
（２）その他のテンプレートの接続ルール
波形（Ｔｉｍｂｒｅ）テンプレート以外の他のテンプレート（振幅、ピッチ、時間）のデータは、エンベロープ波形状のシンプルな形態をとるので、２チャンネルのクロスフェード制御波形を使用した複雑な補間処理を使用せずに、もっとシンプルな補間処理で滑らかな接続を実現することができる。特に、エンベロープ波形状のテンプレートデータの補間合成にあたっては、補間結果を本来のテンプレートデータ値に対する差分値（正負符号付き）で生成するようにするのが好ましい。そうすれば、リアルタイムでテンプレートデータベースＴＤＢから読み出した本来のテンプレートデータ値に対して、補間結果たる差分値（正負符号付き）を加算するだけで、滑らかな接続のための補間演算を達成することができることになり、極めて簡単である。
ルール１：直接接続。この例を図１８（ａ）に示す。１番目のエレメントのテンプレート（エンベロープ波形）ＡＥ１の末尾と２番目のエレメントのテンプレート（エンベロープ波形）ＡＥ２−ａの先頭のレベルが一致しており、２番目のエレメントのテンプレート（エンベロープ波形）ＡＥ２−ａの末尾と３番目のエレメントのテンプレート（エンベロープ波形）ＡＥ３の先頭のレベルも一致しているので、補間の必要がない。
【００９３】
ルール２：接続個所前後の局所的な範囲でスムーズ化する補間処理を行う。この例を図１８（ｂ）に示す。１番目のエレメントのテンプレート（エンベロープ波形）ＡＥ１の終端部分と２番目のエレメントのテンプレート（エンベロープ波形）ＡＥ２−ｂの先端部分における所定の範囲ＣＦＴ１で、ＡＥ１からＡＥ２−ｂに滑らかに移行するように補間処理を行う。また、２番目のエレメントのテンプレート（エンベロープ波形）ＡＥ２−ｂの終端部分と３番目のエレメントのテンプレート（エンベロープ波形）ＡＥ３の先端部分における所定の範囲ＣＦＴ２で、ＡＥ２−ｂからＡＥ３に滑らかに移行するように補間処理を行う。
なお、補間の結果得られたデータＥ１’，Ｅ２’，Ｅ３’は、各エレメントの本来のテンプレート値（エンベロープ値）Ｅ１，Ｅ２，Ｅ３に対する差分値（正負符号付き）からなるものとする。そのようにすれば、前述の通り、リアルタイムでテンプレートデータベースＴＤＢから読み出した本来のテンプレートデータ値Ｅ１，Ｅ２，Ｅ３に対して、補間結果たる差分値Ｅ１’，Ｅ２’，Ｅ３’を加算するだけで、滑らかな接続のための補間演算を達成することができることになり、極めて簡単である。
【００９４】
このルール２の補間処理の具体例は、図１９（ａ）（ｂ）（ｃ）に示すように、複数通りのバリエーションがある。
図１９（ａ）の例では、先行エレメントＡＥｎの終了点のテンプレートデータ値ＥＰと後続エレメントＡＥｎ＋１の開始点のテンプレートデータ値ＳＰとの中間のレベルＭＰを目標値として、先行エレメントＡＥｎの終端部分の補間領域ＲＣＦＴにおいて、該先行エレメントＡＥｎのテンプレートデータ値を目標値ＭＰに漸近させるよう補間を行う。その結果、先行エレメントＡＥｎのテンプレートデータの軌跡が、本来のラインＥ１からＥ１’に示すように変わる。また、後続エレメントＡＥｎ＋１の先端部分の補間領域ＦＣＦＴにおいて、該後続エレメントＡＥｎ＋１のテンプレートデータ値を上記中間値ＭＰから開始させ、ラインＥ２で示す本来のテンプレートデータ値の軌跡に漸近させるよう補間を行う。その結果、補間領域ＦＣＦＴにおける後続エレメントＡＥｎ＋１のテンプレートデータ値の軌跡がラインＥ２’に示すように本来の軌跡Ｅ２に漸近する。
【００９５】
図１９（ｂ）の例では、後続エレメントＡＥｎ＋１の開始点のテンプレートデータ値ＳＰを目標値として、先行エレメントＡＥｎの終端部分の補間領域ＲＣＦＴにおいて、該先行エレメントＡＥｎのテンプレートデータ値を目標値ＳＰに漸近させるよう補間を行う。その結果、先行エレメントＡＥｎのテンプレートデータの軌跡が、本来のラインＥ１からＥ１’’に示すように変わる。この場合は、後続エレメントＡＥｎ＋１の先端部分の補間領域ＦＣＦＴは存在しない。
図１９（ｃ）の例では、後続エレメントＡＥｎ＋１の先端部分の補間領域ＦＣＦＴにおいて、該後続エレメントＡＥｎ＋１のテンプレートデータ値を上記先行エレメントＡＥｎの終了点のテンプレートデータ値ＥＰから開始させ、ラインＥ２で示す本来のテンプレートデータ値の軌跡に漸近させるよう補間を行う。その結果、補間領域ＦＣＦＴにおける後続エレメントＡＥｎ＋１のテンプレートデータ値の軌跡がラインＥ２’’に示すように本来の軌跡Ｅ２に漸近する。この場合は、先行エレメントＡＥｎの後端部分の補間領域ＲＣＦＴは存在しない。
図１９においても、補間の結果得た各軌跡Ｅ１’，Ｅ２’，Ｅ１’’，Ｅ２’’を示すデータは、本来のテンプレートデータ値Ｅ１，Ｅ２に対する差分値からなるものとする。
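The Fig. 19(a) variant, expressed as signed difference values that are simply added to the original template data, can be sketched as follows. The text only says the values "approach" the target, so the linear ramp used here is an assumption, as are the function and variable names.

```python
def midpoint_deltas(prev_env, next_env, rcft, fcft):
    """Return signed difference values for the preceding and succeeding envelopes.

    rcft / fcft are the lengths (in samples) of the interpolation regions at the
    end of the preceding element and at the start of the succeeding element.
    """
    ep, sp = prev_env[-1], next_env[0]
    mp = (ep + sp) / 2.0                          # intermediate target level MP
    d_prev = [0.0] * len(prev_env)
    for k in range(rcft):
        weight = (k + 1) / rcft                   # grows to 1 at the junction
        d_prev[len(prev_env) - rcft + k] = weight * (mp - ep)
    d_next = [0.0] * len(next_env)
    for k in range(fcft):
        weight = 1.0 - k / fcft                   # starts at 1, decays away from the junction
        d_next[k] = weight * (mp - sp)
    return d_prev, d_next


e1 = [1.0] * 6            # hypothetical original envelope of element AEn
e2 = [3.0] * 6            # hypothetical original envelope of element AEn+1
d1, d2 = midpoint_deltas(e1, e2, rcft=2, fcft=2)
e1_smoothed = [v + d for v, d in zip(e1, d1)]     # original value + difference
e2_smoothed = [v + d for v, d in zip(e2, d2)]
```

Setting `fcft` to zero with the target moved to SP, or `rcft` to zero with the start moved to EP, would correspond to the one-sided variants of Figs. 19(b) and (c).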
【００９６】
ルール３：エレメントの全区間にわたってスムーズ化する補間処理を行う。この例を図１８（ｃ）に示す。１番目のエレメントのテンプレート（エンベロープ波形）ＡＥ１と、３番目のエレメントのテンプレート（エンベロープ波形）ＡＥ３は変更せずに、その中間の２番目のエレメントのテンプレート（エンベロープ波形）ＡＥ２−ｂのデータを全体的に補間し、その先端は１番目のエレメントのテンプレート（エンベロープ波形）ＡＥ１の末尾に一致し、その終端は３番目のエレメントのテンプレート（エンベロープ波形）ＡＥ３の先頭に一致するようにする。なお、この場合も、補間の結果得られたデータＥ２’は、本来のテンプレート値（エンベロープ値）Ｅ２に対する差分値（正負符号付き）からなるものとする。
このルール３の補間処理の具体例は、図２０（ａ）（ｂ）（ｃ）に示すように、複数通りのバリエーションがある。
図２０（ａ）は、中間のエレメントＡＥｎのみで補間を行う例を示している。Ｅ１は、該エレメントＡＥｎのテンプレートデータ値の本来の軌跡を示す。先行するエレメントＡＥｎ−１の終了点のテンプレートデータ値ＥＰ０と中間のエレメントＡＥｎの本来の開始点のテンプレートデータ値ＳＰとの差に応じて、該エレメントＡＥｎのテンプレートデータ値の軌跡Ｅ１をシフトして、軌跡ＥａからなるテンプレートデータをエレメントＡＥｎの全区間に対応して作成する。また、中間のエレメントＡＥｎの本来の終了点のテンプレートデータ値ＥＰと後続するエレメントＡＥｎ＋１の開始点のテンプレートデータ値ＳＰ１との差に応じて、該エレメントＡＥｎのテンプレートデータ値の軌跡Ｅ１をシフトして、軌跡ＥｂからなるテンプレートデータをエレメントＡＥｎの全区間に対応して作成する。次に、軌跡Ｅａのテンプレートデータと軌跡Ｅｂのテンプレートデータとを、ＥａからＥｂに滑らかに変化するようにクロスフェード補間し、軌跡Ｅ１’からなる補間済みのテンプレートデータをエレメントＡＥｎの全区間に対応して得る。
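The Fig. 20(a) procedure (shift the original trajectory E1 to obtain Ea and Eb, then cross-fade from Ea to Eb over the whole element) reduces to a position-dependent offset, which the sketch below returns as difference values. The linear cross-fade weighting and all names are assumptions.

```python
def whole_span_deltas(e1, ep0, sp1):
    """Difference values for the middle element AEn.

    Ea = E1 shifted so that its start equals ep0 (end of the preceding element);
    Eb = E1 shifted so that its end equals sp1 (start of the succeeding element);
    the result cross-fades linearly from Ea to Eb over the whole element.
    """
    n = len(e1)
    if n < 2:
        raise ValueError("the envelope needs at least two points")
    shift_a = ep0 - e1[0]
    shift_b = sp1 - e1[-1]
    return [(1.0 - i / (n - 1)) * shift_a + (i / (n - 1)) * shift_b for i in range(n)]


e1 = [2.0, 3.0, 4.0, 3.0, 2.0]      # hypothetical original trajectory E1
deltas = whole_span_deltas(e1, ep0=1.0, sp1=4.0)
e1_interpolated = [v + d for v, d in zip(e1, deltas)]
```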
【００９７】
図２０（ｂ）は、中間のエレメントＡＥｎの全区間でデータ変更を行うと共に、中間のエレメントＡＥｎの終端部分の所定領域ＲＣＦＴと後続エレメントＡＥｎ＋１の先端部分の所定領域ＦＣＦＴとにおいて補間を行う例を示している。
まず、上記と同様に、先行するエレメントＡＥｎ−１の終了点のテンプレートデータ値ＥＰ０と中間のエレメントＡＥｎの本来の開始点のテンプレートデータ値ＳＰとの差に応じて、該エレメントＡＥｎのテンプレートデータ値の軌跡Ｅ１をシフトして、軌跡ＥａからなるテンプレートデータをエレメントＡＥｎの全区間に対応して作成する。
【００９８】
次に、この軌跡Ｅａの終了点のテンプレートデータ値ＥＰａと後続エレメントＡＥｎ＋１の開始点のテンプレートデータ値ＳＰとの中間のレベルＭＰａを目標値として、先行エレメントＡＥｎの終端部分の所定領域ＲＣＦＴにおいて、該先行エレメントＡＥｎの軌跡Ｅａのテンプレートデータ値を目標値ＭＰａに漸近させるよう補間を行う。その結果、先行エレメントＡＥｎのテンプレートデータの軌跡Ｅａが、本来の軌跡からＥａ’に示すように変わる。また、後続エレメントＡＥｎ＋１の先端部分の所定領域ＦＣＦＴにおいて、該後続エレメントＡＥｎ＋１のテンプレートデータ値を上記中間値ＭＰａから開始させ、ラインＥ２で示す本来のテンプレートデータ値の軌跡に漸近させるよう補間を行う。その結果、補間領域ＦＣＦＴにおける後続エレメントＡＥｎ＋１のテンプレートデータ値の軌跡がラインＥ２’に示すように本来の軌跡Ｅ２に漸近する。
【００９９】
図２０（ｃ）は、中間のエレメントＡＥｎの全区間でデータ変更を行うと共に、先行エレメントＡＥｎ−１の終端部分の所定領域ＲＣＦＴと中間エレメントＡＥｎの先端部分の所定領域ＦＣＦＴとにおいて補間を行い、かつ、中間のエレメントＡＥｎの終端部分の所定領域ＲＣＦＴと後続エレメントＡＥｎ＋１の先端部分の所定領域ＦＣＦＴとにおいて補間を行う例を示している。
まず、中間のエレメントＡＥｎのテンプレートデータ値の本来の軌跡Ｅ１を適当なオフセット量ＯＦＳＴだけシフトして、軌跡ＥｃからなるテンプレートデータをエレメントＡＥｎの全区間に対応して作成する。
【０１００】
次に、先行エレメントＡＥｎ−１の終端部分の所定領域ＲＣＦＴと中間エレメントＡＥｎの先端部分の所定領域ＦＣＦＴとにおいて、両者のテンプレートデータの軌跡Ｅ０とＥｃとが滑らかにつながるように補間処理を行い、補間結果としての軌跡Ｅ０’とＥｃ’とを該補間領域において得る。また、中間エレメントＡＥｎの終端部分の所定領域ＲＣＦＴと後続エレメントＡＥｎ＋１の先端部分の所定領域ＦＣＦＴとにおいて、両者のテンプレートデータの軌跡ＥｃとＥ２とが滑らかにつながるように補間処理を行い、補間結果としての軌跡Ｅｃ’’とＥ２’’とを該補間領域において得る。
図２０においても、補間の結果得た各軌跡Ｅ１’，Ｅａ，Ｅａ’，Ｅ２’，Ｅｃ，Ｅｃ’，Ｅｃ’’，Ｅ０’を示すデータは、本来のテンプレートデータ値Ｅ１，Ｅ２，Ｅ０に対する差分値からなるものとする。
【０１０１】
〔接続処理を含む楽音合成処理の概念的説明〕
図２１は、各楽音要素に対応するテンプレートデータ毎に上述の接続処理を行い、接続処理済みのテンプレートデータに基づき楽音合成処理を行うようにした楽音合成装置の構成を概念的に説明するブロック図である。
テンプレートデータ供給ブロックＴＢ１，ＴＢ２，ＴＢ３，ＴＢ４では、それぞれ、先行するアーティキュレーションエレメントに関する波形テンプレートデータＴｉｍｂ−Ｔｎ，振幅テンプレートデータＡｍｐ−Ｔｎ，ピッチテンプレートデータＰｉｔ−Ｔｎ，時間テンプレートデータＴＳＣ−Ｔｎと、後続するアーティキュレーションエレメントに関する波形テンプレートデータＴｉｍｂ−Ｔｎ＋１，振幅テンプレートデータＡｍｐ−Ｔｎ＋１，ピッチテンプレートデータＰｉｔ−Ｔｎ＋１，時間テンプレートデータＴＳＣ−Ｔｎ＋１を供給する。
【０１０２】
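The rule decode processing blocks RB1, RB2, RB3, RB4 decode the connection rules TimbRULE, AmpRULE, PitRULE, TSCRULE provided for each tone factor of the articulation element concerned, and carry out the connection processing described with reference to Figs. 17 to 20 in accordance with the decoded rules. For example, the rule decode processing block RB1 for the waveform template performs the processing needed to carry out the connection (direct connection or cross-fade interpolation) described with reference to Fig. 17.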
【０１０３】
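The rule decode processing block RB2 for the amplitude template likewise performs the processing needed to carry out the connection (direct connection or interpolation) described with reference to Figs. 18 to 20. Since the interpolation result is given as signed difference values, as noted above, the interpolation data consisting of difference values output from block RB2 is added, in adder AD2, to the original template data values supplied from the template data supply block TB2. For the same reason, adders AD3 and AD4 are provided to add the outputs of the other rule decode processing blocks RB3 and RB4 to the original template data values supplied from the template data supply blocks TB3 and TB4, respectively.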
【０１０４】
こうして、各加算部ＡＤ２，ＡＤ３，ＡＤ４からは、隣接するエレメント間での所要の接続処理を施してなるテンプレートデータＡｍｐ，Ｐｉｔｃｈ，ＴＳＣがそれぞれ出力される。ピッチ制御ブロックＣＢ３は、ピッチテンプレートデータＰｉｔｃｈに従って波形読出し速度を制御するものである。波形テンプレートそのものがオリジナルのピッチ情報を含んでいるため、ラインＬ１を介して該オリジナルのピッチ情報（オリジナルのピッチエンベロープ）をデータベースから受け取り、該オリジナルのピッチエンベロープとピッチテンプレートデータＰｉｔｃｈとの偏差で波形読出し速度を制御する。例えば、オリジナルのピッチエンベロープとピッチテンプレートデータＰｉｔｃｈとが同じ場合は、一定の波形読出し速度で読出しを行えばよいし、オリジナルのピッチエンベロープとピッチテンプレートデータＰｉｔｃｈとが異なっている場合はその偏差分だけ波形読出し速度を可変制御すればよい。また、ピッチ制御ブロックＣＢ３は、ノート指示データを受け付け、該ノート指示データによっても波形読出し速度を制御する。例えば、波形テンプレートデータのオリジナルのピッチがノートＣ４のピッチを基本としているとし、ノートＤ４の音もこのノートＣ４のオリジナルピッチを持つ波形テンプレートデータを利用して発生するものとすると、ノート指示データのノートＤ４とオリジナルのピッチのノートＣ４との偏差に応じて波形読出し速度を制御することとなる。このようなピッチ制御の細部は、公知技術を応用できるため、特に詳しく説明しない。
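The read-speed control performed by pitch control block CB3 can be illustrated numerically. The sketch assumes twelve-tone equal temperament, MIDI note numbers (C4 = 60, D4 = 62), and pitch envelopes expressed in cents; none of these conventions, nor the function name, is stated in the specification.

```python
def read_rate(note, original_note, template_cents=0.0, original_cents=0.0):
    """Waveform read-speed ratio relative to playback at the original pitch.

    The note deviation and the deviation between the pitch template and the
    original pitch envelope are combined in cents, then converted to a ratio.
    """
    cents = 100.0 * (note - original_note) + (template_cents - original_cents)
    return 2.0 ** (cents / 1200.0)


C4, D4 = 60, 62                               # assumed MIDI note numbers
same = read_rate(C4, C4, 30.0, 30.0)          # template equals original envelope
d4_from_c4 = read_rate(D4, C4)                # D4 produced from a C4-based template
```

When the template and the original envelope coincide, the ratio is 1 and the waveform is read at a constant speed, which matches the first case described above.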
【０１０５】
波形アクセス制御ブロックＣＢ１では、基本的には、ピッチ制御ブロックＣＢ３から出力される波形読出し速度制御情報に応じて、波形テンプレートデータの各サンプルを順次読み出す。このとき、時間テンプレートデータとして与えられるＴＳＣ制御情報に従って波形読出し態様を制御し、発生音のピッチはピッチ制御ブロックＣＢ３から与えられる波形読出し速度制御情報に応じて決定しつつ、トータルの波形読出し時間はＴＳＣ制御情報に従って可変制御されるようにする。例えば、オリジナルの波形データの時間長よりも発音時間長を伸張する場合は、波形読出し速度はそのままにして、一部の波形部分が重複して読み出されるようにすれば、所望のピッチを維持しつつ発音時間長を伸張することができる。また、オリジナルの波形データの時間長よりも発音時間長を圧縮する場合は、波形読出し速度はそのままにして、一部の波形部分が飛び越されて読み出されるようにすれば、所望のピッチを維持しつつ発音時間長を圧縮することができる。
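The waveform access control block CB1 and the cross-fade control block CB2 perform the processing needed to carry out the connection (direct connection or cross-fade interpolation) described with reference to Fig. 17, in accordance with the output of the rule decode processing block RB1 for the waveform template. The cross-fade control block CB2 is also used when partial waveform templates are cross-faded while being loop-read according to a partial vector PVQ, and when waveform joins are smoothed during the TSC control described above.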
【０１０６】
振幅制御ブロックＣＢ４は、発生された波形データに対して振幅テンプレートＡｍｐに応じた振幅エンベロープを付与する。この場合も、波形テンプレートそのものがオリジナルの振幅エンベロープ情報を含んでいるため、ラインＬ２を介して該オリジナルの振幅エンベロープ情報をデータベースから受け取り、該オリジナルの振幅エンベロープと振幅テンプレートデータＡｍｐとの偏差で波形データの振幅を制御する。例えば、オリジナルの振幅エンベロープと振幅テンプレートデータＡｍｐとが同じ場合は、振幅制御ブロックＣＢ４では実質的な振幅制御を行わずに波形データを素通りさせるだけでよい。オリジナルの振幅エンベロープと振幅テンプレートデータＡｍｐとが異なっている場合はその偏差分だけ振幅レベルを可変制御すればよい。
【０１０７】
〔楽音合成装置の具体例〕
図２２は、この発明の実施例に係る楽音合成装置のハードウェア構成例を示すブロック図である。この楽音合成装置は、電子楽器あるいはカラオケ装置又は電子ゲーム装置又はその他のマルチメディア機器又はパーソナルコンピュータ等、任意の製品応用形態をとっていてよい。
図２２に示す構成によれば、ソフトウェア音源を利用してこの発明の実施例に係る楽音合成処理を実行する。この発明に係る楽音データの作成及び楽音合成処理を実現するようにソフトウェアシステムを構築すると共に、付属のメモリ装置に所要のデータベースＤＢを構築する、若しくは外部（ホスト）において構築されたデータベースＤＢに通信回線を介してアクセスする、といった実施形態をとる。
【０１０８】
図２２の楽音合成装置においては、メイン制御部としてＣＰＵ（中央処理部）１０を使用し、このＣＰＵ１０の制御の下で、この発明に係る楽音データの作成及び楽音合成処理を実現するソフトウェアのプログラムを実行すると共に、ソフトウェア音源のプログラムを実行する。勿論、ＣＰＵ１０は、更にはその他の適宜のプログラムも、並行して実行することができる。
ＣＰＵ１０には、ＲＯＭ（リードオンリーメモリ）１１，ＲＡＭ（ランダムアクセスメモリ）１２，ハードディスク装置１３，第１のリムーバブルディスク装置（例えばＣＤ−ＲＯＭドライブ若しくはＭＯドライブ）１４，第２のリムーバブルディスク装置（例えばフロッピーディスクドライブ）１５，表示器１６，キーボード及びマウス等の入力操作装置１７，波形インタフェース１８，タイマ１９，ネットワークインタフェース２０，ＭＩＤＩインタフェース２１等が、データ及びアドレスバス２２を介して接続されている。
【０１０９】
図２３は、波形インタフェース１８の詳細例とＲＡＭ１２内の波形バッファの構成例を示している。波形インタフェース１８は、波形データの取り込み（サンプリング）と出力の両方を制御するものであり、外部からマイクロフォン等によって入力された波形データをサンプリングしてアナログ／ディジタル変換するアナログ／ディジタル変換器（ＡＤＣ）２３と、サンプリングのための第１のＤＭＡＣ（ダイレクトメモリアクセスコントローラ）２４と、所定の周波数のサンプリングクロックＦｓを発生するサンプリングクロック発生回路２５と、波形データの出力を制御する第２のＤＭＡＣ（ダイレクトメモリアクセスコントローラ）２６と、出力波形データをディジタル／アナログ変換するディジタル／アナログ変換器（ＤＡＣ）２７とを含んでいる。なお、第２のＤＭＡＣ２６は、サンプリングクロックＦｓに基づき絶対時刻情報を作成し、ＣＰＵのバス２２に与える働きもする。
【０１１０】
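The RAM 12 contains a plurality of waveform buffers W-BUF. One waveform buffer W-BUF has the storage capacity (number of addresses) to hold one frame of waveform sample data. For example, if the reproduction sampling frequency based on the sampling clock Fs is 48 kHz and one frame section lasts 10 ms, one waveform buffer W-BUF has capacity for 480 waveform samples. At least two waveform buffers W-BUF (A, B) are used: while one is in read mode and accessed by the DMAC 26 of the waveform interface 18, the other is in write mode and receives the generated waveform sample data. In the tone synthesis program of this embodiment, one frame of waveform sample data, consisting of a plurality of samples, is generated in a batch for each tone synthesis channel, and the waveform sample data of each channel is added (accumulated) into the corresponding sample positions (address positions) of the one waveform buffer W-BUF that is in write mode. For example, if one frame consists of 480 samples, the 480 samples for the first tone synthesis channel are computed in a batch and stored at the respective sample positions of the waveform buffer W-BUF. Next, the 480 samples for the second tone synthesis channel are computed in a batch and added (accumulated) into the respective sample positions of the same waveform buffer W-BUF. The same applies to the remaining channels. Consequently, when generation of one frame of waveform sample data has finished for all channels, each sample position of the waveform buffer W-BUF in write mode holds the total waveform sample data obtained by accumulating, sample by sample, the waveform sample data of all channels. For example, one frame of total waveform sample data is first written to waveform buffer A, and the next frame is then written to waveform buffer B. As soon as writing is finished, waveform buffer A shifts to read mode from the start of the next frame section and is read out regularly during that frame section at the predetermined reproduction sampling period based on the sampling clock Fs. Basically, therefore, it suffices to alternate the read/write modes of the two waveform buffers W-BUF (A, B); if a margin is desired so that writing can run several frames ahead, three or more waveform buffers W-BUF (A, B, C, ...) may be used.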
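The frame size arithmetic (48 kHz × 10 ms = 480 samples), the per-channel accumulation, and the alternation of buffers A and B can be sketched as follows. The channel callables and function names are hypothetical; a real implementation would hand the filled buffer to the DMA controller rather than record its name in a list.

```python
FS_HZ = 48_000
FRAME_MS = 10
FRAME_LEN = FS_HZ * FRAME_MS // 1000      # samples held by one waveform buffer


def render_frame(channels, frame_no):
    """Accumulate one frame from every channel into a freshly cleared buffer."""
    buf = [0.0] * FRAME_LEN
    for channel in channels:
        for i, sample in enumerate(channel(frame_no)):
            buf[i] += sample              # add into the same sample position
    return buf


def run(channels, n_frames):
    """Alternate buffers A and B; return (frame, buffer name) pairs in playback order."""
    buffers = {"A": None, "B": None}
    write, read = "A", "B"
    playback = []
    for frame_no in range(1, n_frames + 1):
        if buffers[read] is not None:
            playback.append((frame_no, read))     # read out during this frame
        buffers[write] = render_frame(channels, frame_no)
        write, read = read, write                 # swap roles for the next frame
    return playback


# Two hypothetical channels producing constant-valued frames.
channels = [lambda fr: [0.25] * FRAME_LEN, lambda fr: [0.5] * FRAME_LEN]
mixed = render_frame(channels, 1)
order = run(channels, 3)
```

The playback order shows the one-frame latency inherent in this scheme: data generated in frame 1 is heard in frame 2.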
【０１１１】
ＣＰＵ１０の制御の下で、この発明に係る楽音データの作成及び楽音合成処理を実現するソフトウェアプログラムは、ＲＯＭ１１，ＲＡＭ１２あるいはハードディスク装置１３あるいはリムーバブルディスク装置１４，１５のいずれに記憶しておくようにしてもよい。また、ネットワークインタフェース２０を介して通信ネットワークに接続し、外部のサーバコンピュータ（図示せず）から、上記“この発明に係る楽音データの作成及び楽音合成処理を実現するプログラム”やデータベースＤＢのデータ等を受け取って、内部のＲＡＭ１２又はハードディスク１３又はリムーバブルディスク装置１４，１５等に格納するようにしてもよい。ＣＰＵ１０は、例えばＲＡＭ１２に記憶されている“この発明に係る楽音データの作成及び楽音合成処理を実現するプログラム”を実行して、奏法シーケンスに従う楽音を合成し、合成した楽音波形データをＲＡＭ１２内の波形バッファＷ−ＢＵＦに一時記憶する。ＤＭＡＣ２６の制御によって、ＲＡＭ１２内の波形バッファＷ−ＢＵＦから波形データを読み出してディジタル／アナログ変換器（ＤＡＣ）２７に送り、Ｄ／Ａ変換する。Ｄ／Ａ変換された楽音波形データはサウンドシステム（図示せず）に与えられ、空間的に発音される。
【０１１２】
図８（ａ）に示したように、ＭＩＤＩデータからなる自動演奏シーケンスデータの中に本発明に従う奏法シーケンス（アーティキュレーションエレメントシーケンスＡＥＳＥＱ）のデータが組み込まれているものとして以下説明を行う。なお、図８（ａ）では特に詳しく述べなかったが、奏法シーケンス（アーティキュレーションエレメントシーケンスＡＥＳＥＱ）のデータは、ＭＩＤＩフォーマットの形態で、例えばＭＩＤＩのエクスクルーシブデータとして組み込むことができる。
【０１１３】
図２４は、ＭＩＤＩフォーマットの演奏データに基づいてソフトウェア音源によって実行される楽音生成処理の概略を示すタイムチャートである。（ａ）に示す「演奏タイミング」は、ＭＩＤＩのノートオンイベントやノートオフイベントあるいはその他のイベント（図８（ａ）におけるＥＶＥＮＴ（ＭＩＤＩ））、及びアーティキュレーションエレメントシーケンスイベント（図８（ａ）におけるＥＶＥＮＴ（ＡＥＳＥＱ））などの各イベント＃１〜＃４の発生タイミングを例示している。（ｂ）は、波形サンプルデータの生成演算を行うタイミング（「波形生成」）と、その再生タイミング（「波形再生」）との関係を例示するものである。上段の「波形生成」の欄は、各楽音合成チャンネル毎に１フレーム分の複数サンプルからなる波形サンプルデータを一括して生成して書き込みモードとなっている１つの波形バッファＷ−ＢＵＦの各サンプル位置（アドレス位置）に各チャンネルの波形サンプルデータを足し込む（アキュムレートする）処理が行われるタイミングを例示している。下段の「波形再生」の欄は、１フレーム区間の間でサンプリングクロックＦｓに基づく所定の再生サンプリング周期で波形バッファＷ−ＢＵＦから波形サンプルデータを規則的に読み出す処理を行うタイミングを示している。それぞれに付記したＡ，Ｂの表示は、書き込み又は読み出しの対象となっている波形バッファＷ−ＢＵＦがどれであるかを区別する記号である。ＦＲ１，ＦＲ２，ＦＲ３，…は、仮に付けた各フレームの番号である。例えば、フレームＦＲ１のときに波形生成演算がなされた或る１フレーム分の波形サンプルデータがＡの波形バッファＷ−ＢＵＦに書き込まれ、これが、次のフレームＦＲ２において該Ａの波形バッファＷ−ＢＵＦから読み出される。次の１フレーム分の波形サンプルデータはフレームＦＲ２において生成演算がなされ、Ｂの波形バッファＷ−ＢＵＦに書き込まれる。このＢの波形バッファＷ−ＢＵＦに記憶した１フレーム分の波形サンプルデータが、更に次のフレームＦＲ３において該Ｂの波形バッファＷ−ＢＵＦから読み出される。（ａ）に示すイベント＃１，＃２，＃３は、１フレームの時間内で起こっており、これらのイベント＃１，＃２，＃３に対応する波形サンプルデータの生成演算は、（ｂ）のフレームＦＲ３において開始される。従って、これらのイベント＃１，＃２，＃３に対応する楽音の立上り（発音開始）は、その次のフレームＦＲ４において開始される。Δｔは、ＭＩＤＩ演奏データとして与えられたイベント＃１，＃２，＃３の発生タイミングと、それに対応する楽音が発音開始されるタイミングとのずれを示している。この時間ずれΔｔは、１乃至数フレーム分だけなので、聴感上問題ない。なお、発音開始時の波形サンプルデータは、波形バッファＷ−ＢＵＦの初めから書き込まれるのではなく、開始時点に対応する波形バッファＷ−ＢＵＦの所定の途中の位置から書き込まれるようになっている。
【０１１４】
なお、「波形生成」における波形サンプルデータの生成演算の方式は、通常のＭＩＤＩのノートオンイベントに基づく自動演奏音（これを「通常演奏」音ということにする）と、アーティキュレーションエレメントシーケンスＡＥＳＥＱのオンイベントに基づく演奏音（これを「奏法演奏」音ということにする）とでは、異なっている。通常のＭＩＤＩのノートオンイベントに基づく「通常演奏」処理と、アーティキュレーションエレメントシーケンスＡＥＳＥＱのオンイベントに基づく「奏法演奏」処理は、図２９及び図３０に示すような、それぞれ別々の処理ルーチンで実行される。例えば、伴奏パートを通常のＭＩＤＩのノートオンイベントに基づく「通常演奏」で行い、特定のソロ演奏パートをアーティキュレーションエレメントシーケンスＡＥＳＥＱに基づく「奏法演奏」で行う、といった使い分けを行うと、効果的である。
【０１１５】
図２５は、本発明に従う奏法シーケンス（アーティキュレーションエレメントシーケンスＡＥＳＥＱ）のデータに基づく「奏法演奏」処理（アーティキュレーションエレメントの楽音合成処理）の概略を示すタイムチャートである。「フレーズ準備コマンド」と「フレーズスタートコマンド」は、図８（ａ）に示すように「アーティキュレーションエレメントシーケンスイベントＥＶＥＮＴ（ＡＥＳＥＱ）」として、ＭＩＤＩ演奏データの中に含まれているものである。すなわち、１つのアーティキュレーションエレメントシーケンスＡＥＳＥＱ（図２５では「フレーズ」と称している）のイベントデータは、「フレーズ準備コマンド」と「フレーズスタートコマンド」とからなっている。先行するイベントデータである「フレーズ準備コマンド」は、再生すべきアーティキュレーションエレメントシーケンスＡＥＳＥＱ（すなわちフレーズ）を指定し、その再生を行う準備をすべきことを指示するもので、当該アーティキュレーションエレメントシーケンスＡＥＳＥＱの発音開始時点よりも所定時間だけ先行して与えられる。ブロック３０で示した「準備処理」のプロセスでは、「フレーズ準備コマンド」に応じて、指定されたアーティキュレーションエレメントシーケンスＡＥＳＥＱを再生するために必要なすべてのデータをデータベースＤＢから取り出し、ＲＡＭ１２の所定のバッファエリアにダウンロードし、該アーティキュレーションエレメントシーケンスＡＥＳＥＱを展開して即座に該アーティキュレーションエレメントシーケンスの再生処理が行えるように、必要な準備を行う。また、この「準備処理」のプロセスでは、指定されたアーティキュレーションエレメントシーケンスＡＥＳＥＱを解釈し、相前後するアーティキュレーションエレメントを接続するルール等を設定若しくは決定して、必要な接続制御データ等を形成する処理も行う。例えば、指定されたアーティキュレーションエレメントシーケンスＡＥＳＥＱが、図示のように５つのアーティキュレーションエレメントＡＥ＃１〜ＡＥ＃５からなるとすると、それぞれの接続箇所（接続１〜接続４として指摘した箇所）における接続ルールを確定し、そのための接続制御データを形成する。また、各アーティキュレーションエレメントＡＥ＃１〜ＡＥ＃５の開始時刻を示すデータを、フレーズ開始時からの相対時間表現で準備する。「フレーズ準備コマンド」に後続するイベントデータである「フレーズスタートコマンド」は、当該アーティキュレーションエレメントシーケンスＡＥＳＥＱの発音開始を指示するものである。この「フレーズスタートコマンド」に応じて、前記「準備処理」で準備された各アーティキュレーションエレメントＡＥ＃１〜ＡＥ＃５を順次再生する。すなわち各アーティキュレーションエレメントＡＥ＃１〜ＡＥ＃５の開始時刻が到来したら、該当するアーティキュレーションエレメントＡＥ＃１〜ＡＥ＃５の再生を開始し、かつ、それぞれの接続箇所（接続１〜接続４）で、予め準備した接続制御データに従って、先行するアーティキュレーションエレメントＡＥ＃１〜ＡＥ＃４に滑らかに接続されるように所定の接続処理を施す。
【０１１６】
図２６は、図２２のＣＰＵ１０が実行する楽音合成処理のメインルーチンを示すフローチャートである。このメインルーチンの「自動演奏処理」によって、自動演奏シーケンスデータのイベントに基づく処理が行われる。まず、ステップＳ５０では、ＲＡＭ１２上での各種バッファ領域の確保等、必要な各種の初期設定処理を行う。次に、ステップＳ５１では、下記の各起動要因が発生しているか否かのチェックを行う。
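Trigger ①: MIDI performance data or other communication input data has been received via the interfaces 20, 21.
Trigger ②: The automatic performance processing timing has arrived. This timing occurs regularly so that the time of the next event in the automatic performance can be checked.
Trigger ③: The per-frame waveform generation timing has arrived. So that waveform sample data is generated in one-frame batches, this timing occurs once per frame period (for example, at the end of a frame section).
Trigger ④: A switch operation has been performed on the keyboard, mouse, or the like of the input operation device 17 (excluding an operation instructing termination of the main routine).
Trigger ⑤: An interrupt request has been received from the disk drives 13 to 15 or the display 16.
Trigger ⑥: An operation instructing termination of the main routine has been performed on the input operation device 17.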
【０１１７】
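In step S52, it is determined whether any of the triggers ① to ⑥ has occurred. If NO, steps S51 and S52 are repeated; if YES, step S53 determines which trigger has occurred. If trigger ① has occurred, the predetermined "communication input processing" is performed in step S54. If trigger ② has occurred, the predetermined "automatic performance processing" (an example is shown in Fig. 27) is performed in step S55. If trigger ③ has occurred, the predetermined "tone generator processing" (an example is shown in Fig. 28) is performed in step S56. If trigger ④ has occurred, the predetermined "SW processing" (processing corresponding to the operated switch) is performed in step S57. If trigger ⑤ has occurred, the predetermined "other processing" (processing in response to the interrupt request) is performed in step S58. If trigger ⑥ has occurred, the predetermined "termination processing" (processing that ends this main routine) is performed in step S59.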
【０１１８】
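If step S53 determines that two or more of the triggers ① to ⑥ have occurred simultaneously, they are processed in a predetermined priority order (for example, in the order ①, ②, ③, ④, ⑤, ⑥). Some processes may share equal priority. Steps S51 to S53 are a schematic representation of task management in pseudo-multitasking; in practice, while processing started by one trigger is under way, the occurrence of a higher-priority trigger may cause another process to run as an interrupt (for example, "automatic performance processing" may run as an interrupt when trigger ② occurs while "tone generator processing" started by trigger ③ is executing).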
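The priority handling of simultaneous triggers in step S53 can be sketched as a simple ordered dispatch. This sketch assumes the example priority order ① to ⑥ given above and does not model preemption by interrupts; the table and function names are hypothetical.

```python
# Trigger number -> (step, process name), following the example priority order.
HANDLERS = {
    1: ("S54", "communication input processing"),
    2: ("S55", "automatic performance processing"),
    3: ("S56", "tone generator processing"),
    4: ("S57", "SW processing"),
    5: ("S58", "other processing"),
    6: ("S59", "termination processing"),
}


def dispatch(pending):
    """Return the steps to run for the pending triggers, highest priority first."""
    return [HANDLERS[trigger][0] for trigger in sorted(set(pending))]


steps = dispatch([3, 1, 2])     # three triggers pending at the same time
```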
【０１１９】
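With reference to Fig. 27, a concrete example of the "automatic performance processing" (step S55) will be described. First, in step S60, the absolute time information supplied from the DMAC 26 (Fig. 23) is compared with the timing of the next event in the song data. As shown in Fig. 8, in the song data, that is, the automatic performance data, duration data DUR precedes each event data EVENT. For example, when duration data DUR is read out, the absolute time information at that moment and the duration data DUR are added to create absolute time information indicating the arrival time of the next event, which is then stored. This stored next-event arrival time is compared with the current absolute time information in step S60 of Fig. 27.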
【０１２０】
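In step S61, it is determined whether the current absolute time has reached or passed the next-event arrival time. If not, the processing of Fig. 27 ends immediately. If the next-event arrival time has come, the flow goes to step S62, which checks whether the event is a normal performance event (an ordinary MIDI event) or a style-of-rendition performance event (an articulation element sequence event). For a normal performance event, the flow goes to step S63, where ordinary MIDI event processing corresponding to the event is performed and tone generator control data is generated. In the next step S64, the tone synthesis channel (abbreviated "tone generator ch" in the figure) associated with the event is identified and its channel number is registered in the channel number register i. For a note-on event, for example, the channel to which generation of the note is to be assigned is determined and registered in register i. For a note-off event, the channel to which generation of the note has been assigned is identified and registered in register i. In the next step S65, the tone generator control data generated in step S63 and the control timing data are stored in the tone buffer TBUF(i) of the channel number indicated by register i. The control timing is the timing at which the control associated with the event is to be carried out: the sounding start timing for a note-on event, the release start timing for a note-off event, and so on. Because tone waveforms are generated by software processing in this embodiment, the event timing in the MIDI data and the timing of the corresponding actual processing differ slightly; the actual control timing, such as the sounding start timing, is therefore re-designated with this offset taken into account.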
【０１２１】
ステップＳ６２で奏法演奏のイベントであると判定された場合は、ステップＳ６６に行き、それが「フレーズ準備コマンド」と「フレーズスタートコマンド」（図２５参照）のどちらであるのかを調べる。「フレーズ準備コマンド」であれば、ステップＳ６７〜Ｓ７１のルーチンを実行する。このステップＳ６７〜Ｓ７１のルーチンは、図２５でブロック３０で示した「準備処理」に相当する。まず、ステップＳ６７では、当該フレーズ（つまりアーティキュレーションエレメントシーケンスＡＥＳＥＱ）を再生する楽音合成チャンネル（図では「音源ｃｈ」と略記）を決定し、そのチャンネル番号をレジスタｉに登録する。次のステップＳ６８では、当該フレーズ（つまりアーティキュレーションエレメントシーケンスＡＥＳＥＱ）の奏法シーケンス（図では「奏法ＳＥＱ」と略記）を展開する。すなわち、当該アーティキュレーションエレメントシーケンスＡＥＳＥＱを個別テンプレートを指示可能なベクトルデータのレベルまで分解し、解析して、各アーティキュレーションエレメント（図２５のＡＥ＃１〜ＡＥ＃５）の接続箇所（接続１〜接続４）における接続ルールを確定し、そのための接続制御データを形成する。ステップＳ６９では、サブシーケンス（図では「サブＳＥＱ」と略記）があるかを調べ、あれば、ステップＳ６８に戻り、該サブシーケンスを個別テンプレートを指示可能なベクトルデータのレベルまで更に分解する。
【０１２２】
アーティキュレーションエレメントシーケンスＡＥＳＥＱがサブシーケンスを含む一例を図３２に示す。図３２に示すように、アーティキュレーションエレメントシーケンスＡＥＳＥＱは階層化構造を具備していてよい。すなわち、図で、「奏法ＳＥＱ＃２」が、ＭＩＤＩ演奏情報の中に組み込まれたアーティキュレーションエレメントシーケンスＡＥＳＥＱのデータによって指定されたものであるとすると、この指定されたシーケンス「奏法ＳＥＱ＃２」は、「奏法ＳＥＱ＃６」と「エレメントベクトルＥ−ＶＥＣ＃５」とによって特定される。この「奏法ＳＥＱ＃６」がサブシーケンスに相当する。このサブシーケンスを解析することにより、「奏法ＳＥＱ＃６」が、エレメントベクトルＥ−ＶＥＣ＃２とＥ−ＶＥＣ＃３とによって特定される。こうして、ＭＩＤＩ演奏情報の中に組み込まれたアーティキュレーションエレメントシーケンスＡＥＳＥＱのデータによって指定された「奏法ＳＥＱ＃２」が展開され、これが、エレメントベクトルＥ−ＶＥＣ＃２、Ｅ−ＶＥＣ＃３、Ｅ−ＶＥＣ＃５によって特定されるものであることが解析される。前述の通り、このとき、あわせて、各アーティキュレーションエレメントを接続するための接続制御データも必要に応じて形成される。なお、エレメントベクトルＥ−ＶＥＣとは、個別のアーティキュレーションエレメントを具体的に特定するデータのことである。勿論、このような階層化構造を持つ場合に限らず、ＭＩＤＩ演奏情報の中に組み込まれたアーティキュレーションエレメントシーケンスＡＥＳＥＱのデータによって指定された「奏法ＳＥＱ＃２」によって、初めから、各エレメントベクトルＥ−ＶＥＣ＃２、Ｅ−ＶＥＣ＃３、Ｅ−ＶＥＣ＃５が特定されるようになっている場合もある。
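The hierarchical expansion of steps S68 and S69, using the Fig. 32 example in which "SEQ#2" refers to the sub-sequence "SEQ#6" and the element vector E-VEC#5, can be sketched recursively. The table layout and function name are hypothetical, and the sketch assumes the sequence table contains no circular references.

```python
# Sequence name -> list of items, each either another sequence or an element vector.
SEQUENCES = {
    "SEQ#2": ["SEQ#6", "E-VEC#5"],
    "SEQ#6": ["E-VEC#2", "E-VEC#3"],
}


def expand(name, table):
    """Flatten a sequence down to element vectors, expanding sub-sequences in place."""
    vectors = []
    for item in table[name]:
        if item in table:                 # the item is itself a sub-sequence
            vectors.extend(expand(item, table))
        else:                             # the item is an element vector
            vectors.append(item)
    return vectors


flat = expand("SEQ#2", SEQUENCES)
```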
【０１２３】
ステップＳ７０では、展開された各エレメントベクトル（図では「Ｅ−ＶＥＣ」と略記）のデータをその制御タイミングを相対時刻によって示すデータと共に、レジスタｉによって指示されたチャンネル番号のトーンバッファＴＢＵＦ（ｉ）に、格納する。この場合、制御タイミングは、図２５に示したような、各アーティキュレーションエレメントの開始タイミングである。次のステップＳ７１では、トーンバッファＴＢＵＦ（ｉ）を参照して、必要なテンプレートデータをデータベースＤＢからＲＡＭ１２にロードする。
今回のイベントが「フレーズスタートコマンド」（図２５参照）である場合は、ステップＳ７２〜Ｓ７４のルーチンを実行する。このステップＳ７２では、当該フレーズ演奏を再生することが割り当てられているチャンネルを検出し、そのチャンネル番号をレジスタｉに登録する。次のステップＳ７３では、レジスタｉによって指示されたチャンネル番号のトーンバッファＴＢＵＦ（ｉ）に格納されている全ての制御タイミングデータを絶対時刻表現のデータに変換する。すなわち、当該「フレーズスタートコマンド」が発生したときにＤＭＡＣ２６から与えられた絶対時刻情報を初期値として、各制御タイミングデータの相対時刻に該初期値を加算することで、各制御タイミングデータを絶対時刻表現のデータに変換することができる。次のステップＳ７４では、トーンバッファＴＢＵＦ（ｉ）の内容を変換された各制御タイミングの絶対時刻に応じて書き直す。すなわち、該奏法シーケンスを構成する各エレメントベクトルＥ−ＶＥＣの開始時刻と終了時刻、各エレメントベクトル間の接続制御データ等をトーンバッファＴＢＵＦ（ｉ）に書き込む。
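The conversion in step S73 from relative control timings to absolute times amounts to adding the absolute time of the phrase start command to every entry of the tone buffer. In the sketch below the tuple layout of the buffer, the time unit, and the sample values are all hypothetical.

```python
def to_absolute(tone_buffer, start_time):
    """Convert (relative time, element vector) entries to absolute-time entries."""
    return [(start_time + relative, vector) for relative, vector in tone_buffer]


# Hypothetical TBUF(i) contents prepared in step S70, times relative to phrase start.
tbuf_i = [(0, "E-VEC#2"), (120, "E-VEC#3"), (480, "E-VEC#5")]
phrase_start = 10_000                     # absolute time when the start command arrived
tbuf_abs = to_absolute(tbuf_i, phrase_start)
```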
【０１２４】
次に、図２８により、「音源処理」（図２６のステップＳ５６）の具体例につき説明する。前述の通り、この「音源処理」は１フレーム毎に起動される。まず、ステップＳ７５では、所定の波形生成準備処理を行う。例えば、前フレーム区間において再生読み出しが完了した波形バッファＷ−ＢＵＦの内容をクリアし、今回のフレーム区間において該波形バッファＷ−ＢＵＦにデータを書き込むことができるようにする。次のステップＳ７６では、発音処理を行うべきチャンネルが存在しているかどうかを調べる。なければ、処理を続ける必要がないので、ステップＳ８３にジャンプする。あれば、ステップＳ７７に行き、発音処理を行うべきチャンネルのうちの１つのチャンネルを特定し、該チャンネルについて波形サンプルデータ生成処理を行う準備をする。次のステップＳ７８では、該準備したチャンネルに割り当てられている楽音の種類が、「通常演奏」音と「奏法演奏」音のどちらであるかを調べる。「通常演奏」音であれば、ステップＳ７９に行き、当該チャンネルについての１フレーム分の波形サンプルデータを、「通常演奏」音として、生成する処理を行う。「奏法演奏」音であれば、ステップＳ８０に行き、当該チャンネルについての１フレーム分の波形サンプルデータを、「奏法演奏」音として、生成する処理を行う。次に、ステップＳ８１では、発音処理を行うべきチャンネルのうち残りの（未処理の）チャンネルがあるかどうかを調べる。あれば、ステップＳ８２に行き、残りの（未処理の）チャンネルの中から次に処理すべきチャンネルを特定し、該チャンネルについて波形サンプルデータ生成処理を行う準備をする。それから、前記ステップＳ７８に戻り、前述と同様のステップＳ７８〜８０の処理を新たなチャンネルに関して実行する。発音処理を行うべき全てのチャンネルに関してステップＳ７８〜８０の処理を完了すると、残りの（未処理の）チャンネルが無しと成るので、ステップＳ８１はＮＯとなり、ステップＳ８３に行く。この状態では、発音すべき全チャンネルについての１フレーム分の波形サンプルデータの生成が終了し、それらが各サンプル毎に足し込まれて（アキュムレートされ）、波形バッファＷ−ＢＵＦに格納されている。ステップＳ８３では、該波形バッファＷ−ＢＵＦのデータを波形入出力（Ｉ／Ｏ）ドライバの管理下に引き渡す。かくして、次の１フレーム区間において、該波形バッファＷ−ＢＵＦが読み出しモードとなり、ＤＭＡＣ２６によってアクセスされて、所定のサンプリングクロックＦｓに従って波形サンプルデータが規則的サンプリング周期で再生読み出しされることになる。
【０１２５】
図２８のステップＳ７９の処理の詳細例が図２９に示されている。図２９は、「通常演奏」についての「１フレーム分の波形データ生成処理」の一例を示すフロー図であって、ＭＩＤＩ演奏データに基づく通常の楽音合成処理がここで行われる。この処理では、ステップＳ９０〜Ｓ９８のループを１回行う毎に、１サンプルの波形データの生成が行われる。従って、現在処理中のサンプルが１フレームの何番目のサンプルかを示すアドレスポインタ管理がなされるが、その点は特に詳しく説明しない。まず、ステップＳ９０では、制御タイミングが到来したかどうかをチェックする。この制御タイミングは図２７のステップＳ６５で指示し直されたタイミングであり、例えば、発音開始タイミングあるいはリリース開始タイミング（発音終了タイミング）などである。現在処理中のフレームに関して、なんらかの制御タイミングがある場合は、該制御タイミングの時刻に対応するアドレスポインタ値に対応して、このステップＳ９０がＹＥＳとなり、ステップＳ９１に行き、音源制御データに基づく必要な波形発生開始処理を行う。現アドレスポインタ値が制御タイミングに対応していない場合は、ステップＳ９１をジャンプしてステップＳ９２に行く。ステップＳ９２では、ビブラート等に必要な低周波信号（ＬＦＯ）を形成する処理を行う。次のステップＳ９３では、ピッチ制御用のエンベロープ信号（ＥＧ）を形成する処理を行う。
【０１２６】
次のステップＳ９４では、上記音源制御データに基づき、「通常演奏」音のための波形メモリ（図示せず）から所定の音色の波形サンプルデータを、指定された楽音ピッチに対応するレートで読み出し、読み出した波形サンプルデータの値をサンプル間補間する処理を行う。ここでは、通常知られた波形メモリ読み出し技術とサンプル間補間技術とを適宜使用すればよい。ここで指定される楽音ピッチは、ノートオンイベントに係るノート（音高）の正規のピッチを、前ステップＳ９２，９３で形成されたビブラート信号やピッチ制御エンベロープ値などによって可変制御したものである。次のステップＳ９５では、振幅エンベロープ（ＥＧ）を形成する処理を行う。次のステップＳ９６では、ステップＳ９４で生成した１サンプルの波形データの音量レベルを、ステップＳ９５で形成された振幅エンベロープ値によって可変制御し、これを、現アドレスポインタが指示する波形バッファＷ−ＢＵＦのアドレス箇所に既に格納されている波形サンプルデータに足し込む。つまり、同じサンプル点についての他のチャンネルの波形サンプルデータに加算・アキュムレートする。次に、ステップＳ９７では、１フレーム分の処理が完了したかどうかを調べる。まだ完了していなければ、ステップＳ９８に行き、次サンプルを準備する（アドレスポインタを次に進める）。
【０１２７】
上記の構成により、フレームの途中から発音を開始する場合は、該発音開始位置に対応する波形バッファＷ−ＢＵＦの中間的なアドレスから波形サンプルデータが格納されることになる。勿論、１フレーム区間の全体にわたって発音を持続する場合は、波形バッファＷ−ＢＵＦの全アドレスに波形サンプルデータが格納される。
なお、ステップＳ９３，Ｓ９５におけるエンベロープ形成処理は、エンベロープ波形メモリを読み出すことによって行うようにしてもよいし、所定のエンベロープ関数を計算することによって行うようにしてもよい。エンベロープ関数としては、周知の、比較的シンプルな１次の折線関数を演算する方式を用いてよい。なお、後述する「奏法演奏」とは異なり、この「通常演奏」では、発音中の波形の差し替えや、エンベロープの差し替え、あるいは波形の時間軸伸縮制御等、複雑な処理は行わなくてもよい。
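The paragraph above allows the envelopes of steps S93 and S95 to be computed from a simple first-order broken-line (piecewise-linear) function. A minimal sketch of such a function follows. The breakpoint list format, the function name `linear_envelope`, and the example breakpoints are assumptions chosen for illustration, and breakpoint positions are assumed to be strictly increasing.

```python
from typing import List, Tuple


def linear_envelope(breakpoints: List[Tuple[int, float]], n: int) -> float:
    """Level at sample index n of a first-order broken-line envelope.

    breakpoints: (sample_index, level) pairs with strictly increasing sample_index.
    Before the first breakpoint the first level is held, and after the last
    breakpoint the last level is held.
    """
    if n <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (n0, v0), (n1, v1) in zip(breakpoints, breakpoints[1:]):
        if n <= n1:
            # straight line between the two neighbouring breakpoints
            return v0 + (v1 - v0) * (n - n0) / (n1 - n0)
    return breakpoints[-1][1]


# Hypothetical attack / decay / sustain / release shape
adsr = [(0, 0.0), (4, 1.0), (8, 0.5), (16, 0.5), (20, 0.0)]
levels = [linear_envelope(adsr, n) for n in (0, 2, 4, 6, 12, 18, 25)]
print(levels)
```

The same function could serve for a pitch envelope or an amplitude envelope, since only the meaning of the level differs.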
【０１２８】
図２８のステップＳ８０の処理の詳細例が図３０に示されている。図３０は、「奏法演奏」についての「１フレーム分の波形データ生成処理」の一例を示すフロー図であって、アーティキュレーション（奏法）シーケンスデータに基づく楽音合成処理がここで行われる。また、この図３０の処理では、各テンプレートデータに基づくアーティキュレーションエレメントの楽音波形処理や、エレメント波形間の接続処理等が既に述べた要領で実行される。図２９と同様に、図３０の処理でも、ステップＳ１００〜Ｓ１０８のループを１回行う毎に、１サンプルの波形データの生成が行われる。従って、現在処理中のサンプルが１フレームの何番目のサンプルかを示すアドレスポインタ管理がなされるが、その点は特に詳しく説明しない。なお、この図３０の処理では、相前後するアーティキュレーションエレメントを滑らかに接続するために、２系列の各種テンプレートデータ（波形テンプレートを含む）をクロスフェード合成したり、時間軸伸縮制御のために２系列の波形サンプルデータをクロスフェード合成したりすることが行われる。よって、１つのサンプル点について、クロスフェード合成のための２系列分の各種データ処理が行われることになる。
【０１２９】
まず、ステップＳ１００では、制御タイミングが到来したかどうかをチェックする。この制御タイミングは図２７のステップＳ７４で書き込まれたタイミングであり、例えば、各アーティキュレーションエレメントＡＥ＃１〜ＡＥ＃５の開始タイミングや接続処理の開始タイミングなどである。現在処理中のフレームに関して、なんらかの制御タイミングがある場合は、該制御タイミングの時刻に対応するアドレスポインタ値に対応して、このステップＳ１００がＹＥＳとなり、ステップＳ１０１に行き、該制御タイミングに対応するエレメントベクトルＥ−ＶＥＣや接続制御データなどに基づく必要な制御を行う。現アドレスポインタ値が制御タイミングに対応していない場合は、ステップＳ１０１をジャンプしてステップＳ１０２に行く。
【０１３０】
ステップＳ１０２では、エレメントベクトルＥ−ＶＥＣによって指定された特定のエレメントについてのタイムテンプレート（図ではテンプレートをＴＭＰと略記）を生成する処理を行う。タイムテンプレートとは図３に示した時間テンプレート（ＴＳＣテンプレート）のことである。この実施例において、タイムテンプレート（ＴＳＣテンプレート）は、振幅テンプレートやピッチテンプレートと同様に、時間的に変化するエンベロープ状のデータとして与えられるものとする。従って、このステップＳ１０２では、タイムテンプレートのエンベロープを形成する処理を行う。
ステップＳ１０３では、エレメントベクトルＥ−ＶＥＣによって指定された特定のエレメントについてのピッチ（Ｐｉｔｃｈ）テンプレートを生成する処理を行う。ピッチテンプレートも図３に例示したように時間的に変化するエンベロープ状のデータとして与えられる。
ステップＳ１０５では、エレメントベクトルＥ−ＶＥＣによって指定された特定のエレメントについての振幅（Ａｍｐ）テンプレートを生成する処理を行う。振幅テンプレートも図３に例示したように時間的に変化するエンベロープ状のデータとして与えられる。
【０１３１】
各ステップＳ１０２，Ｓ１０３，Ｓ１０５におけるエンベロープ形成法は、上記と同様に、エンベロープ波形メモリを読み出すことによって行うようにしてもよいし、所定のエンベロープ関数を計算することによって行うようにしてもよく、また、そのエンベロープ関数としては、比較的シンプルな１次の折線関数を演算する方式を用いてよい。また、図１８〜図２０を用いて説明したように、所定のエレメント接続箇所に対応して、２系列でテンプレートを形成し（先行するエレメントのテンプレートと後続するエレメントのテンプレート）、両者を接続制御データに従ってクロスフェード合成して接続する処理や、オフセット処理などもこれらのステップＳ１０２，Ｓ１０３，Ｓ１０５で行う。どのような接続ルールに従って接続処理を行うかは、それぞれに対応する接続制御データに応じて異なる。
【０１３２】
ステップＳ１０４では、基本的には、エレメントベクトルＥ−ＶＥＣによって指定された特定のエレメントについての波形（Ｔｉｍｂｒｅ）テンプレートを、指定された楽音ピッチに対応するレートで読み出す処理を行う。ここで指定される楽音ピッチは、前ステップＳ１０３で形成されたピッチテンプレート（ピッチ制御エンベロープ値）などによって可変制御されるものである。なお、タイムテンプレート（ＴＳＣテンプレート）に応じて、楽音ピッチとは独立に、波形サンプルデータの存在時間を時間軸に沿って伸張または圧縮する制御つまりＴＳＣ制御も、このステップＳ１０４で行う。また、時間軸伸縮制御に伴って、波形の連続性が損なわれることのないように、２系列で波形サンプルデータ（同じ波形テンプレート内の異なる時点に対応する２つの波形サンプルデータ）を読み出し、これをクロスフェード合成する処理も、このステップＳ１０４で行う。また、「通常演奏」の場合と同様に、波形サンプル間の補間演算処理も、このステップＳ１０４で行う。更に、図１７を用いて説明したように、所定のエレメント接続箇所に対応して、２系列で波形テンプレートを読み出し（先行するエレメントの波形テンプレートと後続するエレメントの波形テンプレート）、両者をクロスフェード合成して接続する処理も、このステップＳ１０４で行う。更に、図１３〜図１６を用いて説明したような、波形テンプレートをループ読み出し（繰り返し読み出し）する処理と、その際に、２系列のループ読み出し波形をクロスフェード合成する処理も、このステップＳ１０４で行う。
なお、使用する波形（Ｔｉｍｂｒｅ）テンプレートが、オリジナル波形における時間的ピッチ変動成分をそのまま保っているものである場合、ピッチテンプレートの値は、オリジナルのピッチ変動に対する変化量（差分値又は比）で与えるようにするとよい。つまり、オリジナルの時間的ピッチ変動そのままにするときは、ピッチテンプレートの値を一定値（例えば「１」）に維持する。
【０１３３】
次のステップＳ１０５では、振幅テンプレートを形成する処理を行う。次のステップＳ１０６では、ステップＳ１０４で生成した１サンプルの波形データの音量レベルを、ステップＳ１０５で形成された振幅エンベロープ値によって可変制御し、これを、現アドレスポインタが指示する波形バッファＷ−ＢＵＦのアドレス箇所に既に格納されている波形サンプルデータに足し込む。つまり、同じサンプル点についての他のチャンネルの波形サンプルデータに加算・アキュムレートする。次に、ステップＳ１０７では、１フレーム分の処理が完了したかどうかを調べる。まだ完了していなければ、ステップＳ１０８に行き、次サンプルを準備する（アドレスポインタを次に進める）。
なお、上述と同様に、使用する波形（Ｔｉｍｂｒｅ）テンプレートが、オリジナル波形における時間的振幅変動成分をそのまま保っているものである場合、振幅（Ａｍｐ）テンプレートの値は、オリジナルの振幅変動に対する変化量（差分値又は比）で与えるようにするとよい。つまり、オリジナルの時間的振幅変動そのままにするときは、振幅テンプレートの値を一定値（例えば「１」）に維持する。
【０１３４】
次に、時間軸伸縮制御（ＴＳＣ制御）の一例について説明する。
複数周期波形からなる高品質な、つまり特定のアーティキュレーション特性を具備する、そして、一定のデータ量（サンプル数若しくはアドレス数）からなる波形データを、その楽音再生ピッチとは独立に、また、該波形の全体的特徴を損なうことなく、時間軸上におけるその存在時間長を任意に可変制御することは、本出願人が別出願（例えば特願平９−１３０３９４号）で提案した時間軸伸縮制御（ＴＳＣ制御）を用いることによって実現できる。このＴＳＣ制御の要点を述べれば、一定の波形データ量からなる複数周期波形を、一定の再生サンプリング周波数と所定の再生ピッチを維持しつつ、その時間軸上の波形データ存在時間長を伸縮するために、圧縮する場合は、波形データの適宜の部分を飛び越して読み出しを行ない、伸張する場合は、波形データの適宜の部分を繰り返し読み出しするようにし、そして、飛び越し若しくは部分的繰り返し読み出しによる波形データの不連続性を除去するためにクロスフェード合成を行なうようにしたものである。
【０１３５】
図３１は、この時間軸伸縮処理（ＴＳＣ制御）の概略を概念的に示す図である。（ａ）は、時間的に変化するタイムテンプレートの一例を示している。タイムテンプレートは、時間軸伸縮比を示すデータ（これをＣＲａｔｅという）からなっており、縦軸が該データＣＲａｔｅ、横軸が時間ｔである。時間軸伸縮比データＣＲａｔｅは、「１」を基準とする比を示しており、「１」のとき時間軸伸縮をしないことを示し、「１」よりも大きいとき時間軸の圧縮を示し、「１」よりも小さいとき時間軸の伸張を示す。図３１の（ｂ）〜（ｄ）は、仮想読出アドレスＶＡＤと実読出アドレスＲＡＤを用いて、時間軸伸縮比データＣＲａｔｅに応じた時間軸伸縮制御を行う例を示している。実線が実読出アドレスＲＡＤ、破線が仮想読出アドレスＶＡＤを示す。（ｂ）は、（ａ）のタイムテンプレートにおけるＰ１点の時間軸伸縮比データＣＲａｔｅ（＞１）に応じた時間軸圧縮制御例を示しており、（ｃ）は、（ａ）のタイムテンプレートにおけるＰ２点の時間軸伸縮比データＣＲａｔｅ（＝１）に応じた時間軸伸縮しない例を示し、（ｄ）は、（ａ）のタイムテンプレートにおけるＰ３点の時間軸伸縮比データＣＲａｔｅ（＜１）に応じた時間軸伸張制御例を示している。（ｃ）においては実線は、ピッチ情報に従う本来の波形読出アドレスの進行状態を示しており、実読出アドレスＲＡＤと仮想読出アドレスＶＡＤが一致している。
【０１３６】
実読出アドレスＲＡＤは、波形テンプレートから実際に波形サンプルデータを読み出すために使用するアドレスであり、所望のピッチ情報に従う一定の変化レートで変化する。例えば、所望のピッチに対応する周波数ナンバを規則的に累算することにより、該ピッチに対応する一定の傾きを持つ実読出アドレスＲＡＤを得ることができる。仮想読出アドレスＶＡＤは、波形データの時間軸上の長さの所望の伸張又は圧縮制御した状態を想定し、所望の時間軸伸張又は圧縮を達成するためには、現時点でどのアドレス位置から波形サンプルデータを読み出すべきかを指示するアドレスである。そのために、所望のピッチ情報と時間軸伸縮比データＣＲａｔｅとを用いて、該ピッチ情報に従う傾きを伸縮比データＣＲａｔｅによって修正した傾きで変化するアドレスデータを、仮想読出アドレスＶＡＤとして発生する。実読出アドレスＲＡＤと仮想読出アドレスＶＡＤとを比較し、実読出アドレスＲＡＤの仮想読出アドレスＶＡＤからのかい離幅が所定幅を越えたとき、実読出アドレスＲＡＤの値を切替えることを指示し、この切替指示に従って、実読出アドレスＲＡＤの仮想読出アドレスＶＡＤに対するかい離を解消するよう、適宜アドレス数だけ実読出アドレスＲＡＤの数値をシフト制御する。
【０１３７】
図３３は、図３１（ｂ）と同様の状態を拡大して示す図である。一点鎖線は、ピッチ情報に従う本来のアドレス進行を例示するもので、図３１（ｃ）の実線に対応するものである。太い破線は、仮想読出アドレスＶＡＤのアドレス進行を例示する。伸縮比データＣＲａｔｅが１であれば、仮想読出アドレスＶＡＤのアドレス進行は、一点鎖線の本来のアドレス進行に一致し、時間軸の変化はない。時間軸を圧縮する場合、伸縮比データＣＲａｔｅは１以上の適宜の値をとり、図示のように、仮想読出アドレスＶＡＤのアドレス進行の傾きが相対的に大きくなる。太い実線は、実読出アドレスＲＡＤのアドレス進行を例示する。この実読出アドレスＲＡＤのアドレス進行の傾きは、一点鎖線で示したピッチ情報に従う本来のアドレス進行の傾きに一致している。この場合、仮想読出アドレスＶＡＤのアドレス進行の傾きが相対的に大きいが故に、時間経過に従って次第に実読出アドレスＲＡＤのアドレス進行が仮想読出アドレスＶＡＤのアドレス進行よりも遅れてくる。そして、そのかい離幅が所定以上になったとき、切替指示（図中、矢印で示す）が出され、図示のように、該かい離を解消する方向に、実読出アドレスＲＡＤを適量シフトする。これによって、実読出アドレスＲＡＤのアドレス進行は、ピッチ情報に従う傾きを維持しつつ、仮想読出アドレスＶＡＤのアドレス進行に沿って変化し、時間軸方向に圧縮された特性を示す。従って、このような実読出アドレスＲＡＤに従って波形テンプレートの波形サンプルデータを読み出すことにより、再生する楽音のピッチは変更せずに、時間軸方向に波形を圧縮した波形信号を得ることができる。
【０１３８】
図３４は、図３１（ｄ）と同様の状態を拡大して示す図である。この場合、伸縮比データＣＲａｔｅは１未満であり、太い破線にて示す仮想読出アドレスＶＡＤのアドレス進行の傾きは相対的に小さい。従って、時間経過に伴い次第に実読出アドレスＲＡＤのアドレス進行が仮想読出アドレスＶＡＤのアドレス進行よりも進んできて、そのかい離幅が所定以上になったとき、切替指示（図中、矢印で示す）が出され、図示のように、該かい離を解消する方向に、実読出アドレスＲＡＤが適量シフトされる。これによって、実読出アドレスＲＡＤのアドレス進行は、ピッチ情報に従う傾きを維持しつつ、仮想読出アドレスＶＡＤのアドレス進行に沿って変化し、時間軸方向に伸張された特性を示す。従って、このような実読出アドレスＲＡＤに従って波形テンプレートの波形サンプルデータを読み出すことにより、再生する楽音のピッチは変更せずに、時間軸方向に波形を伸張した波形信号を得ることができる。
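The address behaviour described for FIGS. 33 and 34 can be sketched as a small simulation: the real read address RAD always advances at the pitch-determined rate, the virtual read address VAD advances at that rate multiplied by CRate, and RAD is shifted toward VAD whenever the two drift apart by more than a set width. The function name `tsc_addresses`, the fixed shift amount, and the parameter names are assumptions for this example. The specification only says that RAD is shifted by an appropriate amount, which in practice would be chosen so that the waveform joins smoothly.

```python
from typing import List


def tsc_addresses(num_samples: int, pitch_step: float, crate: float,
                  max_gap: float, shift: float) -> List[float]:
    """Sequence of real read addresses (RAD) under time-axis stretch/compress control.

    pitch_step: per-sample address increment fixed by the desired pitch
    crate:      time-axis ratio (greater than 1 compresses, less than 1 expands)
    max_gap:    allowed deviation between RAD and VAD before a switch is triggered
    shift:      address amount by which RAD is moved at a switch (assumed constant)
    """
    rad = vad = 0.0
    out = []
    for _ in range(num_samples):
        out.append(rad)
        rad += pitch_step            # slope always follows the pitch information
        vad += pitch_step * crate    # slope modified by the stretch/compress ratio
        gap = vad - rad
        if gap > max_gap:            # RAD lags behind VAD: jump forward (compression)
            rad += shift
        elif gap < -max_gap:         # RAD runs ahead of VAD: jump back (expansion)
            rad -= shift
    return out


compressed = tsc_addresses(10, 1.0, 2.0, 4.0, 4.0)
expanded = tsc_addresses(10, 1.0, 0.5, 2.0, 2.0)
unchanged = tsc_addresses(5, 1.0, 1.0, 4.0, 4.0)
print(compressed, expanded, unchanged)
```

Between switches the address increment stays equal to `pitch_step`, so the reproduced pitch is unchanged. Only the skipped or repeated portions alter how long the waveform lasts.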
【０１３９】
なお、前記かい離を解消する方向への実読出アドレスＲＡＤのシフトは、このシフトによって、シフト直前に読み出していた波形データと、シフト直後に読み出す波形データとが滑らかにつながるようにすることが好ましい。また、図中、波線で示すように、切替時の適宜期間で、クロスフェード合成を行うようにするとよい。波線は、クロスフェード副系列用実読出アドレスＲＡＤ２のアドレス進行を示す。このクロスフェード副系列用実読出アドレスＲＡＤ２は、図示の通り、上記切替指示が出されたとき、シフト前の実読出アドレスＲＡＤのアドレス進行の延長上に、実読出アドレスＲＡＤと同じレート（つまり傾き）で生成する。適宜のクロスフェード期間において、副系列用実読出アドレスＲＡＤ２に対応して読み出される波形から主系列用実読出アドレスＲＡＤに対応して読み出される波形まで滑らかに波形が移行するようにクロスフェード合成がなされる。この例の場合、少なくとも所要のクロスフェード期間の間でのみ副系列用実読出アドレスＲＡＤ２を生成するようにすればよい。
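The cross-fade at a switching point can be sketched as follows: for a short period, the sub-series address RAD2 continues the pre-shift progression while the main-series address RAD runs from the shifted position, and the two read-outs are mixed with complementary weights. The linear weight ramp, the truncation of addresses to integers instead of inter-sample interpolation, and the name `crossfade_read` are simplifying assumptions for this example.

```python
from typing import List


def crossfade_read(wave: List[float], rad_old: float, rad_new: float,
                   step: float, fade_len: int) -> List[float]:
    """Read fade_len samples while fading from the pre-shift progression (RAD2)
    to the post-shift progression (RAD). Both advance at the same rate."""
    out = []
    for k in range(fade_len):
        w = (k + 1) / fade_len                               # main-series weight, ramps up to 1
        sub = wave[int(rad_old + k * step) % len(wave)]      # sub series, address RAD2
        main = wave[int(rad_new + k * step) % len(wave)]     # main series, address RAD
        out.append((1.0 - w) * sub + w * main)
    return out


wave = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
faded = crossfade_read(wave, 0.0, 4.0, 1.0, 4)
print(faded)
```

At the end of the fade the output equals the main-series sample, after which RAD2 no longer needs to be generated.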
なお、上記のように一部分でクロスフェード合成を行うＴＳＣ制御例に限らず、時間軸伸縮比ＣＲａｔｅの値に応じた態様のクロスフェード合成処理を常に行うようにしたＴＳＣ制御を採用してもよい。
【０１４０】
図１３〜図１５に示したようなパーシャルベクトルＰＶＱの波形テンプレート（つまりループ波形）を繰り返し読み出すことで波形サンプルデータを生成する場合においては、基本的には、ループ回数を可変することによって、比較的簡単に、楽音再生ピッチとは独立に、ループ読み出し波形全体の時間長を可変制御することができる。つまり、クロスフェード区間長を指定するデータによって特定のクロスフェードカーブが特定されると、それに伴ってクロスフェード区間長（時間長若しくはループ回数）が決まってくる。ここで、このクロスフェードカーブの傾きをタイムテンプレートが示す時間軸伸縮比によって可変制御することにより、クロスフェードの速さが可変制御され、結局、クロスフェード区間の時間長が可変制御される。その間、楽音再生ピッチには影響を与えないので、結局、ループ回数が可変制御されることで当該クロスフェード区間の時間長が可変制御される。
【０１４１】
ところで、時間軸伸縮制御によって、再生波形データの時間軸での存在時間が伸縮制御される場合、この伸縮制御にあわせて、ピッチテンプレート及び振幅テンプレートの時間軸も伸縮制御してやることが望ましい。従って、図３０のステップＳ１０３，Ｓ１０５においては、ステップＳ１０２で作成されたタイムテンプレートに応じて、該ステップで作成するピッチテンプレート及び振幅テンプレートの時間軸を伸縮制御するようにするものとする。
【０１４２】
なお、楽音合成機能のすべてをソフトウェア音源によって構成せずに、ソフトウェア音源とハードウェア音源のハイブリッドタイプとしてもよい。また、ハードウェア音源装置のみでこの発明に係る楽音合成処理を行うようにしてもよい。あるいは、ＤＳＰ（ディジタル・シグナル・プロセッサ）を用いてこの発明に係る楽音合成処理を行うようにしてもよい。また、ソフトウェア音源またはハードウェア音源またはそのハイブリッドタイプのいずれの音源方式を用いる場合でも、その波形形成方式は、単純なＰＣＭ波形メモリ読み出し方式に限らず、前述の通り、各種のデータ圧縮技術を用いた方式や、各種の波形合成アルゴリズムに従うパラメータ演算による方式など、適宜のものを使用することができる。
【０１４３】
【発明の効果】
以上の通り、この発明によれば、音楽的なアーティキュレーションを伴う複数の演奏フレーズのそれぞれについて、該演奏フレーズを構成する１又は複数の音を複数の部分的時間区間に分割し、各部分的時間区間毎のアーティキュレーションエレメントを順次指示するアーティキュレーションエレメントシーケンスを記憶する第１のデータベース部と、様々なアーティキュレーションエレメントに対応する部分的音波形を表現するテンプレートデータを記憶する第２のデータベース部とを具備する楽音データベースのデータ編集を行なうにあたって、所望の奏法を指定し、指定された奏法に該当するアーティキュレーションエレメントシーケンスを前記第１のデータベース部から検索するようにしたので、楽音データの編集作業にあたって、ユーザーは、所望の奏法を指定することで、指定された奏法に該当するアーティキュレーションエレメントシーケンスの有無を検索することができ、該所望の奏法が前記楽音データベースで利用可能か否かをサーチすることができる、という優れた効果を奏する。従って、例えば、所望の奏法が楽音データベースで利用可能であれば、それを読み出して、その内容を確認し、望みのものと異なっていればその内容を適宜修正・変更することで望みの楽音データを作成するようにすることが行なえる。一方、所望の奏法が楽音データベースで利用可能でない場合は、該奏法に対応する楽音データを新規に作成して楽音データベースに登録する、といったような編集作業を行なうこともできる。このように、所望の奏法を指定するといった演奏感覚に合った形態で、楽音データの編集作業を進めることができ、ユーザーにとって使い易いものとなる。
【０１４４】
また、前記指定された奏法に一致するアーティキュレーションエレメントシーケンスが検出されなかったならば、該指定された奏法に似ているアーティキュレーションエレメントシーケンスを前記第１のデータベース部から選択し、選択されたアーティキュレーションエレメントシーケンスを構成するアーティキュレーションエレメントのいずれかを変更又は差し替え又は削除するか、若しくは新規のアーティキュレーションエレメントを該シーケンスに追加する、という編集を行うようにしたことにより、指定された奏法に一致するアーティキュレーションエレメントシーケンスがデータベースに記憶されていない場合でも、該指定された奏法に似ているアーティキュレーションエレメントシーケンスを使用して、その内容を自由に編集することにより、所望の奏法に対応する楽音データを容易に作成することができるようになる、という優れた効果を奏する。
【０１４５】
更に、編集されたアーティキュレーションエレメントとそれに隣接するアーティキュレーションエレメントとの間の各テンプレートデータの接続の仕方を設定するようにしたことにより、編集操作の結果、編集対象のアーティキュレーションエレメントに対応する部分的音波形を表現するテンプレートデータの内容が変更されることにより、それに隣接するアーティキュレーションエレメントとの間でのテンプレートデータのつながりが損なわれるおそれが生じるところ、隣接するテンプレートデータの接続の仕方を設定する（つまり定義し直す）ことにより、隣接するテンプレートデータを滑らかにつなげることができるようになる、という優れた効果を奏する。
【０１４６】
このように、この発明によれば、アーティキュレーションエレメントの内容を自由に編集できるようにすることにより、従来にない、“アーティキュレーション”を含む高品質な楽音波形の形成を、ユーザーによるインタラクティブな制御を可能にしつつ、実現することができる、という優れた効果を奏する。また、電子楽器やマルチメディア機器等においてユーザーの自由な音作りと編集操作を許容するインタラクティブな高品質の楽音波形生成技術を提供することができる、という優れた効果を奏する。
【図面の簡単な説明】
【図１】この発明に係る楽音データ作成方法に従う楽音データベース作成手順の一例を示すフロー図。
【図２】一連の楽曲フレーズの楽譜例と、それに対応するアーティキュレーション単位での演奏区間の分割例と、アーティキュレーションエレメントを構成する楽音要素の分析例とを模式的に示す図。
【図３】１つのアーティキュレーションエレメントに対応する波形から分析された複数の楽音要素の具体例を示す図。
【図４】データベースの構成例を示す図。
【図５】図４のアーティキュレーションデータベースＡＤＢにおけるアーティキュレーションシーケンスＡＥＳＥＱとアーティキュレーションエレメントベクトルＡＥＶＱの具体例を示す図。
【図６】属性情報を含むアーティキュレーションエレメントベクトルＡＥＶＱの具体例を示す図。
【図７】この発明に係る楽音データ作成方法に従う楽音合成手順の一例を示すフロー図。
【図８】この発明に係る楽音データ作成方法に従う楽音合成手法を採用した自動演奏シーケンスデータの構成例を示す図。
【図９】この発明に従ういくつかの奏法シーケンスの具体例を示す図。
【図１０】１つの奏法シーケンス内における各アーティキュレーションエレメント相互のクロスフェード合成による接続処理の一例を示す図。
【図１１】奏法シーケンス（アーティキュレーションエレメントシーケンス）の編集例を概観する図。
【図１２】奏法シーケンス（アーティキュレーションエレメントシーケンス）の編集手順の一例を示すフロー図。
【図１３】パーシャルベクトルの考え方を示す図。
【図１４】パーシャルベクトルを含むアーティキュレーションエレメントの楽音合成処理手順を部分的に示すフロー図。
【図１５】ビブラート合成処理の一例を示す図。
【図１６】ビブラート合成処理の別の例を示す図。
【図１７】波形テンプレートの接続処理例のいくつかのルールを示す図。
【図１８】波形テンプレート以外のテンプレートデータ（エンベロープ波形状のテンプレートデータ）の接続処理例のいくつかのルールを示す図。
【図１９】図１８（ｂ）に示す接続ルールのいくつかの具体化手段を示す図。
【図２０】図１８（ｃ）に示す接続ルールのいくつかの具体化手段を示す図。
【図２１】各種テンプレートデータの接続処理とテンプレートデータに基づく楽音合成処理の概略を示すブロック図。
【図２２】この発明の実施例に係る楽音合成装置のハードウェア構成例を示すブロック図。
【図２３】図２２における波形インタフェースの詳細例とＲＡＭ内の波形バッファの構成例を示すブロック図。
【図２４】ＭＩＤＩ演奏データに基づいて実行される楽音生成処理の概略を示すタイムチャート。
【図２５】奏法シーケンス（アーティキュレーションエレメントシーケンスＡＥＳＥＱ）のデータに基づいて実行される奏法演奏処理（アーティキュレーションエレメント楽音合成処理）の概略を示すタイムチャート。
【図２６】図２２のＣＰＵが実行する楽音合成処理のメインルーチンを示すフローチャート。
【図２７】図２６における「自動演奏処理」の一例を示すフローチャート。
【図２８】図２６における「音源処理」の一例を示すフローチャート。
【図２９】図２８における「通常演奏」についての「１フレーム分の波形データ生成処理」の一例を示すフローチャート。
【図３０】図２８における「奏法演奏」についての「１フレーム分の波形データ生成処理」の一例を示すフローチャート。
【図３１】時間軸伸縮処理（ＴＳＣ制御）の概略を概念的に示す図。
【図３２】奏法シーケンスの階層化構造を説明する図。
【図３３】時間軸伸縮制御によって時間軸圧縮する場合の波形読出アドレスの時間的進行状態の一例を示す図。
【図３４】時間軸伸縮制御によって時間軸伸張する場合の波形読出アドレスの時間的進行状態の一例を示す図。
【符号の説明】
ＡＤＢアーティキュレーションデータベース
ＴＤＢテンプレートデータベース
１０ＣＰＵ
１１ＲＯＭ（リードオンリーメモリ）
１２ＲＡＭ（ランダムアクセスメモリ）
１３ハードディスク装置
１４，１５リムーバブルディスク装置
１６表示器
１７キーボード及びマウス等の入力操作装置
１８波形インタフェース
１９タイマ
２０ネットワークインタフェース
２１ＭＩＤＩインタフェース
２２データ及びアドレスバス
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a technique capable of synthesizing high-quality musical sound waveforms having articulation, and more particularly to a method and an apparatus for editing musical sound data for that purpose. It can be widely applied as a musical sound data editing apparatus and/or method in musical sound or sound generating equipment for various uses, including not only electronic musical instruments but also game machines, personal computers, and other multimedia equipment.
In this specification, “musical sound” is not limited to the sound of music; it is used in a broad sense that covers sound in general, including human voices, various sound effects, and sounds in the natural world.
[0002]
[Prior art]
In a tone generator of the waveform memory readout type (PCM: pulse code modulation) used in electronic musical instruments and the like, waveform data for one or more cycles corresponding to a predetermined tone color is stored in a memory, and this waveform data is read out repeatedly at a readout speed corresponding to the desired pitch of the musical tone to be generated, thereby generating a sustained musical tone waveform. Alternatively, waveform data covering the entire tone from the start to the end of sounding is stored in a memory and read out at a readout speed corresponding to the desired pitch, thereby generating one sound.
In a PCM tone generator of this type, when the waveform stored in the memory is not simply read out and sounded as it is but is modified in some way to make the generated tone expressive, control has conventionally been applied to three categories of tone elements: pitch, volume, and tone color. With regard to pitch, a pitch modulation effect such as vibrato or attack pitch is imparted by modulating the readout speed according to an arbitrary pitch envelope. With regard to volume, a volume amplitude envelope following a required envelope waveform is applied to the read waveform data, or a tremolo effect or the like is imparted by periodically modulating the volume amplitude of the read waveform data. With regard to tone color, appropriate tone color control is performed by filtering the read waveform data.
[0003]
A multi-track sequencer is also known in which a continuous performance sound (phrase) actually played live is sampled as a whole and pasted (recorded) onto one recording track, and the phrase waveforms thus pasted onto a plurality of tracks are reproduced and sounded in combination with automatic performance sounds based on separately recorded sequence performance data.
A method of recording all musical tone waveform data of one tune actually played live as PCM data and simply reproducing the data is well known as a music recording method for a CD (compact disc).
[0004]
[Problems to be solved by the invention]
When a skilled player of a natural musical instrument such as a piano, violin, or saxophone plays a series of musical phrases on that instrument, the performance sound is not uniform even though it is played on the same instrument. For each sound, in the connection between sounds, or in the rising, sustaining, or falling part of a sound, it is performed with subtly different “articulation” depending on the mood of the piece or the sensibility of the player. The existence of such “articulation” gives the listener the impression of a truly good sound.
A method that records an entire music performance by a skilled player as PCM waveform data, as in the music recording system of a CD, allows realistic, high-quality reproduction of the live performance, and can therefore realistically reproduce the “articulation” exactly as the player performed it. However, because it can only be used as a mere playback device for a fixed piece of music (the piece as recorded), it cannot be used as an interactive musical tone creation technique that allows the user to freely create and edit sounds in an electronic musical instrument, multimedia device, or the like.
[0005]
On the other hand, the PCM tone generator technology known in electronic musical instruments and the like does, as described above, allow the user to create sounds and can give the generated musical tone a certain degree of expressiveness. However, it is insufficient for realizing natural “articulation” in terms of both sound quality and expressiveness. For example, in this type of PCM tone generator technology the waveform data stored in the memory is generally only a sample of a single tone played on a natural musical instrument, so the sound quality of the generated tone is limited. In particular, the articulation or playing style of the connection between sounds during a performance cannot be expressed with high quality. For example, for a slur, in which the preceding sound changes smoothly into the following sound, a conventional electronic musical instrument merely relies on techniques such as smoothly changing the rate at which waveform data is read from the memory or controlling the volume envelope applied to the generated sound, and cannot achieve articulation or playing style of a quality comparable to a live performance on a natural musical instrument. In addition, even for a sound of the same pitch on the same instrument, the articulation of, for example, its rising part may differ between musical phrases or, within the same phrase, between performance occasions; such subtle differences in articulation cannot be expressed by the PCM tone generator technology known in electronic musical instruments and the like.
[0006]
Also, in conventional electronic musical instruments and the like, control of the generated musical tone according to performance expression is relatively monotonous and cannot be called sufficient. For example, it is known to control the musical tone according to the performance touch on a key or the like, but even then it is only possible to control the volume change characteristic or the tone color filter characteristic according to the touch; it has not been possible, for example, to freely control the tone characteristics of each partial section of the whole sounding period from the rise to the fall of a tone. Further, with respect to tone color control of the generated sound, once a tone color is selected prior to the performance, waveform data corresponding to the selected tone color is read from the memory, and thereafter, during sounding, this waveform data is merely variably controlled by a filter or the like according to the performance expression, so the change in tone color according to the performance expression is not sufficient. Moreover, control envelope waveforms for pitch, volume, and the like are set and controlled by treating the whole envelope from rise to fall as a single unit, and operations such as partially replacing the envelope cannot be performed freely.
[0007]
On the other hand, in a method such as the above-mentioned multi-track sequencer, the phrase waveform data of a live performance is merely pasted, so partial editing of the phrase waveform (partial replacement, characteristic control, and so on) cannot be performed at all. This method therefore also cannot be used as an interactive musical tone creation technique that allows the user to freely create sounds in an electronic musical instrument, multimedia device, or the like.
In addition, not only musical performance sounds but also sounds in general that exist in the natural world are rich in delicate “articulation” that varies over time, yet conventional techniques have not been able to reproduce the “articulation” of such naturally occurring sounds in a controllable manner.
[0008]
The present invention has been made in view of the above points. Its object is to realistically reproduce “articulation” and facilitate its control when a musical sound (as described above, not only a sound of music but also any other sound in general) is generated using an electronic musical instrument or other electronic device, and to provide an interactive, high-quality musical tone creation technique that allows the user to freely create and edit sounds in an electronic musical instrument, multimedia device, or the like. A further object is to provide a new musical tone data editing method and apparatus based on such a technique.
In this specification, the term “articulation” is used in its generally known meaning and includes, for example, “syllable”, “connection between sounds”, “a group of sounds (phrase)”, “partial features of a sound”, “manner of sounding”, “playing style”, “performance expression”, and so on.
[0009]
[Means for Solving the Problems]
The data search method according to claim 1 is a search method for a musical tone database comprising: a first database unit that, for each of a plurality of performance phrases with musical articulation, stores an articulation element sequence in which the one or more sounds constituting the performance phrase are divided into a plurality of partial time sections and the articulation element of each partial time section is indicated in order; and a second database unit that stores template data representing partial sound waveforms corresponding to various articulation elements. In the first database unit, attribute information indicating a characteristic of the articulation element sequence is stored in association with that articulation element sequence. The method comprises a first step of specifying attribute information according to a desired playing style, and a second step of searching the first database unit, in accordance with the specified attribute information, for an articulation element sequence corresponding to that playing style.
[0010]
Thus, when editing musical tone data, the user can, by specifying a desired playing style, search for the presence or absence of an articulation element sequence corresponding to the specified playing style, and can thereby find out whether the desired playing style is available in the musical tone database. For example, if the desired playing style is available in the musical tone database, it can be read out and its contents checked, and if it differs from what is wanted, the contents can be modified or changed as appropriate to create the desired musical tone data. If, on the other hand, the desired playing style is not available in the musical tone database, editing work such as newly creating musical tone data corresponding to that playing style and registering it in the musical tone database can be performed. In this way, the editing of musical tone data can proceed in a form that matches the feel of performing, namely by specifying a desired playing style, which makes it easy for the user to use.
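The two-step search described above (specify attribute information for the desired playing style, then look for matching sequences in the first database unit) might be sketched as follows. The record layout, the attribute keys and values, and the names `ElementSequence` and `search_sequences` are hypothetical. The specification does not prescribe a concrete data format here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ElementSequence:
    """One entry of the first database unit (hypothetical layout)."""
    name: str
    elements: List[str]                                         # ordered articulation element ids
    attributes: Dict[str, str] = field(default_factory=dict)    # attribute information


def search_sequences(db: List[ElementSequence],
                     wanted: Dict[str, str]) -> List[ElementSequence]:
    """Second step: return every sequence whose attributes match all wanted items."""
    return [seq for seq in db
            if all(seq.attributes.get(key) == value for key, value in wanted.items())]


db = [
    ElementSequence("seq_a", ["attack_fast", "body_vib", "release_smooth"],
                    {"attack": "fast", "vibrato": "soon-fast", "release": "smooth"}),
    ElementSequence("seq_b", ["attack_mid", "body_plain", "release_plain"],
                    {"attack": "mid", "vibrato": "none"}),
]

# First step: specify attribute information for the desired playing style
hits = search_sequences(db, {"attack": "fast", "vibrato": "soon-fast"})
misses = search_sequences(db, {"attack": "slow"})
print([s.name for s in hits], misses)
```

An empty result corresponds to the case in which the desired playing style is not yet available, where the user would go on to create and register new data.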
[0012]
Further, the method may additionally comprise a fifth step of setting, in the articulation element sequence edited in the fourth step, the manner of connecting each template data between the edited articulation element and the articulation elements adjacent to it. When an editing operation changes the content of the template data representing the partial sound waveform of the edited articulation element, the continuity of the template data with the adjacent articulation elements may be lost; by setting (that is, redefining) the manner of connecting adjacent template data in the fifth step, the adjacent template data can be connected smoothly. There are a plurality of connection modes for smoothly connecting template data, and an appropriate connection mode can be selected and set from among them.
[0013]
The musical tone data creation and musical tone synthesis technique according to the present invention analyzes the articulation of a sound, edits and synthesizes musical tones in units of articulation elements, and thereby synthesizes musical tones by modeling the articulation of the sound. This technique is therefore referred to as the SAEM (Sound Articulation Element Modeling) technique.
The present invention can be configured and implemented not only as a method invention, but also as an apparatus invention. Further, the present invention can be implemented in the form of a computer program, or can be implemented in the form of a recording medium storing such a computer program. Further, the present invention can be embodied in the form of a recording medium storing waveforms or tone data having a novel data structure.
[0014]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
[Example of creating a music database]
As described above, when a skilled player of a natural musical instrument such as a piano, violin, or saxophone plays a series of musical phrases on that instrument, the performance sound is not uniform even though it is played on the same instrument. For each sound, in the connection between sounds, or in the rising, sustaining, or falling part of a sound, it is performed with subtly different “articulation” depending on the mood of the piece or the sensibility of the player. The existence of such “articulation” gives the listener the impression of a truly good sound.
In the case of musical instrument performance, "articulation" generally appears as a reflection of "playing style" or "performance expression" by a player. Therefore, in the following description, it is to be noted in advance that the terms "performance technique" or "performance expression" and "articulation" may be used to indicate substantially the same meaning. For example, the “playing technique” includes various types such as a staccato, tenuto, slur, vibrato, tremolo, crescendo, decrescendo, and the like. When a performer plays a series of musical phrases with musical instruments, various playing styles are used in each performance phase according to the instructions in the musical score or according to one's own sensibilities, thereby producing "articulations" corresponding to each playing style.
[0015]
FIG. 1 shows an example of a procedure for creating a tone database according to the present invention.
The first step S1 is a step of sampling a series of performance sounds composed of one or more musical tones. Here, for example, a skilled performer of a certain natural musical instrument plays a predetermined series of musical phrases on the instrument. This series of performance sounds is picked up by a microphone and sampled at a predetermined sampling frequency to obtain PCM-encoded waveform data for the entire performance phrase. This waveform data is high-quality data that is also musically excellent. For the purpose of explanation, FIG. 2A shows an example of a musical score of a series of musical phrases played for sampling in step S1. The “playing style symbols” added above the score in FIG. 2A illustrate the playing style with which the musical phrase shown in the score is played. A musical score with such “playing style symbols” is not indispensable for the sampling in step S1. The player may play the musical phrase from an ordinary musical score, after which the sampled waveform data is analyzed to determine the playing style in each performance phase over time, and a musical score with such playing style symbols is created. As will be described later, such a score with playing style symbols is less an aid to the sampling in step S1 than a considerable aid to a general user who extracts desired data from the database created from the data sampled here and joins the data together to create a desired performance sound. Nevertheless, to illustrate how the phrase shown in the score of FIG. 2A is played, the meaning of the playing style symbols illustrated in FIG. 2A is described here.
[0016]
The black-circle playing style symbols drawn for the three notes in the first measure indicate the “staccato” playing style, and the size of the black circle indicates the volume level.
The playing style symbol written as “Attack-Mid, No-Vib” for the next note describes the playing style “medium attack, no vibrato”.
The playing style symbol written as “Atk-Fast, Vib-Soon-Fast, Release-Smoothly” for the notes connected by the slur in the latter half of the second measure describes the playing style “the attack rises quickly, the vibrato soon becomes fast, and the release is smooth”.
A playing style symbol consisting of an oval black circle in the third measure indicates a “tenuto” playing style. In the third measure, a performance style symbol indicating that the volume is gradually reduced and a performance style symbol indicating that vibrato is added to the end of the sound are also described.
As described above, it can be understood that various playing styles or performance expressions, that is, articulations are used even for a musical phrase having a length of about three measures.
Note that the manner of representing these performance style symbols is not limited to this, but it is sufficient that the performance style is expressed in some way. Symbols expressing a certain degree of playing style are also used in conventional music notation, but in practicing the present invention, it is desirable to employ more precise performance style symbols than ever before.
[0017]
In FIG. 1, the next step S2 divides the sampled series of performance sounds into a plurality of variable-length time sections according to the characteristics of the performance expression (that is, the articulation). This is completely different from dividing and analyzing waveform data into regular, fixed-length time frames, as is known from Fourier analysis, for example. Because the articulations present in the sampled series of performance sounds are diverse, the time range of the sound corresponding to each articulation is not of uniform length but of arbitrary, variable length. Consequently, when the sampled series of performance sounds is divided into time sections according to the characteristics of the performance expression (that is, the articulation), the length of each resulting time section is variable.
[0018]
FIGS. 2B, 2C, and 2D illustrate an example of such a division into time sections hierarchically. FIG. 2B shows an example of division into relatively large chunks of articulation (for convenience called “articulation large units” and indicated by the symbols AL#1, AL#2, AL#3, and AL#4). Such articulation large units may be delimited, for example, in units of short phrasings that share a broadly common performance expression. FIG. 2C shows an example in which one articulation large unit (AL#3 in the figure) is further divided into articulation medium units (for convenience indicated by the symbols AM#1 and AM#2). The articulation medium units AM#1 and AM#2 are delimited, for example, roughly in units of one sound. FIG. 2D shows an example in which each articulation medium unit (AM#1 and AM#2 in the figure) is further divided into articulation minimum units (for convenience indicated by the symbols AS#1 to AS#8). The articulation minimum units AS#1 to AS#8 correspond to the parts of a sound that differ in performance expression, typically an attack part, a body part (a relatively stable part showing the steady characteristics of the sound), a release part, and the connection between sounds.
[0019]
For example, AS#1, AS#2, and AS#3 correspond respectively to the attack part, first body part, and second body part of the one sound (the preceding sound of the slur) constituting the articulation medium unit AM#1, and AS#5, AS#6, AS#7, and AS#8 correspond respectively to the first body part, second body part, third body part, and release part of the one sound (the following sound of the slur) constituting the next articulation medium unit AM#2. There are a plurality of body parts, such as the first and second body parts, because the articulation may differ from part to part even within the body of the same sound (for example, the vibrato speed may change). AS#4 corresponds to the connection between sounds produced by the slur. This part AS#4 may be extracted from either of the two articulation medium units AM#1 and AM#2 (from the end part of AM#1 or the beginning part of AM#2). Alternatively, the part AS#4 connecting the sounds by the slur may be extracted as an articulation medium unit in its own right from the start. In that case, the articulation large unit AL#3 is divided into three articulation medium units, and the middle one, that is, the connection between sounds, corresponds directly to the articulation minimum unit AS#4. When the slur connection part AS#4 is extracted on its own in this way, it can also be applied to the connection between other sounds, so that those sounds too can be joined by a slur.
[0020]
The minimum articulation units AS#1 to AS#8 shown in FIG. 2D correspond to the plurality of time sections divided in the process of step S2. Hereinafter, such a minimum articulation unit is also referred to as an articulation element. Note that the way of dividing into minimum articulation units is not limited to the above example; accordingly, a minimum articulation unit, that is, an articulation element, does not necessarily correspond only to a part of a sound.
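The three-level division described above can be pictured as a nested data structure. The following is only an illustrative sketch: the class names, the `role` labels, and the choice to attach AS#4 to the end of AM#1 are assumptions made for the example and are not taken from the figures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArticulationElement:
    """Minimum articulation unit (for example AS#1)."""
    name: str
    role: str  # illustrative label only: "attack", "body", "release", "joint"


@dataclass
class MiddleUnit:
    """Middle articulation unit, roughly one sound (for example AM#1)."""
    name: str
    elements: List[ArticulationElement] = field(default_factory=list)


@dataclass
class LargeUnit:
    """Large articulation unit, a small phrasing unit (for example AL#3)."""
    name: str
    middle_units: List[MiddleUnit] = field(default_factory=list)

    def elements_in_order(self) -> List[str]:
        """Return the articulation element names in time order."""
        return [e.name for m in self.middle_units for e in m.elements]


def build_al3() -> LargeUnit:
    # AS#4 (the slur joint) is attached to the end of AM#1 here; the text
    # equally allows attaching it to AM#2 or making it its own middle unit.
    am1 = MiddleUnit("AM#1", [
        ArticulationElement("AS#1", "attack"),
        ArticulationElement("AS#2", "body"),
        ArticulationElement("AS#3", "body"),
        ArticulationElement("AS#4", "joint"),
    ])
    am2 = MiddleUnit("AM#2", [
        ArticulationElement(f"AS#{n}", role)
        for n, role in [(5, "body"), (6, "body"), (7, "body"), (8, "release")]
    ])
    return LargeUnit("AL#3", [am1, am2])
```

Flattening the structure with `build_al3().elements_in_order()` yields the time-ordered list of articulation elements, which is the form in which they are later analyzed.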
[0021]
In FIG. 1, the next step S3 analyzes the waveform data of each divided time section (each minimum articulation unit AS#1 to AS#8, that is, each articulation element) with respect to a plurality of predetermined musical tone elements, and generates data indicating the characteristics of each analyzed tone element. The tone elements to be analyzed include, for example, waveform (tone color), amplitude (volume), pitch, and time. These tone elements are components of the waveform data in the time section and also components of the articulation in the time section.
In the next step S4, the generated data indicating the characteristics of each element is stored in the database. In the database, these stored data are made available as template data at the time of musical tone synthesis.
An example of how to analyze these tone elements is as follows, and FIG. 3 shows an example of data (template data) indicating the characteristics of each tone element. FIG. 2E also shows examples of the types of each tone element analyzed from one minimum unit of articulation.
[0022]
(1) For the waveform (tone color) element, the original PCM waveform data in the time section (articulation element) is taken out as it is. This is stored in the database as a waveform template (Timbre template). "Timbre" is used as the symbol indicating the waveform (tone color) element.
(2) For the amplitude (volume) element, the volume envelope (change in volume amplitude over time) of the original PCM waveform data in the time section (articulation element) is extracted to obtain amplitude envelope data. This is stored in the database as an amplitude template (Amp template). "Amp" (abbreviation of Amplitude) is used as the symbol indicating the amplitude (volume) element.
(3) For the pitch element, the pitch envelope (change in pitch over time) of the original PCM waveform data in the time section (articulation element) is extracted to obtain pitch envelope data. This is stored in the database as a pitch template (Pitch template). "Pitch" is used as the symbol indicating the pitch element.
[0023]
(4) For the time element, the time length of the original PCM waveform data in the time section (articulation element) is used as it is. Therefore, if the original time length (a variable value) of the section is expressed as the ratio "1", there is no need to analyze and measure this time length when creating the database. In this case, the data of the time element, that is, the time template (TSC template), has the same value "1" for every section (articulation element), so it need not be stored in the template database. Of course, the present invention is not limited to this, and a modification in which the actual time length is analyzed, measured, and stored in the database as time template data is also possible.
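The analysis of items (1), (2), and (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis method of the embodiment: the envelope is taken as the peak absolute sample value per fixed-size frame, the function names and the frame size are hypothetical, and pitch envelope extraction (item (3)) is omitted because it requires a pitch tracker.

```python
import math


def amplitude_envelope(pcm, frame):
    """Peak absolute sample value per frame (a crude Amp template)."""
    return [max(abs(s) for s in pcm[i:i + frame])
            for i in range(0, len(pcm), frame)]


def analyze_element(pcm, frame=100):
    """Build a template record for one articulation element."""
    return {
        "Timbre": list(pcm),                    # original PCM, taken as is
        "Amp": amplitude_envelope(pcm, frame),  # volume change over time
        "TSC": 1.0,                             # original length = ratio 1
    }


def decaying_sine(n=1000, period=20):
    """Test signal: a sine whose amplitude falls linearly toward zero."""
    return [(1 - i / n) * math.sin(2 * math.pi * i / period)
            for i in range(n)]
```

Calling `analyze_element(decaying_sine())` gives an "Amp" list of ten values that fall steadily, reflecting the decay of the test signal, while "Timbre" simply holds the unmodified samples.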
[0024]
As a technique for variably controlling the original time length of waveform data, the present inventors have already proposed, although it has not yet been published, "Time Stretch & Compress" control (abbreviated "TSC control"), which expands or compresses waveform data along the time axis without affecting its pitch. The present embodiment uses such TSC control, and the symbol "TSC" for the time element is this abbreviation. At the time of musical tone synthesis, the time length of the reproduced waveform signal can be variably controlled by setting the TSC value to another appropriate value instead of fixing it at "1". In that case, the TSC value may be given as a time-varying value (for example, an appropriate time function such as an envelope). This TSC control is useful, for example, for variably controlling the time length of a portion of the original waveform to which a special playing style such as vibrato or slur has been applied.
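The idea of changing the time length without changing the pitch can be illustrated with a deliberately naive stand-in: repeating or skipping whole waveform periods. This is not the inventors' TSC control, whose details are not given here; the function name, the fixed `period` argument, and the constant `tsc` ratio are assumptions made for the sketch.

```python
def stretch_by_periods(pcm, period, tsc):
    """Scale the length of a periodic waveform by `tsc` (1.0 = unchanged).

    Whole periods are repeated (tsc > 1) or skipped (tsc < 1), so the
    samples inside each period, and hence the pitch, stay the same.
    """
    if tsc <= 0 or period <= 0:
        raise ValueError("tsc and period must be positive")
    n_in = len(pcm) // period              # whole periods available
    n_out = max(1, round(n_in * tsc))      # whole periods to emit
    out = []
    for j in range(n_out):
        src = min(n_in - 1, int(j / tsc))  # source period for output period j
        out.extend(pcm[src * period:(src + 1) * period])
    return out
```

With a six-period input, `tsc=2.0` emits twelve periods and `tsc=0.5` emits three, while `tsc=1.0` returns the input unchanged. A practical implementation would also need to handle time-varying TSC values and smooth the joins between periods.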
[0025]
The processing described above is performed for various natural musical instruments in various playing styles (for various musical phrases), and templates for each tone element are created for a large number of articulation elements of each natural instrument and stored in a database. Not only natural instruments but also various sounds existing in nature, such as human voices and thunder, may be sampled and analyzed for articulation as described above, and the resulting template data for each element may be stored in the database. Of course, the phrase played live for sampling is not limited to a phrase of several measures as in the above example; it may be shorter as needed (for example, a single small phrasing unit as shown in FIG. 2B) or, conversely, an entire piece of music.
[0026]
The configuration of the database DB is roughly classified into a template database TDB and an articulation database ADB, for example, as shown in FIG. As the hardware of the database DB, a readable / writable storage medium (preferably a large-capacity medium) such as a hard disk device or a magneto-optical disk device is used as is well known.
The template database TDB stores the large number of template data created as described above. Note that the template data stored in the template database TDB need not all be based on sampling and analysis of performance sounds or natural sounds as described above. What matters is that they are prepared in advance as templates (ready-made data), and they may also be created arbitrarily and artificially by data editing operations. For example, the TSC template for the time element is normally "1" as described above as long as it is based on a sampled performance sound, but it can be created with any desired change pattern (envelope); various TSC values, or envelope waveforms of their change over time, may thus be created as TSC template data and stored in the database. Also, the types of template stored in the template database TDB are not limited to those corresponding to the specific elements analyzed from the original waveform as described above, and other types may be added as appropriate for convenience in musical tone synthesis. For example, when tone color control is performed with a filter at the time of musical tone synthesis, a large number of filter coefficient sets (including time-varying filter coefficient sets) may be prepared as template data and stored in the template database TDB. Of course, such filter coefficient sets may be created based on analysis of the original waveform or by other appropriate means.
[0027]
Each template stored in the template database TDB consists of data representing the content of the template itself, as illustrated in FIG. For example, the waveform (Timbre) template is the PCM waveform data itself. Envelope waveforms such as the amplitude (Amp) envelope, pitch (Pitch) envelope, and TSC envelope may likewise be stored by PCM-encoding the envelope shape. However, to reduce the storage required for envelope waveform templates in the template database TDB, these template data may instead be stored in the form of parameter data for approximating the envelope waveform by a broken line (as is well known, data indicating the slope or rate of each line segment and its target level or time).
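Storing an envelope as broken-line parameters and expanding it at synthesis time can be sketched as below. The parameter layout (a start level followed by pairs of target level and segment length in samples) is an assumption chosen for the example; the text only says that slope or rate and target level or time are stored.

```python
def expand_envelope(start_level, segments):
    """Expand broken-line parameters into one envelope value per sample.

    `segments` is a list of (target_level, n_samples) pairs; each segment
    moves linearly from the current level to its target level.
    """
    out = []
    level = start_level
    for target, n in segments:
        if n <= 0:
            raise ValueError("segment length must be positive")
        for k in range(1, n + 1):
            out.append(level + (target - level) * k / n)
        level = target  # the next segment starts where this one ended
    return out
```

For instance, `expand_envelope(0.0, [(1.0, 4), (0.5, 2)])` produces a four-sample rise to 1.0 followed by a two-sample fall to 0.5, so six envelope samples are described by three stored numbers plus two segment lengths.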
[0028]
The waveform (Timbre) template may also be stored in an appropriately compressed data format rather than as raw PCM waveform data, or in some other appropriate data format. That is, the waveform (Timbre) template data may be waveform data in a compression coding format other than PCM, such as DPCM or ADPCM, or may be waveform forming data that does not directly indicate waveform sample values, that is, parameters for waveform synthesis. Any waveform synthesis method using such parameters may be employed, such as Fourier synthesis, FM (frequency modulation) synthesis, AM (amplitude modulation) synthesis, a physical model sound source, or SMS waveform synthesis (waveform synthesis using a deterministic component and a stochastic component), and the corresponding waveform synthesis parameters may be stored in the database as waveform (Timbre) template data. In that case, the waveform forming process based on the waveform (Timbre) template data, that is, on the waveform synthesis parameters, is of course performed by a corresponding waveform synthesis arithmetic unit or program. A plurality of waveform synthesis parameter sets, each forming a waveform of a desired shape, may also be stored for one articulation element, that is, one time section, and the parameter set used for waveform synthesis may be switched as time passes, thereby realizing a change of waveform shape over time within one articulation element.
[0029]
Further, even when a waveform (Timbre) template is stored as PCM waveform data, if the known loop reading technique can be employed (for example, for waveform data of a part, such as a body part, whose tone color waveform is stable and changes little over time), only a part of the waveform of the section may be stored rather than the entire waveform. When the contents of template data obtained by sampling and analysis for different time sections, that is, for different articulation elements, are identical or similar, only one copy of the template data is stored in the database TDB instead of storing each separately, and that copy is shared at the time of musical tone synthesis, so that the storage capacity of the database TDB can be saved. Further, the template database TDB may include a preset area created in advance by the supplier of the basic database (for example, an electronic musical instrument maker), a user area to which the user can add data, and the like.
[0030]
The articulation database ADB stores, in correspondence with various performance cases and playing styles, data describing articulations (that is, data describing a continuous performance as a combination of one or more articulation elements, and data describing each articulation element) for constructing performances that include one or more articulations.
The block in FIG. 4 illustrates the database configuration for a certain instrument sound named "Instrument 1". The articulation element sequence AESEQ describes a performance phrase including one or more articulations (that is, an articulation performance phrase) in the form of sequence data that designates one or more articulation elements in order. For example, an articulation element sequence corresponds to the time-series order of the minimum articulation units (articulation elements) shown in FIG. 2D, as analyzed in the sampling and analysis process. A large number of articulation element sequences AESEQ are stored so as to cover the various playing styles that may occur when playing the instrument sound. Note that one articulation element sequence AESEQ may correspond to one of the "small phrasing units" (large articulation units AL#1, AL#2, AL#3, AL#4) shown in FIG. 2B or to several of them, or it may correspond to one of the "middle articulation units" (AM#1, AM#2) shown in FIG. 2C or to several of them.
[0031]
The articulation element vector AEVQ stores, for every articulation element prepared (stored) in the template database TDB for the musical instrument sound (Instrument 1), an index of the template data for each tone element, in the form of vector data designating each template (for example, in the form of address data for retrieving the required template from the template database TDB). For example, as in FIGS. 2D and 2E, for a given articulation element AS#1, vector data specifically designating the four templates Timbre, Amp, Pitch, and TSC, one for each of the elements (waveform, amplitude, pitch, and time) constituting the partial tone of that articulation element, is stored (this is referred to as an element vector).
[0032]
In one articulation element sequence (playing style sequence) AESEQ, the indexes of a plurality of articulation elements are described in performance order, and the set of templates constituting each articulation element described therein can be derived by referring to the articulation element vector AEVQ.
FIG. 5A shows an example of some articulation element sequences AESEQ#1 to AESEQ#7. To explain how to read this figure: for example, AESEQ#1 = (ATT-Nor, BOD-Vib-nor, BOD-Vib-dep1, BOD-Vib-dep2, REL-Nor) indicates that the sequence AESEQ#1 of sequence number 1 consists of a sequence of the five articulation elements ATT-Nor, BOD-Vib-nor, BOD-Vib-dep1, BOD-Vib-dep2, and REL-Nor. The meaning of the index symbol of each articulation element is as follows.
[0033]
ATT-Nor indicates "attack normal" (a playing style in which the attack part rises in a standard manner).
BOD-Vib-nor indicates "body vibrato normal" (a playing style in which a standard vibrato is applied to the body part).
BOD-Vib-dep1 indicates "body vibrato depth 1" (a playing style in which a vibrato one step deeper than standard is applied to the body part).
BOD-Vib-dep2 indicates "body vibrato depth 2" (a playing style in which a vibrato two steps deeper than standard is applied to the body part).
REL-Nor indicates "release normal" (a playing style in which the release part falls in a standard manner).
[0034]
Thus, the sequence AESEQ#1 corresponds to an articulation that starts with a normal attack, applies a normal vibrato at the start of the body, then makes the vibrato a little deeper, then deeper still, and finally shows a standard fall of the sound in the release part.
The articulations of the other sequences AESEQ#2 to AESEQ#6 shown by way of example can be understood in the same way from the symbols of the articulation elements in FIG. 5A.
The meaning of the symbols of some other articulation elements shown in FIG. 5A will be described below for reference.
[0035]
BOD-Vib-spd1 indicates "body vibrato speed 1" (a playing style in which a vibrato one step faster than the standard is added to the body part).
BOD-Vib-spd2 indicates "body vibrato speed 2" (a playing style in which a vibrato two steps faster than the standard is attached to the body part).
BOD-Vib-d & s1 indicates "Body Vibrato Depth & Speed 1" (a playing style in which the depth and speed of the vibrato attached to the body part are each raised by one step from the standard).
BOD-Vib-bri indicates "body vibrato brilliant" (playing style in which vibrato is attached to the body portion and its tone is flashy).
BOD-Vib-mld1 indicates "Body Vibrato Mild 1" (playing style in which vibrato is added to the body and the tone is slightly milder).
BOD-Cre-nor indicates "body normal crescendo" (a technique of attaching a standard crescendo to the body).
BOD-Cre-vol1 indicates "body crescendo volume 1" (playing style in which the volume of the crescendo attached to the body part is raised by one level).
ATT-Bup-nor indicates "attack / bend-up normal" (playing method in which the pitch of the attack portion is bent up at a standard depth and speed).
REL-Bdw-nor indicates "release / bend down normal" (playing method in which the pitch of the release portion is bent down at a standard depth and speed).
[0036]
Thus, the sequence AESEQ#2 corresponds to an articulation (playing style) that starts with a normal attack, applies a normal vibrato at the start of the body, then makes the vibrato a little faster, then faster still, and finally shows a standard fall of the sound in the release part.
The sequence AESEQ#3 corresponds to an articulation (playing style) in which the vibrato depth is gradually increased and its speed is also gradually increased.
The sequence AESEQ#4 corresponds to an articulation (playing style) that changes the sound quality (tone color) of the waveform during vibrato.
The sequence AESEQ#5 corresponds to an articulation (playing style) with a crescendo.
The sequence AESEQ#6 corresponds to an articulation (playing style) in which the pitch of the attack part is bent up (the pitch gradually rises).
The sequence AESEQ#7 corresponds to an articulation (playing style) in which the pitch of the release part is bent down (the pitch gradually falls).
The articulation element sequence (reproduction style sequence) is not limited to the above, and may have many more types.
[0037]
FIG. 5B shows a configuration example of the articulation element vector AEVQ for some articulation elements. To explain how to read this figure: the vector data designating the template for each element is listed in parentheses. The leading symbol of each vector data indicates the type of template. That is, Timb indicates a waveform (Timbre) template, Amp indicates an amplitude (Amp) template, Pit indicates a pitch (Pitch) template, and TSC indicates a time (TSC) template.
[0038]
For example, ATT-Nor = (Timb-A-nor, Amp-A-nor, Pit-A-nor, TSC-A-nor) indicates that the waveform of the articulation element ATT-Nor, meaning "attack normal", is synthesized from the four templates Timb-A-nor (standard waveform template for the attack part), Amp-A-nor (standard amplitude template for the attack part), Pit-A-nor (standard pitch template for the attack part), and TSC-A-nor (standard TSC template for the attack part).
[0039]
As another example, the waveform of the articulation element BOD-Vib-dep1, meaning "body vibrato depth 1", is synthesized from the four templates Timb-B-vib (waveform template for vibrato in the body part), Amp-B-dp3 (amplitude template for vibrato depth 3 in the body part), Pit-B-dp3 (pitch template for vibrato depth 3 in the body part), and TSC-B-vib (TSC template for vibrato in the body part).
As a further example, the waveform of the articulation element REL-Bdw-nor, meaning "release bend-down normal", is synthesized from the four templates Timb-R-bdw (waveform template for bend-down in the release part), Amp-R-bdw (amplitude template for bend-down in the release part), Pit-R-bdw (pitch template for bend-down in the release part), and TSC-R-bdw (TSC template for bend-down in the release part).
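The two-stage lookup from a sequence to element vectors can be sketched with plain dictionaries. The three element vectors below use the template indexes quoted in the text; the sequence name "demo" and its contents are hypothetical, chosen only so that every element in it has a vector given above.

```python
from typing import Dict, List, NamedTuple


class ElementVector(NamedTuple):
    """Indexes of the four templates that make up one articulation element."""
    timbre: str
    amp: str
    pitch: str
    tsc: str


# Articulation element vectors (AEVQ), keyed by articulation element index.
AEVQ: Dict[str, ElementVector] = {
    "ATT-Nor": ElementVector("Timb-A-nor", "Amp-A-nor", "Pit-A-nor", "TSC-A-nor"),
    "BOD-Vib-dep1": ElementVector("Timb-B-vib", "Amp-B-dp3", "Pit-B-dp3", "TSC-B-vib"),
    "REL-Bdw-nor": ElementVector("Timb-R-bdw", "Amp-R-bdw", "Pit-R-bdw", "TSC-R-bdw"),
}

# Articulation element sequences (AESEQ); "demo" is a made-up example.
AESEQ: Dict[str, List[str]] = {
    "demo": ["ATT-Nor", "BOD-Vib-dep1", "REL-Bdw-nor"],
}


def resolve_sequence(seq_name: str) -> List[ElementVector]:
    """Return the element vectors of a sequence in performance order."""
    return [AEVQ[index] for index in AESEQ[seq_name]]
```

Each `ElementVector` returned by `resolve_sequence("demo")` then tells the synthesizer which four templates to fetch from the template database for that element.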
[0040]
To facilitate editing of the articulation elements, it is preferable to store attribute information ATR, which roughly describes the characteristics of each articulation element sequence, in association with each articulation element sequence AESEQ. Similarly, it is preferable to store attribute information ATR, which roughly describes the characteristics of each articulation element, in association with each articulation element vector AEVQ.
In short, such attribute information ATR describes the characteristics of each articulation element (the minimum articulation unit as shown in FIG. 2D). FIG. 6 shows, for several articulation elements relating to the attack part, examples of the articulation element symbol (index), the content of its attribute information ATR, and the vector data designating the template of each tone element.
[0041]
In the example of FIG. 6, the attribute information ATR is also managed hierarchically. That is, all articulation elements relating to the attack part are given the common attribute information "attack". Among them, standard elements are further given the attribute information "normal", elements to which the bend-up playing style is applied are given the attribute information "bend-up", and elements to which the bend-down playing style is applied are given the attribute information "bend-down". Further, among the elements to which the bend-up playing style is applied, standard ones are given the attribute information "normal", those whose bend depth is shallower than standard are given "depth/shallow", those whose bend depth is deeper than standard are given "depth/deep", those whose bend speed is slower than standard are given "speed/slow", and those whose bend speed is faster than standard are given "speed/fast". Although not illustrated, similarly subdivided attribute information is given to the elements to which the bend-down playing style is applied.
[0042]
FIG. 6 also shows that some template data is shared between different articulation elements. In FIG. 6, the vector data of the four templates (in other words, the template indexes) listed in the column for each playing style index (articulation element index) designates the templates used to form the partial tone of that articulation element, and is read in the same way as in FIG. 5B. Here, among the elements having the bend-up attribute, an entry marked with "=" means that the same template as for bend-up normal is used. For example, the bend-up playing styles use the same waveform (Timbre) template as the bend-up normal waveform template Timb-A-bup, and the same amplitude (Amp) template as the bend-up normal amplitude template Amp-A-bup. This is because, even if the bend-up playing style varies subtly, the waveform and the amplitude envelope can be used in common without any problem in terms of sound quality. The pitch (Pitch) template, on the other hand, must differ according to the depth of the bend-up playing style. For example, for the articulation element ATT-Bup-dp1 having the attribute "depth/shallow", the vector data Pit-A-dp1, which designates a pitch envelope template corresponding to a shallow bend-up characteristic, is used.
[0043]
By sharing the template data in this manner, the storage amount of the template database TDB can be saved. Also, when creating a database, it is not necessary to record live performances for all playing styles.
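Resolving shared entries of this kind can be sketched as follows. The indexes Timb-A-bup, Amp-A-bup, and Pit-A-dp1 come from the text; the indexes Pit-A-bup and TSC-A-bup, and the assumption that the shallow variant also shares the TSC template, are hypothetical and serve only to complete the example.

```python
# Element that the "=" entries refer back to.
NORMAL = "ATT-Bup-nor"

# (Timbre, Amp, Pitch, TSC) template indexes; "=" means "same as NORMAL".
TABLE = {
    "ATT-Bup-nor": ("Timb-A-bup", "Amp-A-bup", "Pit-A-bup", "TSC-A-bup"),
    "ATT-Bup-dp1": ("=", "=", "Pit-A-dp1", "="),
}


def resolve_shared(index):
    """Replace every "=" entry with the corresponding NORMAL template."""
    base = TABLE[NORMAL]
    return tuple(b if v == "=" else v for v, b in zip(TABLE[index], base))


def unique_templates():
    """Count distinct templates that actually need to be stored."""
    return len({t for index in TABLE for t in resolve_shared(index)})
```

Here the two elements together reference eight template slots but only five distinct templates, which is the storage saving the sharing scheme aims at.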
Referring to FIG. 6, it can be understood that the speed of the bend-up playing style is adjusted by changing the time (TSC) template. The pitch bend speed corresponds to the time required to reach the target pitch from a predetermined initial pitch. Therefore, if the original waveform data has a predetermined pitch bend characteristic (bending from a predetermined initial pitch to a target pitch within a certain time) and its time length is variably controlled by TSC control, the time required to reach the target pitch from the initial pitch, that is, the bend speed, can be adjusted. Such variable control of the waveform time length using the time (TSC) template is suitable for adjusting the speed of various playing styles, such as the speed of tone rise, the speed of a slur, and the speed of vibrato. For example, the pitch change in a slur can also be realized with a pitch (Pitch) template, but a more natural slur change can be realized by TSC control using a time (TSC) template.
[0044]
The articulation element vectors AEVQ in the articulation database ADB can of course be addressed by articulation element index, and can also be addressed by attribute information ATR. Thus, by searching the articulation database ADB using desired attribute information ATR as a keyword, the user can find out which articulation elements have attributes matching the keyword, which is convenient for data editing by the user. Such attribute information ATR may also be attached to the articulation element sequences AESEQ. In that case, by searching the articulation database ADB using desired attribute information ATR as a keyword, the user can find the articulation element sequences AESEQ that include an articulation element having an attribute matching the keyword.
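A keyword search over attribute information can be sketched with sets. The attribute strings follow the FIG. 6 description; the entry for REL-Nor, the sequence named "demo-bend", and the rule that a query matches when all of its keywords are present are assumptions made for the example.

```python
# Attribute information ATR per articulation element.
ATR = {
    "ATT-Nor": {"attack", "normal"},
    "ATT-Bup-nor": {"attack", "bend-up", "normal"},
    "ATT-Bup-dp1": {"attack", "bend-up", "depth/shallow"},
    "REL-Nor": {"release", "normal"},  # hypothetical entry
}

# Articulation element sequences; "demo-bend" is a made-up example.
SEQUENCES = {
    "AESEQ#1": ["ATT-Nor", "BOD-Vib-nor", "BOD-Vib-dep1",
                "BOD-Vib-dep2", "REL-Nor"],
    "demo-bend": ["ATT-Bup-nor", "REL-Nor"],
}


def search_elements(keywords):
    """Elements whose attributes include every keyword, sorted by index."""
    wanted = set(keywords)
    return sorted(name for name, attrs in ATR.items() if wanted <= attrs)


def search_sequences(keywords):
    """Sequences containing at least one element that matches the keywords."""
    hits = set(search_elements(keywords))
    return sorted(name for name, elems in SEQUENCES.items()
                  if hits & set(elems))
```

Searching for `{"attack", "bend-up"}` narrows the attack elements to the bend-up family, and the same keywords applied to sequences list only those sequences that use such an element.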
The articulation element index used to address the articulation element vector AEVQ in the articulation database ADB is supplied as the articulation element sequence AESEQ is read out. Preferably, a desired articulation element index can also be input individually, for editing work or for free sound creation in real time.
[0045]
The articulation database ADB also has an area for storing a user articulation element sequence URSEQ so that a user can create and store a desired articulation element sequence. In such a user area, articulation element vector data created by the user may be stored.
The articulation database ADB also stores partial vectors PVQ as vector data subordinate to the articulation element vectors AEVQ. When the template data designated by an articulation element vector AEVQ is stored in the template database TDB not as data for the entire time section of the articulation element but only as data for part of it, that partial template data is loop-read (read repeatedly) to reproduce data for the entire time section of the articulation element. The data required for such loop reading is stored as a partial vector PVQ. In this case, for example, the articulation element vector AEVQ stores, in addition to the template-designating data, data designating a partial vector PVQ; the partial vector PVQ data is read out according to this designating data, and loop reading is controlled according to the partial vector PVQ data. The partial vector PVQ therefore includes data such as the loop start address and loop end address needed for loop read control.
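Loop reading controlled by a loop start address and a loop end address can be sketched as below. The function name and the convention that the end address is exclusive are assumptions for the example; the text only states that such addresses are part of the partial vector.

```python
def loop_read(data, loop_start, loop_end, n_samples):
    """Read `n_samples` values from partially stored template data.

    Reading starts at index 0; whenever the read position reaches
    `loop_end` (exclusive), it jumps back to `loop_start`.
    """
    if not 0 <= loop_start < loop_end <= len(data):
        raise ValueError("invalid loop addresses")
    out = []
    pos = 0
    for _ in range(n_samples):
        out.append(data[pos])
        pos += 1
        if pos >= loop_end:
            pos = loop_start
    return out
```

For example, `loop_read([10, 11, 12, 13, 14], 2, 5, 9)` returns the five stored values once and then keeps cycling through the last three, so nine samples are reproduced from five stored ones.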
[0046]
Further, the articulation database ADB stores rule data RULE that describes rules for connecting the waveform data of temporally adjacent articulation elements during musical tone synthesis. For example, rules such as whether the waveforms of temporally adjacent articulation elements are to be connected smoothly by cross-fade interpolation or connected directly without it, and, when cross-fade interpolation is used, which cross-fade method is to be used, are stored for each sequence or for each articulation element in a sequence. These connection rules can also be edited by the user.
An articulation database ADB having the data structure exemplified above is provided for each musical instrument sound (natural musical instrument tone color), for each of various human voices (young female voice, young male voice, baritone, soprano, etc.), and for each of various natural sounds (thunder, the sound of waves, etc.).
[0047]
[Outline of music synthesis]
FIG. 7 shows an outline of a procedure for synthesizing a musical tone using the database DB created as described above.
First, a rendition style sequence corresponding to the musical performance to be generated (a performance phrase consisting of a plurality of sounds, or a single sound) is designated (step S11). This designation may be made by selecting one of the articulation element sequences AESEQ or URSEQ stored in the articulation database ADB for the desired musical instrument sound (or human voice or natural sound).
[0048]
Such a rendition style sequence (that is, articulation element sequence) may be designated based on a real-time performance operation by the user, or based on automatic performance data. In the former case, for example, various rendition style sequences may be assigned in advance to a keyboard or other performance operators, and the rendition style sequence designation data assigned to an operator is generated when the user operates it. In the latter case, as one technique, as schematically shown in FIG. 8A, the rendition style sequence designation data may be embedded as event data in automatic performance sequence data, in MIDI format or the like, corresponding to the desired music, so that the designation data is read out at the predetermined event timing during automatic performance reproduction. In FIG. 8, DUR is duration data indicating the time interval until the next event, EVENT is event data, MIDI indicates that the performance data attached to the event data is in MIDI format, and AESEQ indicates that the performance data attached to the event data is rendition style sequence designation data. In this case, an ensemble of an automatic performance based on automatic performance data in MIDI format or the like and an automatic performance based on the rendition style sequence according to the present invention can be performed. For example, the main solo or melody instrument part may be performed by the rendition style sequence according to the present invention, that is, by articulation element synthesis, while the other instrument parts are performed automatically based on MIDI data.
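A mixed event list of the kind outlined for FIG. 8A can be sketched as follows. The record layout, the tick unit, and the payload strings are hypothetical; only the roles of DUR (interval until the next event) and of the two payload kinds, MIDI and AESEQ, come from the text.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    kind: str     # "MIDI" or "AESEQ"
    payload: str  # MIDI performance data or a sequence designation
    dur: int      # DUR: ticks until the next event


def schedule(events: List[Event]) -> List[Tuple[int, str, str]]:
    """Convert DUR intervals into absolute event times."""
    t = 0
    timed = []
    for ev in events:
        timed.append((t, ev.kind, ev.payload))
        t += ev.dur
    return timed


def split_streams(events: List[Event]):
    """Route MIDI events and AESEQ designations to separate players."""
    timed = schedule(events)
    midi = [(t, p) for t, kind, p in timed if kind == "MIDI"]
    aeseq = [(t, p) for t, kind, p in timed if kind == "AESEQ"]
    return midi, aeseq
```

In such an arrangement the MIDI stream would drive conventional automatic performance parts while the AESEQ stream triggers articulation element synthesis for, say, the solo part.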
[0049]
As another technique for the latter case, as schematically shown in FIG. 8B, only a plurality of rendition style sequence designation data AESEQ may be stored in event data format corresponding to the desired music and read out at the predetermined reproduction timing of each event. This enables automatic performance of a piece of music by an articulation sequence, which has not been possible in the past.
As yet another technique for the latter case, only automatic performance sequence data in MIDI format or the like corresponding to the desired music may be stored; a performance interpretation program then analyzes this data to determine automatically the articulation of each phrase or note, and rendition style sequence designation data is generated from the result of the analysis.
Further, as another method of designating the rendition style sequence, the user may input one or more pieces of desired attribute information, the articulation database ADB may be searched using that information as the search key so that one or more matching articulation element sequences AESEQ are automatically listed, and a desired sequence may then be selected and designated from the list.
[0050]
In FIG. 7, in the selected articulation element sequence AESEQ or URSEQ, an articulation element (AE) index is read out according to a predetermined performance order (step S12).
Then, an articulation element vector (AEVQ) corresponding to the read articulation element (AE) index is read (step S13).
Then, each template data designated by the read articulation element vector (AEVQ) is read from the template database TDB (step S14).
[0051]
Then, waveform data (a partial tone) for one articulation element (AE) is synthesized according to the template data thus read (step S15). Basically, the PCM waveform data corresponding to the waveform (Timbre) template data is read from the template database TDB at a read speed according to the pitch (Pitch) template and over a time length according to the time (TSC) template, and the amplitude envelope of the read PCM waveform data is controlled according to the amplitude (Amp) template. In this embodiment, the waveform (Timbre) template data stored in the template database TDB retains the pitch, amplitude envelope, and time length of the sampled original waveform. Therefore, if the pitch (Pitch), amplitude (Amp), and time (TSC) templates are unchanged from those of the sampled original waveform, the PCM waveform data corresponding to the waveform (Timbre) template data is simply read from the template database TDB as it is and becomes the waveform data for the articulation element. When any of the pitch (Pitch), amplitude (Amp), and time (TSC) templates has been changed from that of the sampled original waveform, by the data editing described later or otherwise, the read speed of the waveform (Timbre) template data stored in the template database TDB is variably controlled (if the pitch template has been changed), the read time length is variably controlled (if the time template has been changed), or the amplitude envelope applied to the read waveform is variably controlled (if the amplitude template has been changed), in accordance with the change.
When the above-described partial vector PVQ is applied to the articulation element AE, necessary loop read control is also performed.
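The synthesis of step S15 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the function name `synthesize_element`, the use of plain Python lists as templates, the reduction of the pitch template to a single read-speed ratio, the reduction of the time (TSC) template to an output length in samples, and the truncating, wrapping read pointer are all hypothetical simplifications.

```python
def synthesize_element(timbre, pitch_ratio=1.0, amp_gain=None, length=None):
    """Sketch of step S15 for one articulation element.

    timbre      -- PCM samples of the waveform (Timbre) template
    pitch_ratio -- read-speed multiplier standing in for the pitch template
    amp_gain    -- per-output-sample gains standing in for the amplitude template
    length      -- output length in samples standing in for the time (TSC) template
    """
    if length is None:
        # Unedited time template: read the stored data once at the given speed.
        length = int(len(timbre) / pitch_ratio)
    out = []
    for i in range(length):
        # Truncating read pointer; wrapping is a crude stand-in for time control.
        pos = int(i * pitch_ratio) % len(timbre)
        gain = 1.0 if amp_gain is None else amp_gain[min(i, len(amp_gain) - 1)]
        out.append(timbre[pos] * gain)
    return out


original = [0.0, 1.0, 0.0, -1.0]
unchanged = synthesize_element(original)                   # templates not edited
faster = synthesize_element(original, pitch_ratio=2.0)     # pitch template edited
quieter = synthesize_element(original, amp_gain=[0.5] * 4) # amplitude template edited
```

With all templates left unedited, the sketch returns the stored PCM data as it is, which mirrors the behavior described for unchanged templates.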
[0052]
Next, the waveform data of the articulation elements synthesized as described above are connected sequentially, so that a series of performance sounds composed of a time-series combination of a plurality of articulation elements is generated (step S16). The connection process here is controlled according to rule data RULE stored in the articulation database ADB. For example, when the rule data RULE instructs direct connection, the waveform data of each articulation element synthesized in step S15 may simply be sounded one after another in the order of generation. When the rule data RULE instructs a predetermined cross-fade interpolation, the waveform data at the end of the preceding articulation element and the waveform data at the beginning of the following articulation element are cross-fade interpolated and synthesized in accordance with the specified interpolation format, so that the waveforms are smoothly connected. For example, when the articulation elements are connected exactly as in the sampled original waveform, a smooth connection between them is guaranteed from the outset, so the rule data RULE may instruct direct connection. In other cases, a smooth connection between the articulation elements is not guaranteed, so it is preferable to perform some kind of interpolation synthesis. As described later, any of a plurality of types of cross-fade interpolation formats can be arbitrarily selected by the rule data RULE.
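The two connection modes of step S16 can be illustrated with a small sketch. The function name `connect`, the rule strings "direct" and "crossfade", the `overlap` parameter, and the linear weighting are assumptions made for illustration; the text only states that the rule data RULE selects direct connection or one of several cross-fade interpolation formats.

```python
def connect(prev, nxt, rule="direct", overlap=0):
    """Join two element waveforms according to a (hypothetical) rule string."""
    if rule not in ("direct", "crossfade"):
        raise ValueError("unknown connection rule: " + rule)
    n = min(overlap, len(prev), len(nxt))
    if rule == "direct" or n == 0:
        # Direct connection: simply sound the elements one after another.
        return list(prev) + list(nxt)
    head = list(prev[:len(prev) - n])
    mixed = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # linear fade-in weight of the following element
        mixed.append(prev[len(prev) - n + i] * (1.0 - w) + nxt[i] * w)
    return head + mixed + list(nxt[n:])


direct = connect([1.0, 1.0], [0.0, 0.0])
faded = connect([1.0] * 4, [1.0] * 4, rule="crossfade", overlap=2)
```

In the cross-fade case the end of the preceding element and the beginning of the following element occupy the same output samples, so the result is shorter than the direct concatenation by the overlap length.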
[0053]
A series of performance sound synthesis processing as schematically shown in steps S11 to S16 is performed for one musical instrument sound (or human voice sound or natural sound) in one musical sound synthesis channel. When the performance sound synthesis processing for a plurality of instrument sounds (or human voice sounds or natural sounds) is performed simultaneously, the series of processing shown in steps S11 to S16 is performed on a plurality of channels either in a time-division manner or in parallel. As will be described later, when a musical tone waveform is formed using cross-fade synthesis processing, two waveform generation channels (a channel that generates the waveform to be faded out and a channel that generates the waveform to be faded in) are used for one tone synthesis channel.
[0054]
FIG. 9 schematically shows examples of combinations of articulation elements in several rendition style sequences. Rendition style sequence # 1 shown in (a) is an example of the simplest combination, in which an articulation element A # 1 of an attack portion, an articulation element B # 1 of a body portion, and an articulation element R # 1 of a release portion are sequentially connected, and the connection between the elements is cross-fade interpolated. Rendition style sequence # 2 shown in (b) is an example of an articulation combination in which a decorative sound is added before the main sound: the articulation element A # 2 of the attack portion for the decorative sound, the articulation element B # 2 of the body portion for the decorative sound, the articulation element A # 3 of the attack portion for the main sound, the articulation element B # 3 of the body portion for the main sound, and the articulation element R # 3 of the release portion for the main sound are sequentially connected, and the connection between the elements is cross-fade interpolated. Rendition style sequence # 3 shown in (c) is an example of an articulation combination in which the preceding sound and the following sound are connected by a slur: the articulation element A # 4 of the attack portion for the preceding sound, the articulation element B # 4 of the body portion for the preceding sound, the articulation element B # 5 of the body portion for the slur partial sound, the articulation element B # 6 of the body portion for the following sound, and the articulation element R # 6 of the release portion for the following sound are sequentially connected, and the connection between the elements is cross-fade interpolated.
In the drawing, the partial sound waveform corresponding to each articulation element is schematically shown only by its envelope for convenience, but actually, as described above, it is composed of waveform data synthesized on the basis of the waveform (Timbre), amplitude (Amp), pitch (Pitch), and time (TSC) template data.
[0055]
FIG. 10 is a time chart showing a specific example of a process of sequentially generating partial sound waveforms corresponding to a plurality of articulation elements and cross-fading them in one tone synthesis channel. Specifically, two waveform generation channels are used per tone synthesis channel in order to cross-fade synthesize two element waveforms. FIG. 10A shows an example of waveform generation in the first waveform generation channel, and FIG. 10B shows an example of waveform generation in the second waveform generation channel. In (a) and (b), the "synthesized waveform data" shown in the upper part of each is a partial sound waveform corresponding to an articulation element, that is, waveform data synthesized based on the respective template data such as the waveform (Timbre), amplitude (Amp), pitch (Pitch), and time (TSC) described above (for example, the waveform data synthesized in step S15 of FIG. 7). The "cross-fade control waveform" shown in the lower part is a control waveform used for cross-fading the partial sound waveforms corresponding to the respective elements. This "cross-fade control waveform" is formed, for example, in the process of step S16 in the flow of FIG. 7. Cross-fade synthesis is accomplished by controlling the amplitude of the upper element waveform data of each channel with the lower cross-fade control waveform of that channel, and then adding the amplitude-controlled waveform data of the two channels (the first and second waveform generation channels).
[0056]
When one rendition style sequence is started, a sequence start trigger SST is given, and in response to this, synthesis of the partial sound waveform corresponding to the first articulation element of the sequence (A # 1, for example) is started. That is, waveform data is synthesized based on template data such as the waveform (Timbre), amplitude (Amp), pitch (Pitch), and time (TSC) for the articulation element. Therefore, although the "synthesized waveform data" is simply indicated by a block in the figure, it actually has a waveform corresponding to the waveform (Timbre) template data, an amplitude envelope corresponding to the amplitude (Amp) template data, a pitch and temporal pitch change corresponding to the pitch (Pitch) template data, and a time length corresponding to the time (TSC) template data.
The rise of the crossfade control waveform may be such that the first articulation element waveform of the sequence immediately rises at the full level as shown. However, if it is desired to perform crossfade synthesis with the waveform at the end of the performance sound of the preceding sequence, the rising edge of the first crossfade control waveform in the sequence may have a fade-in characteristic with an appropriate slope. The gradient of the fade-in is set by the fade-in rate FIR # 1.
[0057]
Corresponding to the first articulation element A # 1 of the sequence, the connection control information includes the fade-in rate FIR # 1, the next channel start point information NCSP # 1, the fade-out start point information FOSP # 1, and the fade-out rate FOR # 1. The next channel start point information NCSP # 1 indicates the point at which waveform generation of the next articulation element (for example, B # 1) is started. The fade-out start point information FOSP # 1 indicates the point at which fade-out of the element's own waveform is started. As shown in the figure, the cross-fade control waveform stays flat at the full level until the fade-out start point, and after that point its level gradually falls with a slope according to the set fade-out rate FOR # 1. When the rule data RULE corresponding to the element A # 1 instructs direct connection without cross-fade connection, the information NCSP # 1 and FOSP # 1 may both indicate the end of the synthesized articulation element waveform. When, however, the corresponding rule data RULE instructs cross-fade connection, NCSP # 1 and FOSP # 1 each indicate an appropriately set point before the end of the articulation element waveform, as shown in the figure. Therefore, it can be considered that these pieces of information NCSP # 1, FOSP # 1, FIR # 1, and FOR # 1 are included in the rule data RULE for the element A # 1. Note that these pieces of connection control information are provided for each articulation element.
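As an illustration of how a cross-fade control waveform could be derived from the fade-in rate FIR, the fade-out start point FOSP, and the fade-out rate FOR, consider the following sketch. The function name `crossfade_control`, the unit of the rates (level change per sample), and the linear slopes are assumptions; the text does not specify the units or the curve shape.

```python
def crossfade_control(length, fade_in_rate, fade_out_start, fade_out_rate):
    """Return a list of gain levels (0.0 to 1.0), one per sample.

    fade_in_rate   -- level increase per sample (stands in for FIR)
    fade_out_start -- sample index where fade-out begins (stands in for FOSP)
    fade_out_rate  -- level decrease per sample (stands in for FOR)
    """
    levels = []
    level = 0.0
    for t in range(length):
        if t >= fade_out_start:
            level = max(0.0, level - fade_out_rate)  # falling slope after FOSP
        else:
            level = min(1.0, level + fade_in_rate)   # rising slope, then flat
        levels.append(level)
    return levels


# A fade-in rate of 1.0 makes the first element rise at the full level at once.
immediate = crossfade_control(6, 1.0, 4, 0.5)
gradual = crossfade_control(6, 0.5, 4, 0.5)
```

For direct connection, `fade_out_start` would simply be set to the end of the element so that no falling slope is produced.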
[0058]
When the generation process of the element waveform A # 1 in the first waveform generation channel shown in FIG. 10A reaches the point indicated by the next channel start point information NCSP # 1, the next channel start trigger NCS # 1 is given to the second waveform generation channel shown in FIG. 10B, and the generation of the partial sound waveform corresponding to the second articulation element B # 1 is started in the second waveform generation channel. Further, the cross-fade control waveform corresponding to the articulation element B # 1 fades in (rises gradually) at the gradient set by the corresponding fade-in rate FIR # 2. Thus, the fade-out period of the preceding articulation element A # 1 and the fade-in period of the following articulation element B # 1 overlap, and cross-fade synthesis is accomplished by adding the two.
After the waveform data of the preceding articulation element A # 1 has faded out, only the following articulation element B # 1 remains. In this way, the preceding articulation element A # 1 is cross-faded into the following articulation element B # 1, and the waveforms are smoothly connected.
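The addition of the two waveform generation channels can be sketched as below. The function name `mix_channels` and the representation of the next channel start point NCSP as a sample offset are assumptions for illustration; the gain lists play the role of the cross-fade control waveforms of each channel.

```python
def mix_channels(wave_a, gain_a, wave_b, gain_b, ncsp):
    """Add two gain-controlled element waveforms.

    wave_b starts at sample index `ncsp` on wave_a's timeline, which stands in
    for the next channel start point of the preceding element.
    """
    total = max(len(wave_a), ncsp + len(wave_b))
    out = [0.0] * total
    for i, (sample, gain) in enumerate(zip(wave_a, gain_a)):
        out[i] += sample * gain            # first waveform generation channel
    for i, (sample, gain) in enumerate(zip(wave_b, gain_b)):
        out[ncsp + i] += sample * gain     # second waveform generation channel
    return out


a = [1.0, 1.0, 1.0, 1.0]
b = [1.0, 1.0, 1.0, 1.0]
fade_out = [1.0, 1.0, 0.5, 0.0]   # control waveform of the preceding element
fade_in = [0.5, 1.0, 1.0, 1.0]    # control waveform of the following element
joined = mix_channels(a, fade_out, b, fade_in, ncsp=2)
```

Because the fade-out and fade-in slopes overlap and sum to the full level in this example, the joined signal keeps a constant level across the connection.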
[0059]
When the generation process of the element waveform B # 1 in the second waveform generation channel shown in FIG. 10B reaches the point indicated by the fade-out start point information FOSP # 2, the level of the cross-fade control waveform gradually falls, as shown in the figure, with a slope according to the set fade-out rate FOR # 2. When the generation process of the element waveform B # 1 reaches the point indicated by the next channel start point information NCSP # 2, the next channel start trigger NCS # 2 is given to the first waveform generation channel shown in FIG. 10A, and generation of the partial sound waveform corresponding to the third articulation element R # 1 is started in the first waveform generation channel. In addition, the cross-fade control waveform corresponding to the articulation element R # 1 fades in (rises gradually) at the gradient set by the corresponding fade-in rate FIR # 3. In this manner, the fade-out period of the preceding articulation element B # 1 and the fade-in period of the following articulation element R # 1 overlap, and cross-fade synthesis is accomplished by adding the two.
Hereinafter, similarly, each articulation element is connected in chronological order of the sequence while sequentially cross-fading.
[0060]
In the above example, cross-fade synthesis is performed on element waveforms synthesized based on each template. However, the invention is not limited to this: the cross-fade processing may be performed on each template data individually, and the element waveforms may then be synthesized based on the cross-faded template data. In this case, different connection rules can be applied to each template even within the same element. That is, the above connection control information (fade-in rate FIR, next channel start point NCSP, fade-out start point FOSP, fade-out rate FOR) is prepared for each template corresponding to each tone element of the element, such as the waveform (Timbre), amplitude (Amp), pitch (Pitch), and time (TSC). This is effective because each template can be cross-fade connected according to the connection rule best suited to it.
[0061]
[Edit]
FIG. 11 schematically shows an example of the data editing process. It shows an example in which editing is performed based on the data of an articulation element sequence AESEQ # x composed of an articulation element A # 1 having the attribute of an attack portion, an articulation element B # 1 having the attribute of a body portion, and an articulation element R # 1 having the attribute of a release portion. The data editing described here is, of course, implemented by any suitable means, for example by a computer executing the required editing program while the user performs desired operations with a keyboard or a mouse, observing the state of the various data shown on the display.
The base sequence AESEQ # x can be selected from a number of sequences AESEQ (for example, see FIG. 5A) stored in the articulation database ADB. Editing of articulation data is roughly divided into replacement, addition or deletion of articulation elements in a sequence, and replacement of a template in an element or creation of a new template by modifying data values of an existing template.
[0062]
The edit column of FIG. 11 shows an example in which the articulation element R # 1 of the release part in the base sequence AESEQ # x, which has a relatively gently falling amplitude envelope characteristic, is replaced with another element R # x having an amplitude envelope characteristic that falls relatively quickly. Not only replacement but also addition of a desired element (for example, addition of a body part element or of an element for a decorative sound) and deletion (if there are a plurality of body parts, any of them can be deleted) are possible. The element R # x used for replacement can be selected from a number of articulation element vectors AEVQ (see, for example, FIG. 5B) stored in the articulation database ADB. In this case, a desired element R # x to be used for replacement can be selected from an element group having the same attribute with reference to the attribute information ATR.
[0063]
Next, the template data corresponding to a desired tone element in a desired element (for example, the replaced element R # x) is replaced with other template data for that tone element. The example of FIG. 11 shows the pitch (Pitch) template of the element R # x being replaced with another pitch template Pitch ′ (for example, a pitch template having a pitch bend characteristic). As a result, the newly created release part element R # x ′ has an amplitude envelope characteristic that falls relatively quickly and a pitch bend-down characteristic. When replacing a template, the desired template (vector data) to be used for replacement can be selected, with reference to the attribute information ATR, from among the templates (vector data) of the group of elements having the same attribute in the large number of articulation element vectors AEVQ.
The new element R # x ′ created by replacing some of the templates is preferably given a new index and the required attribute information and additionally registered in the area of the articulation element vector AEVQ (see FIG. 4) of the articulation database ADB.
[0064]
It is also possible to modify the specific data content of a desired template. In this case, the specific data content of the desired template for the element being edited is read out from the template database TDB and displayed on a display or the like, and the data content is changed as appropriate by operating a keyboard, a mouse, or the like. When the desired data correction is completed, the corrected template data is newly registered in the template database TDB with a new index, and new vector data is assigned to the corrected template data. The new element containing this new vector data (for example, R # x ′) is preferably given a new index and the required attribute information and additionally registered in the area of the articulation element vector AEVQ (see FIG. 4) of the articulation database ADB.
[0065]
As described above, the data editing process of creating new sequence data by appropriately changing the contents of the base sequence AESEQ # x can be performed. New sequence data created by such a data editing process is given a new sequence number (for example, URSEQ # x) and attribute information as a user articulation element sequence URSEQ, and registered in the articulation database ADB. Thereafter, at the time of musical sound synthesis, data of the user articulation element sequence URSEQ can be read from the articulation database ADB using the sequence number URSEQ # x.
Note that the form of data editing is not limited to the example illustrated in FIG. 11, and there may be various forms. For example, a desired element may be sequentially selected from the element vector AEVQ without calling up the base sequence AESEQ, thereby creating the user sequence URSEQ.
[0066]
FIG. 12 is a flowchart showing an outline of a computer program capable of executing the data editing processing as described above.
In step S21, a desired performance style is specified. This designation may be made by directly inputting the sequence AESEQ or URSEQ number using a computer keyboard or mouse, or by inputting desired musical instrument timbre and attribute information.
In the next step S22, a search is made as to whether a sequence that matches the specified rendition style exists in the AESEQ or URSEQ in the articulation database ADB, and the corresponding sequence AESEQ or URSEQ is selected. In this case, when the sequence AESEQ or URSEQ number is directly input, the corresponding one is directly extracted. When the attribute information is input, a sequence AESEQ and / or URSEQ corresponding to the attribute information is searched. A plurality of pieces of attribute information can be input. If a plurality of pieces of attribute information are input, for example, the search may be performed using AND logic. Of course, the search is not limited to this and may be performed using OR logic. The search results are displayed on a computer display, and when a plurality of sequences AESEQ and / or URSEQ are searched, a desired one can be selected.
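The attribute search of step S22 with AND or OR logic might look like the following sketch. The function name `search_sequences`, the dictionary layout of the articulation database, and the entry `AESEQ#x` are hypothetical; only the attributes of AESEQ#6 are taken from the example given in the text.

```python
def search_sequences(database, wanted, mode="and"):
    """Return the ids of sequences whose attributes match the query.

    mode="and" requires every wanted attribute; mode="or" requires at least one.
    """
    wanted = set(wanted)
    hits = []
    for seq_id, attrs in database.items():
        common = wanted & set(attrs)
        if (mode == "and" and common == wanted) or (mode == "or" and common):
            hits.append(seq_id)
    return hits


# Illustrative database; "AESEQ#x" and its attributes are made up for the sketch.
adb = {
    "AESEQ#6": ["attack bend-up normal", "body normal", "release normal"],
    "AESEQ#x": ["attack normal", "body normal", "release normal"],
}
exact = search_sequences(
    adb, ["attack bend-up normal", "body normal", "release normal"])
loose = search_sequences(adb, ["body normal"], mode="or")
```

When several sequences are returned, as in the OR search here, the user would pick the desired one from the displayed results.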
[0067]
In step S23, the user is asked whether or not to continue the editing operation. If the content of the sequence selected or found in step S22 is as desired and there is no need for editing, the editing process ends. If the user wants to continue the editing process, YES is determined in step S23, and the process proceeds to step S24. In addition, when no sequence corresponding to the designated rendition style could be found in the search of step S22, continuation is determined as YES in step S23, and the process proceeds to step S24.
An example of a search based on attribute information will be described with reference to a case where data as shown in FIGS. 5 and 6 is stored in the articulation database ADB. For example, assume that “attack bend-up normal”, “body normal”, and “release normal” are input as attributes of the search condition of the articulation sequence. In this case, since the attribute matches the attribute of the sixth sequence AESEQ # 6 shown in FIG. 5A, the sequence AESEQ # 6 is searched and selected in step S22. If this is satisfactory, the determination is NO in step S23, and the editing process ends. If the user wants to continue the editing process, YES is determined in the step S23, and the process proceeds to the step S24.
[0068]
In step S24, if a sequence corresponding to the performance style designated in step S21 has not been selected yet, a sequence closest to it is selected. For example, it is assumed that “attack bend-up normal”, “vibrato normal”, and “release normal” are input in step S21 as the attributes of the search condition of the articulation sequence. If there are only seven types of sequence AESEQ as shown in FIG. 5A, a sequence satisfying these conditions cannot be searched, and in step S24, the closest sequence AESEQ # 6 is selected.
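The text does not say how the "closest" sequence is determined in step S24. One simple possibility, shown here purely as an assumption, is to count matching attributes and take the sequence with the highest count. The function name `closest_sequence` and the entry `AESEQ#x` are hypothetical.

```python
def closest_sequence(database, wanted):
    """Pick the sequence sharing the most attributes with the query.

    Counting shared attributes is an assumed closeness measure; ties go to the
    entry that appears first in the database.
    """
    wanted = set(wanted)
    return max(database, key=lambda seq_id: len(wanted & set(database[seq_id])))


# Illustrative database; "AESEQ#x" and its attributes are made up for the sketch.
adb = {
    "AESEQ#x": ["attack normal", "body normal", "release normal"],
    "AESEQ#6": ["attack bend-up normal", "body normal", "release normal"],
}
query = ["attack bend-up normal", "vibrato normal", "release normal"]
best = closest_sequence(adb, query)
```

With this measure the query from the example above shares two attributes with AESEQ#6 and one with the made-up entry, so AESEQ#6 is chosen.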
In step S25, vector data (an index) indicating a desired articulation element (AE) in the selected sequence is replaced with vector data (an index) indicating another articulation element. For example, in the case of the above example, the sequence AESEQ # 6 selected as the closest sequence in step S24 is composed of the three element vectors ATT-Nor, BOD-Nor, and REL-Nor (see FIG. 5 (a)), so the body part element BOD-Nor (normal body) may be replaced with a vibrato body part element. For that purpose, with reference to the articulation element vectors AEVQ (for example, FIG. 5B), the element vector data (index) of BOD-Vib-nor (body normal vibrato) is extracted and substituted for BOD-Nor.
[0069]
Addition and deletion of articulation elements are also performed in step S25 as necessary. After the replacement, addition, and / or deletion of the desired element vector data is completed, a new articulation element sequence is created (step S26).
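Steps S25 and S26 amount to list operations on the element indices of a sequence. The sketch below assumes a sequence is a Python list of element vector names; the helper names `replace_element`, `add_element`, and `delete_element` are hypothetical. Each helper returns a new list so that the base sequence stays intact.

```python
def replace_element(sequence, old, new):
    """Return a copy of the sequence with every `old` index replaced by `new`."""
    return [new if element == old else element for element in sequence]


def add_element(sequence, position, element):
    """Return a copy with `element` inserted before `position`."""
    return sequence[:position] + [element] + sequence[position:]


def delete_element(sequence, position):
    """Return a copy with the element at `position` removed."""
    return sequence[:position] + sequence[position + 1:]


base = ["ATT-Nor", "BOD-Nor", "REL-Nor"]          # element vectors of AESEQ # 6
edited = replace_element(base, "BOD-Nor", "BOD-Vib-nor")
longer = add_element(edited, 2, "BOD-Nor")        # add a second body element
shorter = delete_element(longer, 2)               # and delete it again
```

Keeping the base sequence unmodified matches the idea that the edited result is registered separately as a user sequence URSEQ.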
Since the connection of the waveforms between the elements in the newly created articulation element sequence is not guaranteed after replacement, addition, or deletion of articulation elements, the connection rule data RULE is set in the next step S27. In the next step S28, it is checked whether the set connection rule data RULE is satisfactory. If not, the process returns to step S27, and the connection rule data RULE is set again. If the set connection rule data RULE is satisfactory, the process proceeds to step S29.
In step S29, an inquiry is made as to whether to continue the editing process. If the editing process is not to be continued, the process proceeds to step S30, and the newly created articulation element sequence is registered as a user sequence URSEQ in the articulation database ADB. If the editing process is to be continued, the determination in step S29 is YES, and the process proceeds to step S24 or S31. In this case, if it is desired to return to replacement and / or addition / deletion of the articulation element, the process returns to step S24, and if it is desired to proceed to editing of template data, the process proceeds to step S31.
[0070]
In step S31, an articulation element (AE) whose template data is to be edited is selected. In the next step S32, the template vector data corresponding to the desired tone element in the selected articulation element (AE) is replaced with another template vector data relating to the tone element.
For example, suppose that "attack bend-up normal", "slightly slow vibrato", and "release normal" are designated and input in step S21 as the attributes of the search condition of the articulation sequence, and that AESEQ # 6 is selected in step S24 as the closest sequence among the sequences AESEQ shown in the figure. As described above, since the element for the body part of this sequence AESEQ # 6 is BOD-Nor (normal body), this element is replaced in step S25 with an element of the body part for vibrato, for example, BOD-Vib-nor (body normal vibrato). Then, in step S31, this element BOD-Vib-nor (body normal vibrato) is selected as the object to be edited. Then, in order to realize the desired "slightly slow vibrato", in step S32, among the template vectors of BOD-Vib-nor (body normal vibrato), the time template vector TSC-B-vib is replaced with the vector of a time template that makes the vibrato speed slightly slower (for example, TSC-B-sp2).
[0071]
Thus, a new articulation element is created in which, among the templates of BOD-Vib-nor (body normal vibrato), the time template vector is changed from TSC-B-vib to TSC-B-sp2 (step S33). Further, a new articulation element sequence is created by replacing the element for the body part of the sequence AESEQ # 6 with the newly created articulation element (step S33).
Subsequent steps S34, S35, and S36 are the same as steps S27, S28, and S29 described above. That is, because template data has been replaced, the connection of the waveforms between the elements in the newly created articulation element sequence is no longer guaranteed, so the connection rule data RULE is set again as described above.
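The template vector replacement of steps S32 and S33 can be sketched with a dictionary per element. The function name `replace_template`, the dictionary layout, the new index `BOD-Vib-sp2`, and the template names other than TSC-B-vib and TSC-B-sp2 are hypothetical and serve only to make the example self-contained.

```python
def replace_template(aevq, element_index, tone_element, new_vector, new_index):
    """Create a new element from an existing one with one template vector swapped.

    The new element is registered under `new_index`; the source element is kept.
    """
    new_element = dict(aevq[element_index])   # copy so the original stays intact
    new_element[tone_element] = new_vector
    aevq[new_index] = new_element
    return new_element


# Element vector table; all names except the TSC vectors are placeholders.
aevq = {
    "BOD-Vib-nor": {
        "Timbre": "Timb-B-vib",
        "Amp": "Amp-B-vib",
        "Pitch": "Pit-B-vib",
        "TSC": "TSC-B-vib",
    },
}
slower = replace_template(aevq, "BOD-Vib-nor", "TSC", "TSC-B-sp2", "BOD-Vib-sp2")
user_sequence = ["ATT-Nor", "BOD-Vib-sp2", "REL-Nor"]  # new sequence of step S33
```

Only the time template vector differs between the source element and the new one, which is the sense in which the vibrato is made slightly slower without touching the other tone elements.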
[0072]
In step S36, an inquiry is made as to whether to continue the editing process. If the editing process is not to be continued, the process proceeds to step S37, and the newly created articulation element (AE) is registered in the articulation database ADB as a user articulation element vector (AEVQ). If the editing process is to be continued, the determination in step S36 is YES, and the process proceeds to step S31 or S38. In this case, if it is desired to return to the replacement of the template vector, the process returns to step S31. If it is desired to proceed to the editing of the specific contents of the template data, the process proceeds to step S38.
In step S38, a template in a required articulation element (AE) whose data content is to be edited is selected. In the next step S39, the data of the selected template is read from the template database TDB, and the specific data content is changed as appropriate.
[0073]
For example, suppose that “attack bend-up normal”, “quite slow vibrato”, and “release normal” are designated and input in step S21 as the attributes of the search condition of the articulation sequence, and that AESEQ # 6 is selected in step S24 as the closest sequence among the sequences AESEQ shown in the figure. As described above, since the element for the body part of this sequence AESEQ # 6 is BOD-Nor (normal body), this element is replaced in step S25 with an element of the body part for vibrato, for example, BOD-Vib-nor (body normal vibrato). Then, in step S31, this element BOD-Vib-nor (body normal vibrato) is selected as the object to be edited. Then, in order to realize the desired “quite slow vibrato”, in step S32, among the template vectors of BOD-Vib-nor (body normal vibrato), the time template vector TSC-B-vib is replaced with the vector of the existing time template that makes the vibrato speed the slowest (for example, TSC-B-sp1).
However, if the desired “quite slow vibrato” still cannot be realized with the time template specified by the time template vector TSC-B-sp1, this time template vector TSC-B-sp1 is selected in step S38, and in step S39 its specific data content is changed so as to realize a slower vibrato. New vector data (for example, TSC-B-sp0) is also assigned to the new time template created by the change.
[0074]
Thus, new time template data and its vector data TSC-B-sp0 are created (step S40). Further, a new articulation element (AE) in which the time template vector is changed to the new vector is created, and a new articulation element sequence is created in which the element for the body part of the sequence AESEQ # 6 is replaced with this newly created articulation element (AE) (step S40).
Subsequent steps S41, S42 and S43 are the same as steps S27, S28 and S29 described above. That is, because template data has been corrected, the connection of the waveforms between the elements in the newly created articulation element sequence is no longer guaranteed, so the connection rule data RULE is set again as described above.
[0075]
In step S43, an inquiry is made as to whether to continue the editing process. If the editing process is not to be continued, the process proceeds to step S44, and the newly created template data is registered in the template database TDB. If the user wants to continue the editing process, the determination in step S43 is YES, and the process returns to step S38. After step S44, the process proceeds to step S37, and the newly created articulation element (AE) is registered in the articulation database ADB as a user articulation element vector (AEVQ). Further, the process proceeds to step S30, and the newly created articulation element sequence is registered as a user sequence URSEQ in the articulation database ADB.
The procedure of the editing process is not limited to that shown in FIG. 12, and other procedures may be used as appropriate. Also, as described above, the user sequence URSEQ may be created without calling up the base sequence AESEQ, by sequentially selecting desired elements from the element vectors AEVQ and replacing or correcting the template data in each element as appropriate. Although not particularly shown, it is preferable that a sound corresponding to the waveform of the articulation element being edited is generated at an appropriate stage of the editing process so that the user can confirm the sound by ear.
[0076]
[Explanation of partial vector]
FIG. 13 illustrates the concept of the partial vector PVQ. FIG. 13A schematically shows data obtained by analyzing a certain tone element (for example, the waveform) of an articulation element of a certain section over the whole of that section (that is, normal template data). FIG. 13B schematically shows partial template data PT1, PT2, PT3, and PT4 extracted in a distributed manner from that whole-section data. The partial template data PT1, PT2, PT3, and PT4 are stored in the template database TDB as the template data of the tone element. One template vector is assigned to this template data in the usual way (that is, in the same way as when the data of the whole section is stored as template data as it is). For example, if the template vector for this template data is "Timb-B-nor", the partial template data PT1, PT2, PT3, and PT4 share the template vector "Timb-B-nor". In this case, identification data indicating that the template vector has a partial vector PVQ is registered as data attached to the template vector "Timb-B-nor".
[0077]
The partial vector PVQ includes, for each of the partial template data PT1 to PT4, data indicating the storage position of the data in the template database TDB (corresponding, for example, to a loop start address), data indicating the width W of the data (corresponding, for example, to a loop end address), and data indicating the period LT during which the data is repeated. In the figure, for convenience, the width W and the period LT are drawn as if they were common to all the partial data PT1 to PT4, but they may be set arbitrarily for each of the data PT1 to PT4. Also, the number of partial template data PT1 to PT4 is not limited to four and is arbitrary.
Each of the partial template data PT1 to PT4 based on the partial vector PVQ is read out in a loop for its repetition period (LT), and the loops thus read out are connected to one another, whereby the data of the entire section described above can be reproduced. This reproduction process is called a decoding process. As the decoding method, each of the partial template data PT1 to PT4 may, as one example, simply be read in a loop for the repetition period LT, or, as another example, two successive loop-read waveforms may be cross-fade synthesized while being read in a loop. The latter is preferable because the connection between the loops is smoother.
[0078]
FIGS. 13 (c) and 13 (d) show examples of decoding processing by such cross-fade synthesis. (c) shows an example of a cross-fade control waveform in the first channel for cross-fade synthesis, and (d) shows an example of a cross-fade control waveform in the second channel for cross-fade synthesis. That is, the first partial template data PT1 is faded out during the period LT with the fade-out control waveform CF11 shown in (c), and at the same time the next partial template data PT2 is faded in during the period LT with the fade-in control waveform CF21 shown in (d). By adding the fade-out-controlled data PT1 and the fade-in-controlled data PT2, loop reading that cross-fades from the data PT1 to the data PT2 during the period LT is performed. Next, the data PT1 is switched to the data PT3 and its control waveform is switched to the fade-in waveform CF12, while the control waveform for the data PT2 is switched to the fade-out waveform CF22, and cross-fade synthesis is performed. Thereafter, cross-fade synthesis is performed by sequentially switching as shown in the figure. When performing cross-fade synthesis, processing is performed so that the phases and pitches of the two loop-read waveforms match appropriately.
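The cross-fading loop readout can be sketched as follows. This is an illustration under assumptions: the function name `decode_crossfade` is hypothetical, a single period `period` (in samples) is used for all partial data, the fade curves are linear, and phase and pitch matching between the two loops is taken for granted. Pairing each partial with its successor is equivalent to alternating the two channels as described.

```python
def decode_crossfade(partials, period):
    """Cross-fade from each partial loop to the next over `period` samples."""
    out = []
    for current, following in zip(partials, partials[1:]):
        for i in range(period):
            fade_in = i / period                      # rises from 0 toward 1
            a = current[i % len(current)]             # loop read, fading out
            b = following[i % len(following)]         # loop read, fading in
            out.append(a * (1.0 - fade_in) + b * fade_in)
    return out


# Three constant-valued "loops" make the gradual transitions easy to see.
pt1, pt2, pt3 = [1.0, 1.0], [0.0, 0.0], [2.0, 2.0]
decoded = decode_crossfade([pt1, pt2, pt3], period=4)
```

In the first period the output moves from PT1 toward PT2, and in the second from PT2 toward PT3, so the stored loops act as key points between which the data is interpolated.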
[0079]
FIG. 14 is a flowchart illustrating an example of the template reading process in consideration of the partial vector PVQ. Steps S13 to S14c shown here correspond to the processing of steps S13 and S14 in FIG. In step S13, vector data of each template corresponding to the specified element is read from the data group of the articulation element vector AEVQ. In step S14a, it is checked whether or not the partial vector PVQ exists based on the identification data indicating that the partial vector PVQ exists. If there is no partial vector PVQ, the procedure goes to step S14b, where each template data is read from the template database TDB. If there is a partial vector PVQ, the process proceeds to step S14c, and the above-described “decoding process” is performed based on the partial vector PVQ. Thereby, the template data of the entire section for the element is reproduced (decoded).
[0080]
When the partial vector PVQ is applied to a certain articulation element, the templates of all musical tone elements of that articulation element need not be made partial templates; only the templates of those types of musical tone elements that are suitable for loop reading may be made partial templates.
Further, the method of reproducing the template data of the entire section of the element based on the partial vector PVQ is not limited to the simple loop reading described above and may be any other appropriate method. For example, a partial template of a predetermined length corresponding to the partial vector PVQ may be extended along the time axis as necessary, or a limited number of partial templates may be combined randomly or in a predetermined sequence and arranged over the entire section of the element or over a necessary section.
[0081]
[Explanation of vibrato synthesis]
Here, several new ideas for synthesizing vibrato are described.
FIG. 15 is a diagram schematically showing an example in which waveform data of a body part having a vibrato component is compressed by applying the concept of the partial vector PVQ, together with an example of its decoding. (A) illustrates an original waveform A including vibrato. In this original waveform, not only the pitch but also the amplitude fluctuates over one cycle of vibrato. (B) illustrates a state where a plurality of waveforms a1, a2, a3, and a4 are extracted from dispersed positions in the original waveform of (a). Waveforms having different waveform shapes (tone colors) are selected as a1 to a4, and each is extracted as one or more waves such that one wavelength (one cycle of the waveform) has the same data size (number of addresses). These waveforms a1 to a4 are stored in the template database TDB as partial template data (that is, loop waveform data). They are read out by sequentially reading the waveforms a1 to a4 in a loop while performing crossfade synthesis.
[0082]
FIG. 15C shows a pitch template whose pitch changes during one vibrato cycle. In the drawing, the pitch change pattern of this pitch template starts from a high pitch, shifts to a low pitch, and finally returns to a high pitch, but the pattern is not limited to this. Other patterns may be used (for example, a pattern that starts from a low pitch, shifts to a high pitch, and returns to a low pitch, or a pattern that starts from an intermediate pitch and proceeds high pitch → low pitch → intermediate pitch).
[0083]
FIG. 15D illustrates a cross-fade control waveform for each of the waveforms a1 to a4 read out in a loop. First, the waveforms a1 and a2 are loop-read (repeatedly read) at the pitch given by the pitch template of (c); fade-out amplitude control is applied to the loop-read waveform a1 and fade-in amplitude control to the loop-read waveform a2, and the two are combined. As a result, the waveform shape cross-fades from the waveform a1 to a2, and the pitch of the cross-fade composite waveform changes according to the pitch template. Thereafter, the waveforms are switched in the same manner, and crossfade synthesis is performed between a2 and a3, then between a3 and a4, and then between a4 and a1.
[0084]
FIG. 15E shows the synthesized waveform data A'. In the waveform data A', the waveform shape changes by cross-fading smoothly from the waveform a1 through a4 in order during one cycle of vibrato, and a vibrato in which the pitch changes according to the pitch template during that cycle is imparted. By repeating the process of synthesizing the waveform data A' for one vibrato cycle as described above, waveform data over a plurality of vibrato cycles can be synthesized. In such a case, it suffices to loop the pitch template for one cycle of vibrato as shown in FIG. Therefore, the structure of the partial vector PVQ may be hierarchical. That is, it may have a hierarchical structure in which the waveforms a1 to a4 are individually read out in a loop as described above to synthesize the waveform for one cycle of vibrato, and the whole (one cycle of vibrato) is further repeated in accordance with the looping of the pitch template.
[0085]
FIG. 16 is a diagram showing another example of vibrato synthesis. In this example, a plurality of waveforms a1 to a4, b1 to b4, and c1 to c4 are extracted from dispersed positions in sections A, B, and C of an original waveform that includes vibrato over a plurality of vibrato cycles. As in the previous example, waveforms having different waveform shapes (tone colors) are selected, and each is extracted as one or more waves such that one wavelength (one cycle of the waveform) has the same data size (number of addresses). These waveforms a1 to a4, b1 to b4, and c1 to c4 are stored in the template database TDB as partial template data. As in the above example, they are basically read out by sequentially reading the waveforms in a loop while performing crossfade synthesis. The example of FIG. 16 differs in that the time positions of the waveforms a1 to a4, b1 to b4, and c1 to c4 are interchanged and the waveforms subjected to crossfade synthesis are combined arbitrarily, so that a wide variety of combinations can be obtained.
[0086]
For example, if the positions of these waveforms a1 to a4, b1 to b4, and c1 to c4 are exchanged without changing their relative time positions within one vibrato cycle, a waveform position replacement pattern such as a1 → b2 → c3 → a4 → b1 → c2 → a3 → b4 → c1 → a2 → b3 → c4 can be obtained. If vibrato synthesis by crossfade synthesis similar to that of FIG. 15 is performed in accordance with such a replacement pattern, a vibrato can be obtained whose tone color change differs from that of the vibrato obtained by crossfade synthesis according to the original waveform position pattern. The positions are exchanged without changing the relative time position within one vibrato cycle in order to prevent unnaturalness caused by the exchange.
In the case of the twelve waveforms a1 to a4, b1 to b4, and c1 to c4 shown in FIG. 16, there are 81 such combination patterns for one cycle of vibrato (3 to the fourth power), and 81 to the third power for three cycles of vibrato. The variation of waveform tone color change in vibrato therefore becomes extremely diverse. Which combination pattern to adopt may be selected at random.
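The count of 81 can be checked with a short enumeration. The sketch below assumes, for illustration, that each of the four relative time positions independently takes its waveform from section a, b, or c; the names `SECTIONS`, `POSITIONS`, and `one_cycle_patterns` are hypothetical.

```python
from itertools import product

SECTIONS = ["a", "b", "c"]   # source sections A, B, C of the original waveform
POSITIONS = [1, 2, 3, 4]     # relative time positions within one vibrato cycle


def one_cycle_patterns():
    """All ways to choose, for each position 1..4, a waveform from section a, b, or c."""
    return [
        [f"{s}{p}" for s, p in zip(choice, POSITIONS)]
        for choice in product(SECTIONS, repeat=len(POSITIONS))
    ]


patterns = one_cycle_patterns()
per_cycle = len(patterns)        # 3 ** 4 combinations for one vibrato cycle
three_cycles = per_cycle ** 3    # independent choice in each of three cycles
```

Because the relative position index is never changed, every generated pattern keeps each waveform at the time position it had within its original vibrato cycle.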
[0087]
For a waveform having a vibrato characteristic created by the method shown in FIG. 15 or FIG. 16 (for example, A' in FIG. 15E), or a waveform having a vibrato characteristic created by another method, the vibrato characteristic can be variably controlled by a pitch (Pitch) template, an amplitude (Amp) template, and a time (TSC) template. For example, the pitch (Pitch) template can control the vibrato depth, the amplitude (Amp) template can control the depth of the amplitude modulation that accompanies the vibrato, and the time (TSC) template can control the vibrato speed (that is, the vibrato period) by expanding or contracting the time length of the waveform constituting one vibrato cycle.
[0088]
For example, in FIG. 15, if the time length of each cross-fade section shown in FIG. 15D is subjected to time-axis expansion/contraction control (TSC control) in accordance with a desired time (TSC) template, without changing the tone reproduction pitch (the rate of change of the waveform read address), the time length of one vibrato cycle can be expanded or contracted, whereby the vibrato frequency can be controlled. In this case, if a TSC template corresponding to one vibrato cycle is prepared, it suffices to loop it, in the same manner as the pitch template as shown in FIG. Note that if the pitch (Pitch) template and the amplitude (Amp) template are also subjected to time-axis expansion/contraction control in conjunction with the time-axis expansion/contraction control of the waveform according to the TSC template, these musical tone elements can be expanded and contracted along the time axis in an interlocked manner.
It is also possible to variably control the tone reproduction pitch of the vibrato waveform by shifting the pitch change envelope characteristic indicated by the pitch template up and down. In this case, by not performing the time axis control of the waveform using the TSC template, it is possible to control the time length of one vibrato cycle to be constant regardless of the musical sound reproduction pitch.
[0089]
[Description of connection rule RULE]
Next, a specific example of the rule data RULE that describes how to connect the articulation elements will be described.
For each tone element, for example, there are the following connection rules.
(1) Waveform (Timbre) template connection rules
Rule 1: Direct connection. If a smooth connection between the articulation elements is guaranteed in advance, as in a preset rendition style sequence (articulation element sequence AESEQ), there is no problem in connecting them directly without interpolation.
Rule 2: Interpolation in which the end of waveform A of the preceding element is extended. This interpolation example has the form shown in FIG. 17A, in which the connection waveform C1 is synthesized by extending the end portion of the waveform A of the preceding element. The waveform B of the succeeding element is used as it is; the connection waveform C1 extended from the end of the waveform A of the preceding element is faded out while the beginning of the waveform B of the succeeding element is faded in, and cross-fade synthesis is performed. The connection waveform C1 is formed by repeating a one-period waveform or a plural-period waveform at the end of the waveform A of the preceding element for the required length.
[0090]
Rule 3: interpolation in which the leading end of waveform B of the succeeding element is extended. This interpolation example has a form as shown in FIG. 17B, in which the connection waveform C2 is synthesized by extending the leading end of the waveform B of the succeeding element. The waveform A of the preceding element is used as it is, and the end portion of the waveform A of the preceding element is faded out, and the connection waveform C2 is faded in to perform cross-fade synthesis. Also in this case, the connection waveform C2 is formed by repeating a one-period waveform or a plurality of periodic waveforms at the leading end of the waveform B of the succeeding element by a required length.
Rule 4: Interpolation in which both the end portion of waveform A of the preceding element and the leading end of waveform B of the succeeding element are extended. This interpolation example has the form shown in FIG. 17C, in which the connection waveform C1, synthesized by extending the end portion of the waveform A of the preceding element, and the connection waveform C2, synthesized by extending the leading end of the waveform B of the succeeding element, are subjected to cross-fade synthesis. In the case of Rule 4, since the time of the entire synthesized waveform is extended by the cross-fade synthesis period of C1 and C2, the time axis compression process is performed by the TSC control.
[0091]
Rule 5: As shown in FIG. 17D, a connection waveform C prepared in advance is inserted between the waveform A of the preceding element and the waveform B of the succeeding element. At this time, the end of the waveform A of the preceding element and the leading end of the waveform B of the succeeding element are partially removed by an amount corresponding to the connection waveform C. Alternatively, the connection waveform C may be inserted without removing the end portion of the waveform A of the preceding element and the leading end portion of the waveform B of the succeeding element. In that case, the time of the entire synthesized waveform is extended, so the time axis compression process is performed by the TSC control.
Rule 6: As shown in FIG. 17(e), a connection waveform C prepared in advance is inserted between the waveform A of the preceding element and the waveform B of the succeeding element. At this time, the end portion of the waveform A of the preceding element and the first half of the connection waveform C are subjected to cross-fade synthesis, and the leading end of the waveform B of the succeeding element and the second half of the connection waveform C are subjected to cross-fade synthesis. Also in this case, if the time of the entire synthesized waveform is extended or shortened, the time axis compression process is performed by the TSC control.
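As an illustration of the waveform connection rules above, the following is a minimal sketch of Rule 2 only. The function name `connect_rule2`, the linear fade, and the parameters `period` (length of the repeated end portion of A) and `fade_len` (length of the connection waveform C1) are assumptions, not details given in the text.

```python
def connect_rule2(wave_a, wave_b, period, fade_len):
    """Rule 2 sketch: extend the end of A by looping its last period, then cross-fade into B.

    Assumes period <= len(wave_a) and fade_len <= len(wave_b).
    """
    tail = wave_a[-period:]
    # Connection waveform C1: the last period of A repeated for the required length.
    c1 = [tail[i % period] for i in range(fade_len)]
    mixed = []
    for i in range(fade_len):
        g = i / fade_len                              # B fades in, C1 fades out
        mixed.append((1.0 - g) * c1[i] + g * wave_b[i])
    # A is used as it is; B continues unmodified after the cross-fade region.
    return wave_a + mixed + wave_b[fade_len:]


joined = connect_rule2([1.0] * 4, [0.0] * 6, period=2, fade_len=4)
```

In this sketch the total length equals the sum of the two input lengths; Rules 4 to 6 change the total length, which is why the text applies TSC control afterwards.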
[0092]
(2) Connection rules for other templates
Since the data of the templates other than the waveform (Timbre) template (amplitude, pitch, and time) take the simple form of an envelope waveform, a smooth connection can be realized by simpler interpolation processing, without using the complicated interpolation process based on two-channel cross-fade control waveforms. In particular, when performing interpolation synthesis of template data having an envelope waveform, it is preferable to generate the interpolation result as a difference value (with a plus or minus sign) from the original template data value. Then, the interpolation operation for smooth connection can be achieved simply by adding the difference value (with a plus or minus sign) obtained as the interpolation result to the original template data value read from the template database TDB in real time, which is very simple.
Rule 1: Direct connection. This example is shown in FIG. The level at the end of the template (envelope waveform) AE1 of the first element matches the level at the start of the template (envelope waveform) AE2-a of the second element, and the level at the end of the template (envelope waveform) AE2-a of the second element also matches the level at the start of the template (envelope waveform) AE3 of the third element, so that no interpolation is needed.
[0093]
Rule 2: Perform an interpolation process for smoothing in a local range before and after the connection point. This example is shown in FIG. Interpolation processing is performed so that the transition from AE1 to AE2-b is smooth within a predetermined range CFT1 spanning the end portion of the template (envelope waveform) AE1 of the first element and the leading end of the template (envelope waveform) AE2-b of the second element. Likewise, interpolation processing is performed so that the transition from AE2-b to AE3 is smooth within a predetermined range CFT2 spanning the end portion of the template (envelope waveform) AE2-b of the second element and the leading end of the template (envelope waveform) AE3 of the third element.
The data E1 ', E2', E3 'obtained as a result of the interpolation are assumed to be the difference values (with positive and negative signs) from the original template values (envelope values) E1, E2, E3 of each element. By doing so, as described above, the difference values E1 ', E2', E3 ', which are the interpolation results, are simply added to the original template data values E1, E2, E3 read from the template database TDB in real time. The interpolation operation for smooth connection can be achieved, which is extremely simple.
[0094]
As shown in FIGS. 19A, 19B, and 19C, there are a plurality of variations of the specific example of the interpolation processing according to Rule 2.
In the example of FIG. 19A, the intermediate level MP between the template data value EP at the end point of the preceding element AEn and the template data value SP at the start point of the succeeding element AEn + 1 is set as the target value, and in the interpolation area RCFT at the end part of the preceding element AEn, interpolation is performed so that the template data value of the preceding element AEn approaches the target value MP. As a result, the locus of the template data of the preceding element AEn changes from the original line E1 to the line E1'. Further, in the interpolation area FCFT at the leading end of the succeeding element AEn + 1, interpolation is performed so that the template data value of the succeeding element AEn + 1 starts from the above-mentioned intermediate value MP and approaches the locus of the original template data value indicated by the line E2. As a result, the locus of the template data value of the succeeding element AEn + 1 in the interpolation area FCFT gradually approaches the original locus E2 as shown by the line E2'.
[0095]
In the example of FIG. 19B, the template data value SP at the start point of the succeeding element AEn + 1 is set as the target value, and in the interpolation area RCFT at the end of the preceding element AEn, interpolation is performed so that the template data value of the preceding element AEn asymptotically approaches the target value SP. As a result, the locus of the template data of the preceding element AEn changes from the original line E1 to E1''. In this case, there is no interpolation area FCFT at the leading end of the succeeding element AEn + 1.
In the example of FIG. 19C, in the interpolation area FCFT at the leading end of the succeeding element AEn + 1, the template data value of the succeeding element AEn + 1 is started from the template data value EP at the end point of the preceding element AEn, and is indicated by a line E2. Interpolation is performed so as to asymptotically approach the original template data value locus. As a result, the locus of the template data value of the succeeding element AEn + 1 in the interpolation area FCFT gradually approaches the original locus E2 as shown by the line E2 ''. In this case, there is no interpolation area RCFT at the rear end of the preceding element AEn.
Also in FIG. 19, it is assumed that the data indicating the trajectories E1 ′, E2 ′, E1 ″, and E2 ″ obtained as a result of the interpolation is a difference value from the original template data values E1 and E2.
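The difference-value approach for the FIG. 19A variant can be sketched as follows. The function names, the linear weighting toward the target MP, and the expression of the interpolation areas RCFT and FCFT as sample counts `rcft` and `fcft` are assumptions for illustration; the text specifies only that the values approach the target.

```python
def midpoint_differences(e1, e2, rcft, fcft):
    """Return difference values (d1, d2) to add to the original envelopes e1 and e2.

    e1: template values of the preceding element, e2: of the succeeding element.
    rcft / fcft: lengths (in samples) of the interpolation areas at the end of e1
    and at the start of e2.
    """
    ep, sp = e1[-1], e2[0]
    mp = (ep + sp) / 2.0                 # intermediate level MP used as the target
    d1 = [0.0] * len(e1)
    d2 = [0.0] * len(e2)
    for i in range(rcft):
        w = (i + 1) / rcft               # weight reaches 1 at the connection point
        d1[len(e1) - rcft + i] = w * (mp - ep)
    for i in range(fcft):
        w = 1.0 - i / fcft               # weight starts at 1 and decays toward 0
        d2[i] = w * (mp - sp)
    return d1, d2


def apply_differences(template, diff):
    """Real-time step: add the signed difference value to each original template value."""
    return [t + d for t, d in zip(template, diff)]


d1, d2 = midpoint_differences([4.0] * 4, [2.0] * 4, rcft=2, fcft=2)
e1_smoothed = apply_differences([4.0] * 4, d1)
e2_smoothed = apply_differences([2.0] * 4, d2)
```

Outside the interpolation areas the difference values are zero, so the original template values pass through unchanged.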
[0096]
Rule 3: Perform interpolation processing for smoothing over the entire section of the element. This example is shown in FIG. The template (envelope waveform) AE1 of the first element and the template (envelope waveform) AE3 of the third element are not changed; instead, the data of the template (envelope waveform) AE2-b of the second element in the middle is interpolated so that its leading end matches the end of the template (envelope waveform) AE1 of the first element and its end matches the beginning of the template (envelope waveform) AE3 of the third element. Also in this case, the data E2′ obtained as a result of the interpolation is assumed to consist of difference values (with a plus or minus sign) from the original template value (envelope value) E2.
As shown in FIGS. 20A, 20B, and 20C, there are a plurality of variations of the specific example of the interpolation processing of Rule 3.
FIG. 20A shows an example in which interpolation is performed using only the intermediate element AEn. E1 indicates the original trajectory of the template data value of the element AEn. The trajectory E1 of the template data value of the element AEn is shifted according to the difference between the template data value EP0 at the end point of the preceding element AEn-1 and the template data value SP at the original start point of the intermediate element AEn, and template data consisting of the trajectory Ea is created for the entire section of the element AEn. Similarly, the trajectory E1 is shifted according to the difference between the template data value EP at the original end point of the intermediate element AEn and the template data value SP1 at the start point of the succeeding element AEn + 1, and template data consisting of the trajectory Eb is created for the entire section of the element AEn. Next, the template data of the trajectory Ea and the template data of the trajectory Eb are cross-fade-interpolated so as to change smoothly from Ea to Eb, and interpolated template data consisting of the trajectory E1′ is obtained for the entire section of the element AEn.
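The FIG. 20A procedure can be sketched as below. The function name `full_section_interpolation` and the linear cross-fade weight are assumptions for illustration; the argument names follow the symbols in the text (EP0 for the end value of the preceding element, SP1 for the start value of the succeeding element).

```python
def full_section_interpolation(e1, ep0, sp1):
    """Sketch of FIG. 20A: interpolate over the whole middle element.

    e1: original trajectory of the middle element AEn.
    Returns the interpolated trajectory and its difference from e1.
    """
    n = len(e1)
    ea = [v + (ep0 - e1[0]) for v in e1]     # E1 shifted so its start meets EP0
    eb = [v + (sp1 - e1[-1]) for v in e1]    # E1 shifted so its end meets SP1
    out = []
    for i in range(n):
        g = i / (n - 1) if n > 1 else 0.0    # cross-fade weight from Ea (0) to Eb (1)
        out.append((1.0 - g) * ea[i] + g * eb[i])
    diff = [o - v for o, v in zip(out, e1)]  # signed difference values to store
    return out, diff


trajectory, diff = full_section_interpolation([1.0, 2.0, 3.0], ep0=2.0, sp1=5.0)
```

The result starts exactly at EP0 and ends exactly at SP1 while keeping the overall shape of the original trajectory.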
[0097]
FIG. 20B shows an example in which data is changed over the entire section of the intermediate element AEn and interpolation is performed in a predetermined area RCFT at the end of the intermediate element AEn and a predetermined area FCFT at the leading end of the succeeding element AEn + 1.
First, as above, the trajectory E1 of the template data value of the element AEn is shifted according to the difference between the template data value EP0 at the end point of the preceding element AEn-1 and the template data value SP at the original start point of the intermediate element AEn, and template data consisting of the trajectory Ea is created for the entire section of the element AEn.
[0098]
Next, in a predetermined area RCFT at the end portion of the preceding element AEn, the intermediate level MPa between the template data value EPa at the end point of the trajectory Ea and the template data value SP at the start point of the succeeding element AEn + 1 is set as the target value, and interpolation is performed so that the template data value of the trajectory Ea of the preceding element AEn approaches the target value MPa. As a result, the locus Ea of the template data of the preceding element AEn changes from its original locus as shown by Ea'. Further, in a predetermined area FCFT at the leading end of the succeeding element AEn + 1, interpolation is performed so that the template data value of the succeeding element AEn + 1 starts from the above-mentioned intermediate value MPa and approaches the locus of the original template data value indicated by the line E2. As a result, the locus of the template data value of the succeeding element AEn + 1 in the interpolation area FCFT gradually approaches the original locus E2 as shown by the line E2'.
[0099]
FIG. 20C shows an example in which data is changed over the entire section of the intermediate element AEn, interpolation is performed in a predetermined area RCFT at the end of the preceding element AEn-1 and a predetermined area FCFT at the leading end of the intermediate element AEn, and interpolation is also performed in a predetermined area RCFT at the end of the intermediate element AEn and a predetermined area FCFT at the leading end of the succeeding element AEn + 1.
First, the original trajectory E1 of the template data value of the intermediate element AEn is shifted by an appropriate offset amount OFST, and template data including the trajectory Ec is created corresponding to all the sections of the element AEn.
[0100]
Next, in a predetermined region RCFT at the end portion of the preceding element AEn-1 and a predetermined region FCFT at the front end portion of the intermediate element AEn, interpolation processing is performed so that the trajectories E0 and Ec of both template data are smoothly connected. Trajectories E0 ′ and Ec ′ as interpolation results are obtained in the interpolation area. Further, in the predetermined area RCFT at the end of the intermediate element AEn and the predetermined area FCFT at the front end of the succeeding element AEn + 1, an interpolation process is performed so that the trajectories Ec and E2 of both template data are smoothly connected. Trajectories Ec ″ and E2 ″ are obtained in the interpolation area.
Also in FIG. 20, the data indicating the trajectories E1′, Ea, Ea′, E2′, Ec, Ec′, Ec″, and E0′ obtained as a result of the interpolation shall consist of difference values from the corresponding original template data values E1, E2, and E0.
[0101]
[Conceptual explanation of tone synthesis processing including connection processing]
FIG. 21 is a block diagram conceptually illustrating the configuration of a musical sound synthesizer that performs the above-described connection processing on the template data corresponding to each musical sound element and performs musical sound synthesis based on the connected template data.
The template data supply blocks TB1, TB2, TB3, and TB4 respectively supply the waveform template data Timb-Tn, amplitude template data Amp-Tn, pitch template data Pit-Tn, and time template data TSC-Tn for the preceding articulation element, together with the waveform template data Timb-Tn + 1, amplitude template data Amp-Tn + 1, pitch template data Pit-Tn + 1, and time template data TSC-Tn + 1 for the following articulation element.
[0102]
In the rule decoding process blocks RB1, RB2, RB3, and RB4, the connection rules TimRULE, AmpRULE, PITRULE, and TSCRULE for each musical tone element of the articulation element are decoded, and the connection processing described with reference to FIGS. 17 to 20 is executed according to the decoded connection rules. For example, the rule decoding process block RB1 for the waveform template performs processing for executing the connection process (direct connection or cross-fade interpolation) described with reference to FIG.
[0103]
In addition, the rule decoding process block RB2 for the amplitude template performs processing for executing the connection process (direct connection or interpolation) described with reference to FIGS. In this case, the interpolation result is given as a difference value (with a plus or minus sign) as described above, so the interpolation data consisting of the difference value output from the block RB2 is added in the adder AD2 to the original template data value supplied from the template data supply block TB2. For the same reason, adders AD3 and AD4 are provided for adding the outputs of the other rule decoding process blocks RB3 and RB4 to the original template data values supplied from the template data supply blocks TB3 and TB4, respectively.
[0104]
In this way, template data Amp, Pitch, and TSC that have undergone the necessary connection processing between adjacent elements are output from the adders AD2, AD3, and AD4, respectively. The pitch control block CB3 controls the waveform reading speed according to the pitch template data Pitch. Since the waveform template itself contains the original pitch information, the original pitch information (original pitch envelope) is received from the database via the line L1, and the waveform reading speed is controlled based on the deviation between the original pitch envelope and the pitch template data Pitch. For example, when the original pitch envelope and the pitch template data Pitch are the same, the waveform may be read at a constant reading speed. When they differ, the waveform reading speed need only be variably controlled by the amount of the deviation. Further, the pitch control block CB3 receives note instruction data and controls the waveform reading speed based on the note instruction data. For example, assuming that the original pitch of the waveform template data is based on the pitch of the note C4 and that the sound of the note D4 is also generated from the waveform template data having the original pitch of the note C4, the waveform reading speed is controlled according to the difference between the note D4 and the original-pitch note C4. Such pitch control is not described in detail here because a known technique can be applied.
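The read-speed control described above can be sketched as a simple ratio calculation. The text does not give a formula, so everything here is an assumption for illustration: the function name `read_rate`, the use of MIDI-style note numbers (C4 = 60, D4 = 62), twelve-tone equal temperament, and pitch values expressed in cents.

```python
def read_rate(note, original_note, pitch_cents=0.0, original_cents=0.0):
    """Waveform read-speed ratio (1.0 means reading at the recorded rate).

    note / original_note: MIDI-style note numbers (assumed representation).
    pitch_cents / original_cents: the pitch template value and the original
    pitch envelope value at the current instant, in cents (assumed unit).
    Only the deviation between the two pitch values changes the speed.
    """
    semitones = (note - original_note) + (pitch_cents - original_cents) / 100.0
    return 2.0 ** (semitones / 12.0)       # equal-temperament frequency ratio


rate_same = read_rate(60, 60, pitch_cents=50.0, original_cents=50.0)
rate_d4_from_c4 = read_rate(62, 60)
```

When the pitch template equals the original pitch envelope and the requested note equals the original note, the ratio is 1.0, which corresponds to the constant-speed case in the text.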
[0105]
Basically, the waveform access control block CB1 sequentially reads out each sample of the waveform template data according to the waveform readout speed control information output from the pitch control block CB3. At this time, the waveform reading mode is controlled according to the TSC control information given as time template data: the pitch of the generated sound is determined by the waveform reading speed control information given from the pitch control block CB3, while the total waveform reading time is variably controlled according to the TSC control information. For example, to make the sounding time longer than the time length of the original waveform data, the waveform reading speed is kept as it is and some waveform portions are read out repeatedly, so that the sounding time is extended while the desired pitch is maintained. Likewise, to make the sounding time shorter than the time length of the original waveform data, the waveform reading speed is kept as it is and some waveform portions are skipped, so that the sounding time is compressed while the desired pitch is maintained.
The waveform access control block CB1 and the crossfade control block CB2 perform processing for executing the connection process (direct connection or crossfade interpolation) described with reference to FIG. 17 according to the output of the waveform template rule decoding block RB1. The crossfade control block CB2 is also used when performing crossfade processing while loop-reading a partial waveform template according to the partial vector PVQ. It is also used for smoothing the waveform connection during the TSC control.
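The idea of changing the sounding time without changing the pitch can be sketched by repeating or skipping whole waveform periods. This is a simplified illustration under stated assumptions: the function name `tsc_read`, the fixed `period` length, and the nearest-period selection rule are hypothetical, and the cross-fade smoothing at the joins mentioned in the text is omitted.

```python
def tsc_read(wave, period, ratio):
    """Stretch (ratio > 1) or compress (ratio < 1) a waveform by whole periods.

    Samples inside each period are read at the original speed, so the pitch is
    unchanged; only the number of periods that are read changes.
    """
    n_periods = len(wave) // period
    n_out = max(1, round(n_periods * ratio))
    out = []
    for k in range(n_out):
        src = min(n_periods - 1, int(k / ratio))   # source period: repeated or skipped
        out.extend(wave[src * period:(src + 1) * period])
    return out


wave = [0, 1, 2, 3, 4, 5, 6, 7]                     # four "periods" of two samples each
stretched = tsc_read(wave, period=2, ratio=2.0)
compressed = tsc_read(wave, period=2, ratio=0.5)
```

A practical implementation would cross-fade across the repeated or skipped boundaries, which is the role the text assigns to the crossfade control block CB2 during TSC control.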
[0106]
The amplitude control block CB4 gives the generated waveform data an amplitude envelope according to the amplitude template Amp. Also in this case, since the waveform template itself contains the original amplitude envelope information, the original amplitude envelope information is received from the database via the line L2, and the amplitude of the waveform data is controlled based on the deviation between the original amplitude envelope and the amplitude template data Amp. For example, when the original amplitude envelope and the amplitude template data Amp are the same, the amplitude control block CB4 only needs to pass the waveform data without performing substantial amplitude control. If the original amplitude envelope differs from the amplitude template data Amp, the amplitude level need only be variably controlled by the amount of the deviation.
[0107]
[Specific example of musical sound synthesizer]
FIG. 22 is a block diagram showing an example of a hardware configuration of the musical sound synthesizer according to the embodiment of the present invention. The musical sound synthesizer may take any product application form, such as an electronic musical instrument, a karaoke device, an electronic game device, other multimedia equipment, or a personal computer.
According to the configuration shown in FIG. 22, the tone synthesis processing according to the embodiment of the present invention is executed using a software sound source. A software system is constructed so as to realize the tone data creation and tone synthesis processing according to the present invention, and the required database DB is either constructed in an attached memory device or constructed externally (on a host) and accessed via a communication line.
[0108]
In the tone synthesizer of FIG. 22, a CPU (central processing unit) 10 is used as the main control unit. Under the control of the CPU 10, the software program for realizing tone data creation and tone synthesis processing according to the present invention and the program of the software sound source are executed. Of course, the CPU 10 can also execute other appropriate programs in parallel.
Connected to the CPU 10 via a data and address bus 22 are a ROM (read only memory) 11, a RAM (random access memory) 12, a hard disk device 13, a first removable disk device (for example, a CD-ROM drive or MO drive) 14, a second removable disk device (for example, a floppy disk drive) 15, a display 16, an input operation device 17 such as a keyboard and a mouse, a waveform interface 18, a timer 19, a network interface 20, a MIDI interface 21, and the like.
[0109]
FIG. 23 shows a detailed example of the waveform interface 18 and a configuration example of the waveform buffers in the RAM 12. The waveform interface 18 controls both the acquisition (sampling) and the output of waveform data. It includes an analog/digital converter (ADC) 23 that samples waveform data input from outside, for example through a microphone, and converts it from analog to digital; a first DMAC (direct memory access controller) 24 for sampling; a sampling clock generating circuit 25 that generates a sampling clock Fs of a predetermined frequency; a second DMAC (direct memory access controller) 26 that controls the output of waveform data; and a digital/analog converter (DAC) 27 that converts the output waveform data from digital to analog. The second DMAC 26 also has the function of creating absolute time information based on the sampling clock Fs and supplying it to the bus 22 of the CPU.
[0110]
The RAM 12 has a plurality of waveform buffers W-BUF. One waveform buffer W-BUF has the storage capacity (number of addresses) needed to store waveform sample data for one frame. For example, if the reproduction sampling frequency based on the sampling clock Fs is 48 kHz and one frame section lasts 10 milliseconds, one waveform buffer W-BUF has the capacity to store 480 samples of waveform sample data. At least two waveform buffers W-BUF (A, B) are used: while one waveform buffer W-BUF is in read mode and is accessed by the DMAC 26 of the waveform interface 18, the other waveform buffer W-BUF is in write mode and receives the generated waveform sample data. In the tone synthesis program according to the present embodiment, waveform sample data consisting of a plurality of samples for one frame is generated collectively for each tone synthesis channel, and the waveform sample data of each channel is added (accumulated) at each sample position (address position) of the one waveform buffer W-BUF that is in write mode. For example, assuming that one frame is composed of 480 samples, the 480 samples of waveform sample data for the first tone synthesis channel are computed collectively and stored at the respective sample positions (address positions) of the waveform buffer W-BUF. Next, the 480 samples of waveform sample data for the second tone synthesis channel are computed collectively and added (accumulated) at the respective sample positions (address positions) of the same waveform buffer W-BUF. The same applies to the remaining channels. Therefore, when the generation of one frame of waveform sample data has been completed for all the channels, each sample position (address position) of the one waveform buffer W-BUF in write mode holds the total waveform sample data obtained by accumulating, sample by sample, the waveform sample data of all the channels.
For example, the total waveform sample data for one frame is first written to the waveform buffer W-BUF of A, and the total waveform sample data for the next frame is then written to the waveform buffer W-BUF of B. As soon as writing is completed, the waveform buffer W-BUF of A shifts to read mode from the beginning of the next frame section and is read regularly during that frame section at the predetermined reproduction sampling cycle based on the sampling clock Fs. Basically, therefore, the two waveform buffers W-BUF (A, B) may be used while being switched alternately between read mode and write mode. However, if a margin is to be provided so that writing can run several frames ahead, three or more waveform buffers W-BUF (A, B, C, ...) may be used.
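The frame-size arithmetic and the alternating use of two buffers can be sketched as follows. This is an illustrative sketch only: the class and function names are hypothetical, plain Python lists stand in for the RAM buffers, and the DMAC read-out is reduced to returning the finished buffer.

```python
def frame_size(sample_rate_hz, frame_ms):
    """Number of samples in one frame, e.g. 48 kHz x 10 ms = 480."""
    return sample_rate_hz * frame_ms // 1000


class PingPongBuffers:
    """Two frame buffers (A, B): one is written while the other is read."""

    def __init__(self, size):
        self.buffers = [[0.0] * size, [0.0] * size]
        self.write_index = 0  # buffer A is written first

    def mix_frame(self, channel_frames):
        """Clear the write buffer, then accumulate every channel into it."""
        buf = self.buffers[self.write_index]
        for n in range(len(buf)):
            buf[n] = 0.0
        for frame in channel_frames:
            for n, sample in enumerate(frame):
                buf[n] += sample  # add at the same sample position

    def swap(self):
        """Switch roles: return the finished buffer, write the other next."""
        ready = self.buffers[self.write_index]
        self.write_index ^= 1
        return ready
```

For example, `frame_size(48000, 10)` gives the 480-sample capacity mentioned in the text, and calling `mix_frame` followed by `swap` once per frame alternates between buffers A and B.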
[0111]
The software program for realizing musical tone data creation and musical tone synthesis processing according to the present invention, which is executed under the control of the CPU 10, may be stored in any of the ROM 11, the RAM 12, the hard disk device 13, and the removable disk devices 14 and 15. Alternatively, the apparatus may be connected to a communication network via the network interface 20, and the above-mentioned "program for realizing tone data creation and tone synthesis processing according to the present invention", the data of the database DB, and the like may be received from an external server computer (not shown) and stored in the internal RAM 12, the hard disk device 13, the removable disk devices 14 and 15, or the like. The CPU 10 executes the "program for realizing tone data creation and tone synthesis processing according to the present invention" stored, for example, in the RAM 12, synthesizes tones according to a rendition style sequence, and temporarily stores the synthesized tone waveform data in the waveform buffer W-BUF in the RAM 12. Under the control of the DMAC 26, the waveform data is read from the waveform buffer W-BUF in the RAM 12, sent to the digital/analog converter (DAC) 27, and D/A converted. The D/A-converted tone waveform data is supplied to a sound system (not shown) and emitted as sound.
[0112]
The following description assumes that, as shown in FIG. 8A, the data of the rendition style sequence (articulation element sequence AESEQ) according to the present invention is incorporated in automatic performance sequence data composed of MIDI data. Although not shown in detail in FIG. 8A, the data of the rendition style sequence (articulation element sequence AESEQ) can be incorporated in MIDI format, for example as MIDI exclusive data.
[0113]
FIG. 24 is a time chart showing an outline of the tone generation process executed by the software sound source based on performance data in MIDI format. The "performance timing" shown in (a) illustrates the timing of occurrence of events #1 to #4, such as MIDI note-on events, note-off events, or other events (EVENT (MIDI) in FIG. 8A) and articulation element sequence events (EVENT (AESEQ) in FIG. 8A). (b) illustrates the relationship between the timing at which waveform sample data is generated by computation ("waveform generation") and the timing at which it is reproduced ("waveform reproduction"). The upper "waveform generation" row shows the timing of the process in which waveform sample data consisting of a plurality of samples for one frame is generated collectively for each tone synthesis channel and the waveform sample data of each channel is added (accumulated) at each sample position (address position) of the one waveform buffer W-BUF that is in write mode. The lower "waveform reproduction" row shows the timing of the process in which the waveform sample data is read regularly from the waveform buffer W-BUF during one frame section at the predetermined reproduction sampling cycle based on the sampling clock Fs. The letters A and B attached to each row identify the waveform buffer W-BUF being written or read. FR1, FR2, FR3, ... are frame numbers assigned for convenience. For example, the waveform sample data for one frame computed during frame FR1 is written to the waveform buffer W-BUF of A and is read from the waveform buffer W-BUF of A during the next frame FR2. The waveform sample data for the next frame is computed during frame FR2 and written to the waveform buffer W-BUF of B.
The waveform sample data for one frame stored in the waveform buffer W-BUF of B is read from it during the next frame FR3. Events #1, #2, and #3 shown in (a) occur within one frame period, and the waveform sample data corresponding to these events is computed during frame FR3 in (b). The tones corresponding to events #1, #2, and #3 therefore begin to sound in the following frame FR4. Δt denotes the difference between the timing at which events #1, #2, and #3 occur as MIDI performance data and the timing at which the corresponding tones begin to sound. Because this time lag Δt amounts to only one to several frames, it causes no audible problem. Note that the waveform sample data at the start of sounding is not written from the beginning of the waveform buffer W-BUF but from the intermediate position in the waveform buffer W-BUF that corresponds to the start time.
[0114]
In "waveform generation", the method of computing waveform sample data differs between an automatic performance tone based on a normal MIDI note-on event (referred to as a "normal performance" tone) and a performance tone based on an on event of an articulation element sequence AESEQ (hereinafter referred to as a "performance style performance" tone). The "normal performance" processing based on normal MIDI note-on events and the "performance style performance" processing based on on events of the articulation element sequence AESEQ are executed by separate processing routines, as shown in FIGS. 29 and 30. It is effective to use the two selectively, for example by playing the accompaniment part as a "normal performance" based on normal MIDI note-on events and playing a specific solo part as a "performance style performance" based on the articulation element sequence AESEQ.
[0115]
FIG. 25 is a time chart showing an outline of the "performance technique performance" process (articulation element tone synthesis process) based on performance technique sequence (articulation element sequence AESEQ) data according to the present invention. The "phrase preparation command" and the "phrase start command" are included in MIDI performance data as "articulation element sequence events EVENT (AESEQ)", as shown in FIG. That is, the event data of one articulation element sequence AESEQ (called a "phrase" in FIG. 25) consists of a "phrase preparation command" and a "phrase start command". The "phrase preparation command", which is the preceding event data, specifies the articulation element sequence AESEQ (that is, the phrase) to be reproduced and instructs that preparations for its reproduction be made; it is given a predetermined time before the sounding start time of that articulation element sequence AESEQ. In the "preparation process" shown as block 30, in response to the "phrase preparation command", all the data necessary to reproduce the specified articulation element sequence AESEQ is retrieved from the database DB and loaded into a buffer area, and the necessary preparations are made so that the articulation element sequence AESEQ is expanded and can be reproduced immediately. In the "preparation process", the specified articulation element sequence AESEQ is also interpreted, the rules for connecting successive articulation elements are set or determined, and the necessary connection control data and the like are formed. For example, assuming that the specified articulation element sequence AESEQ consists of five articulation elements AE#1 to AE#5 as shown in the figure, the connection rule at each connection point (the points labeled connection 1 to connection 4) is determined, and connection control data for it is formed.
In addition, data indicating the start time of each of the articulation elements AE#1 to AE#5 is prepared as a time relative to the start of the phrase. The "phrase start command", which is the event data following the "phrase preparation command", instructs the start of sounding of the articulation element sequence AESEQ. In response to the "phrase start command", the articulation elements AE#1 to AE#5 prepared in the "preparation process" are reproduced in order. That is, when the start time of each of the articulation elements AE#1 to AE#5 arrives, reproduction of that articulation element is started, and at each connection point (connection 1 to connection 4) the predetermined connection processing is performed according to the connection control data prepared in advance so that the element is connected smoothly to the preceding articulation element AE#1 to AE#4.
[0116]
FIG. 26 is a flowchart showing a main routine of a tone synthesis process executed by the CPU 10 of FIG. By the "automatic performance process" of the main routine, a process based on the event of the automatic performance sequence data is performed. First, in step S50, various necessary initialization processing such as securing various buffer areas on the RAM 12 is performed. Next, in step S51, it is checked whether or not each of the following activation factors has occurred.
Activation factor (1): MIDI performance data or other communication input data has been input via the interfaces 20 and 21.
Activation factor (2): The automatic performance processing timing has arrived. This timing occurs regularly so that the time of occurrence of the next event in the automatic performance can be checked.
Activation factor (3): The timing for generating a waveform in units of one frame has arrived. This timing occurs once per frame period (for example, at the end of a frame section) so that waveform sample data can be generated collectively in units of one frame.
Activation factor (4): A switch operation of the keyboard or mouse (other than the main routine end instruction operation) has been performed on the input operation device 17.
Activation factor (5): An interrupt request has been received from the disk drives 13 to 15 or the display 16.
Activation factor (6): The main routine end instruction operation has been performed on the input operation device 17.
[0117]
In step S52, it is determined whether any of the activation factors (1) to (6) has occurred. If NO, steps S51 and S52 are repeated, and if YES, it is determined in step S53 which activation factor has occurred. If the activation factor (1) has occurred, a predetermined “communication input process” is performed in step S54. If the activation factor (2) occurs, a predetermined "automatic performance process" (an example of which is shown in FIG. 27) is performed in step S55. If the activation factor (3) has occurred, a predetermined "sound source process" (an example of which is shown in FIG. 28) is performed in step S56. When the activation factor (4) occurs, a predetermined “SW process” (process corresponding to the operated switch) is performed in step S57. When the activation factor (5) occurs, a predetermined "other process" (process in response to the interrupt request) is performed in step S58. If the activation factor (6) has occurred, a predetermined "end processing" (processing for ending this main routine) is performed in step S59.
[0118]
If it is determined in step S53 that two or more of the activation factors (1) to (6) have occurred simultaneously, they are processed in a predetermined priority order (for example, in the order of activation factors (1), (2), (3), (4), (5), (6)). Some activation factors may be given equal priority. Steps S51 to S53 represent, in simplified form, the task management of a pseudo-multitasking process. In practice, while the process for one activation factor is being executed, another process may be executed by interruption in response to the occurrence of an activation factor with a higher priority (for example, while the "sound source process" is being executed in response to activation factor (3), the "automatic performance process" may be executed by interruption in response to the occurrence of activation factor (2)).
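The selection among simultaneously pending activation factors can be sketched as a simple priority lookup. The table and function names below are hypothetical, the ordering 1 to 6 is only the example priority given in the text, and interruption of a running process by a higher-priority factor is not modeled.

```python
# Process names follow steps S54 to S59 of the main routine.
HANDLERS = {
    1: "communication input process",    # step S54
    2: "automatic performance process",  # step S55
    3: "sound source process",           # step S56
    4: "SW process",                     # step S57
    5: "other process",                  # step S58
    6: "end processing",                 # step S59
}

# Example priority order from the text; other orders (or ties) are possible.
PRIORITY = [1, 2, 3, 4, 5, 6]


def dispatch_order(pending):
    """Return the processes for the pending factors, highest priority first."""
    return [HANDLERS[factor] for factor in PRIORITY if factor in pending]
```

For instance, when factors (2) and (3) are pending at the same time, `dispatch_order({3, 2})` lists the automatic performance process before the sound source process.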
[0119]
A specific example of the "automatic performance process" (step S55) will be described with reference to FIG. First, in step S60, a process of comparing the absolute time information given from the DMAC 26 (FIG. 23) with the next event timing of the music data is performed. As shown in FIG. 8, in the music data, that is, the automatic performance data, duration data DUR exists prior to the event data EVENT. For example, when the duration data DUR is read, the absolute time information at that time and the duration data DUR are added to create and store the absolute time information indicating the arrival time of the next event. Then, the absolute time information indicating the arrival time of the next event is compared with the absolute time information at the present time in step S60 in FIG.
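The timing check built on the duration data DUR can be sketched as below. The class name is hypothetical, times are assumed to be integer counts in the same unit as the absolute time information (the unit is not stated here), and an event is assumed to be due once the current time reaches or passes the stored arrival time.

```python
class EventClock:
    """Tracks when the next event of the automatic performance is due."""

    def __init__(self):
        self.next_event_time = None  # nothing scheduled yet

    def schedule(self, now, duration):
        """On reading duration data DUR, store the next event's arrival time."""
        self.next_event_time = now + duration
        return self.next_event_time

    def is_due(self, now):
        """True once the current absolute time reaches or passes that time."""
        return self.next_event_time is not None and now >= self.next_event_time
```

A caller would invoke `is_due` each time the automatic performance processing timing arrives and return immediately while it is still false.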
[0120]
In step S61, it is determined whether the current absolute time has reached or passed the arrival time of the next event. If the next event has not yet arrived, the processing in FIG. 27 ends immediately. When the next event has arrived, the process goes to step S62, where it is determined whether the event is a normal performance event (that is, a normal MIDI event) or a performance style event (that is, an articulation element sequence event). If it is a normal performance event, the process goes to step S63, where normal MIDI event processing corresponding to the event is performed to generate sound source control data. In the next step S64, the tone synthesis channel (abbreviated as "sound source channel" in the figure) relating to the event is detected, and its channel number is registered in the channel number register i. For example, in the case of a note-on event, the channel to which the sounding of the note is to be assigned is determined and registered in the register i. In the case of a note-off event, the channel to which the sounding of the note has been assigned is detected and registered in the register i. In the next step S65, the sound source control data generated in step S63 and the control timing data are stored in the tone buffer TBUF(i) of the channel number designated by the register i. The control timing is the timing at which the control relating to the event is to be performed, such as the sounding start timing for a note-on event or the release start timing for a note-off event. In this embodiment, because the tone waveform is generated by software processing, the timing at which a MIDI event occurs and the timing at which the corresponding processing is actually performed differ slightly. The actual control timing is therefore specified anew at this point.
[0121]
If it is determined in step S62 that the event is a performance technique event, the process proceeds to step S66, where it is checked whether the event is a "phrase preparation command" or a "phrase start command" (see FIG. 25). If it is a "phrase preparation command", the routine of steps S67 to S71 is executed. This routine corresponds to the "preparation process" shown as block 30 in FIG. First, in step S67, the tone synthesis channel (abbreviated as "sound source channel" in the figure) that is to reproduce the phrase (that is, the articulation element sequence AESEQ) is determined, and its channel number is registered in the register i. In the next step S68, the performance style sequence (abbreviated as "performance style SEQ" in the figure) of the phrase (that is, the articulation element sequence AESEQ) is expanded. In other words, the articulation element sequence AESEQ is decomposed and analyzed down to the level of vector data that can designate individual templates, the connection rules at the connection points (connections 1 to 4) between the articulation elements (AE#1 to AE#5 in FIG. 25) are determined, and connection control data for them is formed. In step S69, it is checked whether there is a subsequence (abbreviated as "subSEQ" in the figure). If there is, the process returns to step S68, and the subsequence is further decomposed down to the level of vector data that can designate individual templates.
[0122]
FIG. 32 shows an example in which the articulation element sequence AESEQ includes a subsequence. As shown in FIG. 32, the articulation element sequence AESEQ may have a hierarchical structure. Suppose, in the figure, that "performance style SEQ#2" is designated by the data of the articulation element sequence AESEQ incorporated in the MIDI performance information. The designated "performance style SEQ#2" is specified by "performance style SEQ#6" and "element vector E-VEC#5". Here "performance style SEQ#6" is a subsequence. Analyzing this subsequence shows that "performance style SEQ#6" is specified by the element vectors E-VEC#2 and E-VEC#3. In this way, the "performance style SEQ#2" designated by the data of the articulation element sequence AESEQ incorporated in the MIDI performance information is expanded and is found to be specified by the element vectors E-VEC#2, E-VEC#3, and E-VEC#5. As described above, connection control data for connecting the articulation elements is also formed at this time as needed. The element vector E-VEC is data that concretely designates an individual articulation element. Of course, the structure need not be hierarchical: in some cases the "performance style SEQ#2" designated by the data of the articulation element sequence AESEQ incorporated in the MIDI performance information directly specifies the element vectors E-VEC#2, E-VEC#3, and E-VEC#5 from the outset.
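The expansion of a hierarchical sequence into element vectors can be sketched as a short recursion. The dictionary layout and the function name are assumptions made for illustration; the table contents mirror the SEQ#2 / SEQ#6 example in the text, and the formation of connection control data is omitted.

```python
# Hypothetical sequence table mirroring the example in the text:
# SEQ#2 consists of the subsequence SEQ#6 followed by E-VEC#5.
SEQUENCES = {
    "SEQ#2": ["SEQ#6", "E-VEC#5"],
    "SEQ#6": ["E-VEC#2", "E-VEC#3"],
}


def expand(name, table):
    """Recursively replace subsequences until only element vectors remain."""
    if name not in table:  # an element vector: nothing left to expand
        return [name]
    vectors = []
    for item in table[name]:
        vectors.extend(expand(item, table))
    return vectors
```

Expanding "SEQ#2" with this table yields E-VEC#2, E-VEC#3, and E-VEC#5 in that order, as in the text. A flat sequence that lists element vectors directly passes through the same function without recursion.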
[0123]
In step S70, the data of each expanded element vector (abbreviated as "E-VEC" in the figure) is stored, together with data indicating its control timing as a relative time, in the tone buffer TBUF(i) of the channel number designated by the register i. In this case, the control timing is the start timing of each articulation element as shown in FIG. In the next step S71, the necessary template data is loaded from the database DB into the RAM 12 with reference to the tone buffer TBUF(i).
If the current event is a "phrase start command" (see FIG. 25), the routine of steps S72 to S74 is executed. In step S72, the channel assigned to reproduce the phrase performance is detected, and its channel number is registered in the register i. In the next step S73, all the control timing data stored in the tone buffer TBUF(i) of the channel number designated by the register i is converted into absolute time. That is, the absolute time information supplied by the DMAC 26 when the "phrase start command" occurs is taken as an initial value, and this initial value is added to the relative time of each item of control timing data, which converts each item into absolute time. In the next step S74, the contents of the tone buffer TBUF(i) are rewritten according to the converted absolute time of each control timing. That is, the start time and end time of each element vector E-VEC constituting the rendition style sequence, the connection control data between the element vectors, and the like are written into the tone buffer TBUF(i).
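The conversion from relative to absolute control timing in step S73 amounts to adding the phrase start time to each stored value. The sketch below assumes, for illustration only, that each entry is a (relative_time, label) pair counted in samples; the function name and data layout are hypothetical.

```python
def to_absolute(control_timings, start_time):
    """Add the phrase start time to every relative control timing."""
    return [(start_time + rel, label) for rel, label in control_timings]


# Example: start timings of two elements, relative to the phrase start.
relative = [(0, "AE#1"), (2400, "AE#2")]
absolute = to_absolute(relative, 96000)
```

Here `absolute` holds the same entries shifted by 96000, the absolute time assumed to be reported when the "phrase start command" occurs.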
[0124]
Next, a specific example of the "sound source process" (step S56 in FIG. 26) will be described with reference to FIG. As described above, the "sound source process" is activated once per frame. First, in step S75, predetermined preparations for waveform generation are made. For example, the contents of the waveform buffer W-BUF that was reproduced and read in the previous frame section are cleared so that data can be written to it in the current frame section. In the next step S76, it is checked whether there is any channel for which sound generation processing is to be performed. If there is none, no further processing is needed and the process jumps to step S83. If there is, the process goes to step S77, where one of the channels to be subjected to sound generation processing is designated and preparations are made to generate waveform sample data for that channel. In the next step S78, it is determined whether the tone assigned to that channel is a "normal performance" tone or a "performance style performance" tone. If it is a "normal performance" tone, the process proceeds to step S79, where one frame of waveform sample data is generated for the channel as a "normal performance" tone. If it is a "performance style performance" tone, the process proceeds to step S80, where one frame of waveform sample data is generated for the channel as a "performance style performance" tone. Next, in step S81, it is checked whether any channels to be subjected to sound generation processing remain unprocessed. If so, the process goes to step S82, where the channel to be processed next is designated from among the remaining (unprocessed) channels and preparations are made to generate waveform sample data for that channel.
The process then returns to step S78, and steps S78 to S80 are executed for the new channel in the same way. When steps S78 to S80 have been completed for all the channels to be subjected to sound generation processing, no unprocessed channel remains, so step S81 yields NO and the process proceeds to step S83. At this point, one frame of waveform sample data has been generated for every channel to be sounded, and the data has been added (accumulated) sample by sample and stored in the waveform buffer W-BUF. In step S83, the data in the waveform buffer W-BUF is handed over to the control of the waveform input/output (I/O) driver. As a result, in the next frame section the waveform buffer W-BUF enters read mode, is accessed by the DMAC 26, and its waveform sample data is reproduced and read at the regular sampling cycle determined by the predetermined sampling clock Fs.
[0125]
FIG. 29 shows a detailed example of the process in step S79 in FIG. It is a flowchart showing an example of the "one-frame waveform data generation process" for a "normal performance", in which normal tone synthesis based on MIDI performance data is performed. In this process, one sample of waveform data is generated each time the loop of steps S90 to S98 is executed once. An address pointer indicating which sample of the frame is currently being processed is therefore managed, although this is not described in detail. First, in step S90, it is checked whether a control timing has arrived. This control timing is the timing specified anew in step S65 in FIG. 27, for example a sounding start timing or a release start timing (sounding end timing). If a control timing falls within the frame currently being processed, step S90 yields YES at the address pointer value corresponding to that control timing, and the process goes to step S91, where the necessary processing, such as starting waveform generation, is performed based on the sound source control data. If the current address pointer value does not correspond to a control timing, step S91 is skipped and the process goes to step S92. In step S92, a low-frequency signal (LFO) needed for vibrato or the like is formed. In the next step S93, an envelope signal (EG) for pitch control is formed.
[0126]
In the next step S94, based on the sound source control data, waveform sample data of a predetermined timbre is read from a waveform memory (not shown) for "normal performance" tones at a rate corresponding to the designated tone pitch, and the values of the read waveform sample data are interpolated between samples. Commonly known waveform memory reading and inter-sample interpolation techniques may be used here as appropriate. The tone pitch designated here is obtained by variably controlling the regular pitch of the note associated with the note-on event with the vibrato signal and the pitch control envelope value formed in the preceding steps S92 and S93. In the next step S95, an amplitude envelope (EG) is formed. In the next step S96, the volume level of the one sample of waveform data generated in step S94 is variably controlled by the amplitude envelope value formed in step S95, and the result is added to the waveform sample data already stored at the address position of the waveform buffer W-BUF indicated by the current address pointer. That is, it is accumulated onto the waveform sample data of the other channels for the same sample point. Next, in step S97, it is determined whether the processing for one frame has been completed. If not, the process proceeds to step S98, where preparations are made for the next sample (the address pointer is advanced by one).
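The per-sample loop for a "normal performance" tone can be sketched as follows. This is a simplified illustration under several assumptions: the waveform memory is a short looped table, inter-sample interpolation is linear, vibrato is a sine LFO applied to the read rate, the pitch envelope of step S93 is omitted, and the amplitude envelope is supplied as a precomputed list. All names and default values are hypothetical.

```python
import math


def read_interpolated(table, phase):
    """Linearly interpolate a looped wavetable at a fractional position."""
    i = int(phase) % len(table)
    j = (i + 1) % len(table)
    frac = phase - int(phase)
    return table[i] + (table[j] - table[i]) * frac


def generate_frame(table, buffer, phase, step, amp_env,
                   vibrato_depth=0.0, vibrato_hz=5.0, sample_rate=48000,
                   start_sample=0):
    """Accumulate one channel's frame into `buffer`; return the next phase."""
    for n in range(len(buffer)):
        # Step S92: vibrato LFO (the pitch envelope of step S93 is omitted).
        t = (start_sample + n) / sample_rate
        lfo = math.sin(2.0 * math.pi * vibrato_hz * t)
        rate = step * (1.0 + vibrato_depth * lfo)
        # Step S94: read the waveform memory with inter-sample interpolation.
        sample = read_interpolated(table, phase)
        # Steps S95/S96: apply the amplitude envelope, then accumulate onto
        # whatever other channels already wrote at this sample position.
        buffer[n] += sample * amp_env[n]
        # Step S98: advance to the next sample.
        phase = (phase + rate) % len(table)
    return phase
```

Because the function adds into `buffer` rather than overwriting it, calling it once per channel on the same buffer produces the accumulated total for the frame.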
[0127]
With the above configuration, when sound generation is started in the middle of a frame, waveform sample data is stored from an intermediate address of the waveform buffer W-BUF corresponding to the sound generation start position. Of course, when sound generation is continued throughout one frame period, waveform sample data is stored in all addresses of the waveform buffer W-BUF.
Note that the envelope formation in steps S93 and S95 may be performed by reading an envelope waveform memory or by computing a predetermined envelope function. As the envelope function, the well-known method of computing a relatively simple first-order linear function may be used. Unlike the "performance technique performance" described later, the "normal performance" does not need complicated processing such as replacing the waveform during sounding, replacing an envelope, or controlling the time-axis expansion and contraction of the waveform.
[0128]
FIG. 30 shows a detailed example of the process of step S80 in FIG. It is a flowchart showing an example of the "one-frame waveform data generation process" for a "performance technique performance", in which tone synthesis based on articulation (performance technique) sequence data is performed. In the process of FIG. 30, the tone waveform processing for the articulation elements based on the respective template data, the connection processing between element waveforms, and so on are executed in the manner already described. As in FIG. 29, one sample of waveform data is generated each time the loop of steps S100 to S108 is executed once. An address pointer indicating which sample of the frame is currently being processed is therefore managed, although this is not described in detail. In the process of FIG. 30, two series of the various template data (including waveform templates) are cross-fade synthesized in order to connect adjacent articulation elements smoothly, and two series of waveform sample data are cross-fade synthesized for time-axis expansion and contraction control. For a single sample point, therefore, the various data processing is carried out on two series for cross-fade synthesis.
[0129]
First, in step S100, it is checked whether a control timing has arrived. This control timing is the timing written in step S74 in FIG. 27, for example the start timing of each of the articulation elements AE#1 to AE#5 or the start timing of the connection processing. If a control timing falls within the frame currently being processed, step S100 yields YES at the address pointer value corresponding to that control timing, and the process goes to step S101, where the necessary control is performed based on the element vector E-VEC, the connection control data, and the like corresponding to that control timing. If the current address pointer value does not correspond to a control timing, step S101 is skipped and the process goes to step S102.
[0130]
In step S102, a process of generating a time template (abbreviated as TMP in the figure) for the specific element specified by the element vector E-VEC is performed. The time template is the time template (TSC template) shown in FIG. In this embodiment, the time template (TSC template) is given as envelope-like data that changes over time, like the amplitude template and the pitch template. Therefore, in step S102, processing for forming the envelope of the time template is performed.
In step S103, a process of generating a pitch template for a specific element specified by the element vector E-VEC is performed. The pitch template is also given as time-varying envelope-shaped data as illustrated in FIG.
In step S105, a process of generating an amplitude (Amp) template for a specific element specified by the element vector E-VEC is performed. The amplitude template is also given as envelope-like data that changes over time as illustrated in FIG.
[0131]
The envelope formation in each of steps S102, S103, and S105 may be performed by reading an envelope waveform memory or by calculating a predetermined envelope function, as described above. As the envelope function, a relatively simple first-order (linear) function may be used. Also, as described with reference to FIGS. 18 to 20, templates are formed in two series (the template of the preceding element and the template of the succeeding element) at a predetermined element connection location, and processing for connecting the two by cross-fade synthesis according to the connection control data, as well as offset processing, is also performed in steps S102, S103, and S105. Which connection rule is used for the connection process depends on the corresponding connection control data.
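As a rough illustration of the envelope formation and connection described above, the following Python sketch evaluates an envelope from breakpoints with a first-order (linear) function and joins a preceding and a succeeding template by cross-fade. The function names, the breakpoint representation, and the linear cross-fade weights are assumptions made for illustration only; the specification does not prescribe them, and offset processing is not shown.

```python
def linear_envelope(breakpoints, length):
    """Evaluate a piecewise-linear envelope at sample indices 0..length-1.

    breakpoints: list of (sample_index, value) pairs with strictly
    increasing indices (a hypothetical representation of template data).
    """
    out = []
    seg = 0
    for n in range(length):
        # Advance to the segment that contains sample n.
        while seg < len(breakpoints) - 2 and n >= breakpoints[seg + 1][0]:
            seg += 1
        (x0, y0), (x1, y1) = breakpoints[seg], breakpoints[seg + 1]
        t = (n - x0) / (x1 - x0)
        t = min(max(t, 0.0), 1.0)  # hold the end values outside the breakpoints
        out.append(y0 + (y1 - y0) * t)
    return out


def crossfade_connect(preceding, succeeding, fade_len):
    """Join two envelopes, cross-fading the last fade_len samples of the
    preceding series into the first fade_len samples of the succeeding one.
    fade_len is assumed to be at least 1."""
    head = preceding[:-fade_len]
    tail = succeeding[fade_len:]
    mixed = []
    for i in range(fade_len):
        w = (i + 1) / (fade_len + 1)  # weight of the succeeding series
        a = preceding[len(preceding) - fade_len + i]
        b = succeeding[i]
        mixed.append((1.0 - w) * a + w * b)
    return head + mixed + tail
```

Which series are connected, and over how many samples, would in practice be decided by the connection control data; here those choices are simply passed in as arguments.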
[0132]
In step S104, basically, a process of reading a waveform (Timbre) template for the specific element specified by the element vector E-VEC is performed at a rate corresponding to the specified tone pitch. The tone pitch specified here is variably controlled by the pitch template (pitch control envelope value) formed in the preceding step S103. Also in step S104, control for extending or compressing the time over which the waveform sample data exists along the time axis, that is, TSC control, is performed in accordance with the time template (TSC template), independently of the tone pitch. So that the continuity of the waveform is not impaired by the time axis expansion / contraction control, the waveform sample data is read out in two series (two sets of waveform sample data corresponding to different time points in the same waveform template) and the two are cross-fade synthesized; this is also performed in step S104. As in the case of "normal performance", interpolation between waveform samples is also performed in step S104. Further, as described with reference to FIG. 17, the waveform templates are read out in two series corresponding to a predetermined element connection location (the waveform template of the preceding element and the waveform template of the succeeding element), and the connection process of cross-fade synthesizing the two is also performed in step S104. Further, as described with reference to FIGS. 13 to 16, the process of loop-reading (repeatedly reading) the waveform template and the process of cross-fade synthesizing the two series of loop-read waveforms are also performed in step S104.
If the waveform (Timbre) template to be used retains the temporal pitch fluctuation component of the original waveform as it is, the value of the pitch template should preferably be given as an amount of change (difference value or ratio) relative to the original pitch fluctuation. That is, when the original temporal pitch fluctuation is to be kept as it is, the value of the pitch template is held at a constant value (for example, “1”).
[0133]
In the next step S105, a process of forming an amplitude template is performed. In the next step S106, the volume level of the one sample of waveform data generated in step S104 is variably controlled by the amplitude envelope value formed in step S105, and the result is added to the waveform sample data already stored at the address location in the waveform buffer W-BUF indicated by the current address pointer. That is, it is added to and accumulated with the waveform sample data of other channels for the same sample point. Next, in step S107, it is determined whether or not processing for one frame has been completed. If not, the process goes to step S108 to prepare for the next sample (the address pointer is advanced by one).
As with the pitch template described above, when the waveform (Timbre) template to be used retains the temporal amplitude fluctuation component of the original waveform as it is, the value of the amplitude (Amp) template should be given as an amount of change (difference value or ratio) relative to the original amplitude fluctuation. That is, when the original temporal amplitude fluctuation is to be left as it is, the value of the amplitude template is held at a constant value (for example, “1”).
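The per-sample loop of steps S100 to S108 can be pictured with the following minimal Python sketch for a single channel. It assumes that the pitch and amplitude templates are given as ratios (so that “1” leaves the original waveform unchanged), that the waveform template wraps around at its end, and that linear interpolation is used between samples. All names, including `generate_frame` and `freq_number`, are illustrative and are not taken from the specification; TSC control and element connection are omitted.

```python
def generate_frame(waveform, phase, freq_number, pitch_env, amp_env, w_buf):
    """Generate one frame for one channel and accumulate it into w_buf.

    waveform    : samples of the waveform (Timbre) template
    phase       : fractional read position at the start of the frame
    freq_number : read-address increment per sample for the nominal pitch
    pitch_env   : per-sample pitch template values, as ratios (1.0 = unchanged)
    amp_env     : per-sample amplitude template values, as ratios
    w_buf       : frame buffer shared by all channels (modified in place)
    Returns the read position to use at the start of the next frame.
    """
    size = len(waveform)
    for n in range(len(w_buf)):              # one pass generates one sample
        i = int(phase)
        frac = phase - i
        s0 = waveform[i % size]
        s1 = waveform[(i + 1) % size]
        sample = s0 + (s1 - s0) * frac       # interpolation between samples
        w_buf[n] += sample * amp_env[n]      # amplitude control and accumulation
        phase += freq_number * pitch_env[n]  # pitch template modulates the read rate
    return phase
```

Because the result is added to the existing contents of `w_buf`, calling the function once per channel on the same buffer accumulates all channels for the same sample points.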
[0134]
Next, an example of time axis expansion / contraction control (TSC control) will be described.
Consider high-quality waveform data, that is, waveform data that has a specific articulation characteristic, is composed of a plurality of waveform periods, and consists of a fixed amount of data (number of samples or number of addresses). To control its time length on the time axis arbitrarily, independently of the tone reproduction pitch and without degrading the overall characteristics of the waveform, the time axis expansion / contraction control (TSC control) proposed by the present applicant in another application (for example, Japanese Patent Application No. 9-130394) can be used. The point of this TSC control is that, for a multi-period waveform consisting of a fixed amount of waveform data, the time length over which the waveform data exists on the time axis is expanded or compressed while a constant reproduction sampling frequency and a predetermined reproduction pitch are maintained. For compression, an appropriate portion of the waveform data is skipped during readout; for expansion, an appropriate portion of the waveform data is read out repeatedly. Cross-fade synthesis is performed to remove the discontinuity in the waveform data caused by the skipping or the repeated reading.
[0135]
FIG. 31 is a diagram conceptually showing the outline of the time axis expansion / contraction processing (TSC control). (a) shows an example of a time template that changes with time. The time template is composed of data indicating a time axis expansion / contraction ratio (referred to as CRate). The vertical axis represents the data CRate, and the horizontal axis represents time t. The time axis expansion / contraction ratio data CRate is a ratio based on “1”: “1” indicates that the time axis is neither expanded nor compressed, a value larger than “1” indicates compression of the time axis, and a value smaller than “1” indicates expansion of the time axis. (b) to (d) of FIG. 31 show examples of performing time axis expansion / contraction control according to the ratio data CRate using the virtual read address VAD and the real read address RAD. The solid line indicates the real read address RAD, and the broken line indicates the virtual read address VAD. (b) shows an example of time axis compression control according to the ratio data CRate (> 1) at point P1 in the time template of (a). (c) shows an example in which the time axis is neither expanded nor compressed, according to the ratio data CRate (= 1) at point P2 in the time template of (a). (d) shows an example of time axis expansion control according to the ratio data CRate (< 1) at point P3 in the time template of (a). In (c), the solid line indicates the progress of the original waveform read address according to the pitch information, and the real read address RAD matches the virtual read address VAD.
[0136]
The real read address RAD is the address used for actually reading waveform sample data from the waveform template, and it changes at a constant rate according to the desired pitch information. For example, by regularly accumulating a frequency number corresponding to the desired pitch, a real read address RAD having a constant slope corresponding to that pitch is obtained. The virtual read address VAD assumes that the length of the waveform data on the time axis is expanded or compressed as desired, and indicates the address position from which waveform sample data should currently be read in order to achieve that expansion or compression. For this purpose, using the desired pitch information and the time axis expansion / contraction ratio data CRate, address data whose slope is the slope according to the pitch information corrected by the ratio data CRate is generated as the virtual read address VAD. The real read address RAD is compared with the virtual read address VAD, and when the separation of RAD from VAD exceeds a predetermined width, an instruction is given to switch the value of RAD. In accordance with this instruction, the value of the real read address RAD is shifted by an appropriate number of addresses so as to eliminate its separation from the virtual read address VAD.
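The relationship between the two addresses can be sketched in Python as follows. This is only an illustration under stated assumptions: the virtual read address is advanced by the frequency number multiplied by CRate (one plausible reading of "corrected by the ratio data"), the switching threshold `max_gap` and the shift amount `shift` are fixed values chosen by the caller, and the cross-fade at the switching point is omitted. None of these names appear in the specification.

```python
def tsc_addresses(num_out, freq_number, crate, max_gap, shift):
    """Return the real read address RAD for each output sample.

    freq_number : address increment per sample for the desired pitch
    crate       : time axis expansion / contraction ratio
                  (1 = unchanged, > 1 = compression, < 1 = expansion)
    max_gap     : separation between RAD and VAD that triggers a switch
    shift       : number of addresses RAD is moved by at a switch
                  (assumed fixed, e.g. a whole number of waveform periods)
    """
    rad = 0.0
    vad = 0.0
    out = []
    for _ in range(num_out):
        out.append(rad)
        rad += freq_number            # slope fixed by the pitch alone
        vad += freq_number * crate    # slope corrected by CRate (assumption)
        if vad - rad > max_gap:       # RAD lags behind VAD: jump forward
            rad += shift
        elif rad - vad > max_gap:     # RAD runs ahead of VAD: jump back
            rad -= shift
    return out
```

Between switches the address always advances by `freq_number`, so the reproduced pitch is unchanged; only the jumps make the overall progression follow the virtual read address.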
[0137]
FIG. 33 is an enlarged view showing the same state as FIG. 31 (b). The dashed line illustrates the original address progress according to the pitch information, and corresponds to the solid line in FIG. The thick broken line illustrates the address progress of the virtual read address VAD. If the ratio data CRate is 1, the address progress of the virtual read address VAD matches the original address progress of the dashed line, and the time axis is unchanged. When the time axis is compressed, the ratio data CRate takes an appropriate value greater than 1, and the slope of the address progress of the virtual read address VAD becomes relatively large, as shown in the figure. The thick solid line illustrates the address progress of the real read address RAD. The slope of the address progress of the real read address RAD matches the slope of the original address progress according to the pitch information indicated by the dashed line. In this case, since the slope of the virtual read address VAD is relatively large, the real read address RAD gradually falls behind the virtual read address VAD as time passes. When the separation exceeds a predetermined width, a switching instruction (indicated by an arrow in the figure) is issued, and, as shown in the figure, the real read address RAD is shifted by an appropriate amount in the direction that eliminates the separation. Accordingly, the address progress of the real read address RAD follows that of the virtual read address VAD while maintaining the slope according to the pitch information, and exhibits a characteristic compressed in the time axis direction.
Therefore, by reading the waveform sample data of the waveform template in accordance with such a real read address RAD, a waveform signal compressed in the time axis direction can be obtained without changing the pitch of the tone to be reproduced.
[0138]
FIG. 34 is an enlarged view showing the same state as FIG. 31 (d). In this case, the ratio data CRate is less than 1, and the slope of the address progress of the virtual read address VAD, indicated by the thick broken line, is relatively small. The real read address RAD therefore gradually runs ahead of the virtual read address VAD as time passes, and when the separation exceeds a predetermined width, a switching instruction (indicated by an arrow in the figure) is issued. As shown in the figure, the real read address RAD is then shifted by an appropriate amount in the direction that eliminates the separation. As a result, the address progress of the real read address RAD follows that of the virtual read address VAD while maintaining the slope according to the pitch information, and exhibits a characteristic expanded in the time axis direction. Therefore, by reading the waveform sample data of the waveform template in accordance with such a real read address RAD, a waveform signal expanded in the time axis direction can be obtained without changing the pitch of the tone to be reproduced.
[0139]
The shift of the real read address RAD in the direction that eliminates the separation is preferably chosen so that the waveform data read immediately before the shift and the waveform data read immediately after the shift connect smoothly. In addition, as shown by the dashed line in the drawing, it is preferable to perform cross-fade synthesis over an appropriate period at the time of switching. The dashed line indicates the address progress of the real read address RAD2 of the cross-fade sub-series. As shown in the drawing, when the switching instruction is issued, the sub-series real read address RAD2 is generated as an extension of the address progress of the real read address RAD before the shift, at the same rate (that is, slope) as the real read address RAD. During an appropriate cross-fade period, cross-fade synthesis is performed so that the output transitions smoothly from the waveform read in accordance with the sub-series real read address RAD2 to the waveform read in accordance with the main-series real read address RAD. In this example, the sub-series real read address RAD2 need only be generated during the required cross-fade period.
The TSC control is not limited to the above example, in which cross-fade synthesis is performed only partially; TSC control in which cross-fade synthesis in a mode corresponding to the value of the time axis expansion / contraction ratio CRate is always performed may also be adopted.
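The partial cross-fade at a switching point can be sketched as follows. This Python fragment is an assumption-laden illustration, not the method of the specification: the switching time `switch_at`, the shift amount, the cross-fade length, and the linear fade weights are all supplied by the caller, and the waveform is assumed to be long enough that every address read stays in range.

```python
def read_with_switch(waveform, start, freq_number, switch_at, shift,
                     fade_len, num_out):
    """Read num_out samples. At output index switch_at the main address RAD
    jumps by `shift`; the pre-shift progression continues as RAD2 for
    fade_len samples while the output cross-fades from RAD2 to RAD."""
    def sample(addr):
        # Linear interpolation between adjacent waveform samples.
        i = int(addr)
        frac = addr - i
        return waveform[i] + (waveform[i + 1] - waveform[i]) * frac

    out = []
    rad = float(start)
    rad2 = None
    fade_pos = 0
    for n in range(num_out):
        if n == switch_at:
            rad2 = rad        # sub-series extends the old address progress
            rad += shift      # main series jumps toward the virtual address
            fade_pos = 0
        main = sample(rad)
        if rad2 is not None and fade_pos < fade_len:
            w = (fade_pos + 1) / (fade_len + 1)   # weight of the main series
            out.append((1.0 - w) * sample(rad2) + w * main)
            rad2 += freq_number                   # same slope as RAD
            fade_pos += 1
        else:
            out.append(main)
        rad += freq_number
    return out
```

A negative `shift` gives the expansion case; in either case the sub-series address is needed only while `fade_pos` is below `fade_len`.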
[0140]
In a case where waveform sample data is generated by repeatedly reading out a waveform template (that is, a loop waveform) of the partial vector PVQ as shown in FIGS. 13 to 15, the time length of the entire loop-read waveform can basically be variably controlled relatively easily, independently of the tone reproduction pitch, by varying the number of loops. That is, when a specific cross-fade curve is specified by the data specifying the cross-fade section length, the cross-fade section length (time length or number of loops) is determined accordingly. Here, the slope of the cross-fade curve is variably controlled by the time axis expansion / contraction ratio indicated by the time template, so that the speed of the cross-fade is variably controlled and, as a result, the time length of the cross-fade section is variably controlled. Since the tone reproduction pitch is not affected in the meantime, the number of loops is in effect what is variably controlled, and the time length of the cross-fade section is variably controlled thereby.
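For loop-read waveforms, the idea that the cross-fade speed, and hence the number of loops, sets the duration can be sketched as below. The sketch assumes two loop waveforms of equal length and matching phase, a linear cross-fade curve, and a repetition count obtained by dividing a nominal count by CRate; these choices and the names `crossfade_loops` and `base_loops` are illustrative assumptions.

```python
def crossfade_loops(loop_a, loop_b, base_loops, crate):
    """Cross-fade from loop_a to loop_b over a number of loop repetitions.

    The number of repetitions is base_loops / crate (at least 1), so
    crate > 1 shortens the cross-fade section and crate < 1 lengthens it.
    Each repetition is read at the same rate, so the pitch is unchanged.
    """
    n_loops = max(1, round(base_loops / crate))
    period = len(loop_a)
    total = n_loops * period
    out = []
    for n in range(total):
        w = n / total                 # cross-fade weight rises from 0 toward 1
        k = n % period                # position within the loop
        out.append((1.0 - w) * loop_a[k] + w * loop_b[k])
    return out
```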
[0141]
When the time over which the reproduced waveform data exists on the time axis is controlled by the time axis expansion / contraction control, it is desirable to expand or compress the time axes of the pitch template and the amplitude template accordingly. Therefore, in steps S103 and S105 in FIG. 30, the time axes of the pitch template and the amplitude template created in those steps are expanded or compressed according to the time template created in step S102.
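One way to picture this is to read the pitch or amplitude envelope with a position that advances by CRate per output sample, so that the envelope is compressed or expanded together with the waveform. The following Python sketch assumes that interpretation, along with linear interpolation and per-sample CRate values; the function name is hypothetical.

```python
def follow_time_template(envelope, crate_env):
    """Read `envelope` with a position that advances by CRate per output sample.

    envelope  : pitch or amplitude template on the original time scale
    crate_env : per-output-sample CRate values from the time template
    Output stops when the end of the envelope or of crate_env is reached.
    """
    out = []
    pos = 0.0
    for crate in crate_env:
        i = int(pos)
        if i >= len(envelope) - 1:
            break
        frac = pos - i
        out.append(envelope[i] + (envelope[i + 1] - envelope[i]) * frac)
        pos += crate   # > 1 passes through the envelope faster, < 1 slower
    return out
```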
[0142]
It should be noted that the tone synthesis functions need not all be implemented by a software tone generator; a hybrid of software and hardware tone generators may be used. The tone synthesis processing according to the present invention may also be performed by a hardware tone generator alone, or by using a DSP (Digital Signal Processor). Whether a software tone generator, a hardware tone generator, or a hybrid is used, the waveform forming method is not limited to the simple PCM waveform memory reading method; as described above, any appropriate method may be used, such as a method using various data compression techniques or a method based on parameter calculation according to various waveform synthesis algorithms.
[0143]
[Effects of the Invention]
As described above, the present invention concerns data editing of a tone database comprising a first database unit and a second database unit. The first database unit stores, for each of a plurality of performance phrases accompanied by musical articulation, an articulation element sequence in which the one or more sounds constituting the performance phrase are divided into a plurality of partial time sections and the articulation element for each partial time section is sequentially indicated. The second database unit stores template data representing partial sound waveforms corresponding to various articulation elements. In editing this database, a desired playing style is designated, and an articulation element sequence corresponding to the designated playing style is searched for in the first database unit. Thus, when editing tone data, the user can, by designating the desired playing style, search for the corresponding articulation element sequence and find out whether or not the desired playing style is available in the tone database, which is an excellent effect. For example, if the desired playing style is available in the tone database, it can be read out and its contents checked, and if it differs from what is desired, the contents can be modified as appropriate to create the desired tone data. On the other hand, if the desired playing style is not available in the tone database, editing work such as newly creating tone data corresponding to that playing style and registering it in the tone database can be performed. In this way, the editing of tone data can proceed in a form that matches the feel of performance, such as designating a desired playing style, which makes it easy for the user to use.
[0144]
If no articulation element sequence matching the designated playing style is found, an articulation element sequence similar to the designated playing style is selected from the first database unit, and any of the articulation elements constituting the selected sequence is changed, replaced, or deleted, or a new articulation element is added to the sequence. Thus, even if an articulation element sequence corresponding to the designated playing style is not stored in the database, its content can be freely edited starting from a similar articulation element sequence, which has the excellent effect of making it easy to create tone data corresponding to the desired playing style.
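The search and fallback flow described in the two preceding paragraphs might be sketched as follows. Everything in this Python fragment is hypothetical: the `ElementSequence` record, the representation of attribute information as a set of strings, and the similarity measure (number of shared attributes) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ElementSequence:
    name: str
    attributes: frozenset                          # attribute information of the sequence
    elements: list = field(default_factory=list)   # articulation element identifiers


def search_sequence(database, wanted):
    """Return (sequence, exact) for the requested attribute set.

    An exact match is preferred; otherwise the sequence sharing the most
    attributes is returned as the "similar" candidate. Returns
    (None, False) for an empty database.
    """
    wanted = frozenset(wanted)
    for seq in database:
        if seq.attributes == wanted:
            return seq, True
    best = max(database, key=lambda s: len(s.attributes & wanted), default=None)
    return best, False


def edit_sequence(seq, index, new_element):
    """Return a copy of seq with the element at `index` replaced."""
    elements = list(seq.elements)
    elements[index] = new_element
    return ElementSequence(seq.name + "-edited", seq.attributes, elements)
```

When the search is not exact, the returned candidate can be passed to `edit_sequence` (or to analogous helpers for deleting or adding elements) and the result registered as a new entry, leaving the stored sequence untouched.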
[0145]
Furthermore, the connection method for each template data between an edited articulation element and the articulation elements adjacent to it can be set. When the content of the template data representing the partial sound waveform of the edited articulation element is changed as a result of the editing operation, that template data may no longer connect properly with the template data of the adjacent articulation elements. By setting (that is, re-defining) the connection method for each template data, there is the excellent effect that adjacent template data can be smoothly connected.
[0146]
As described above, according to the present invention, the content of the articulation elements can be freely edited, which has the excellent effect that a high-quality tone waveform including "articulation", of a kind not previously available, can be created while allowing interactive control by the user. There is also the excellent effect that an interactive high-quality tone waveform generation technology can be provided that allows the user to freely create and edit sounds in an electronic musical instrument, a multimedia device, or the like.
[Brief description of the drawings]
FIG. 1 is a flowchart showing an example of a procedure for creating a tone database according to a tone data creating method according to the present invention.
FIG. 2 is a diagram schematically showing an example of a musical score of a series of music phrases, an example of division of a performance section corresponding to each articulation unit, and an example of analysis of musical tone elements constituting an articulation element.
FIG. 3 is a diagram showing a specific example of a plurality of musical tone elements analyzed from a waveform corresponding to one articulation element.
FIG. 4 is a diagram showing a configuration example of a database.
FIG. 5 is a diagram showing a specific example of an articulation sequence AESEQ and an articulation element vector AEVQ in the articulation database ADB of FIG. 4;
FIG. 6 is a diagram showing a specific example of an articulation element vector AEVQ including attribute information.
FIG. 7 is a flowchart showing an example of a tone synthesis procedure according to the tone data creation method according to the present invention.
FIG. 8 is a diagram showing a configuration example of automatic performance sequence data employing a tone synthesis method according to a tone data creation method according to the present invention.
FIG. 9 is a diagram showing specific examples of some rendition style sequences according to the present invention.
FIG. 10 is a diagram showing an example of connection processing by cross-fade synthesis between articulation elements in one rendition style sequence.
FIG. 11 is a diagram showing an overview of an example of editing a rendition style sequence (articulation element sequence).
FIG. 12 is a flowchart showing an example of an editing tree of a playing style sequence (articulation element sequence).
FIG. 13 is a view showing a concept of a partial vector.
FIG. 14 is a flowchart partially showing a musical sound synthesis processing procedure of an articulation element including a partial vector.
FIG. 15 is a diagram illustrating an example of a vibrato synthesis process.
FIG. 16 is a diagram showing another example of the vibrato combining process.
FIG. 17 is a diagram illustrating some rules of a connection example of a waveform template.
FIG. 18 is a diagram showing some rules of a connection processing example of template data (envelope waveform-shaped template data) other than a waveform template.
FIG. 19 is a view showing some concrete means of the connection rule shown in FIG. 18 (b).
FIG. 20 is a diagram showing some specific means of the connection rule shown in FIG. 18 (c).
FIG. 21 is a block diagram schematically showing a connection process of various template data and a tone synthesis process based on the template data.
FIG. 22 is a block diagram showing an example of a hardware configuration of a musical sound synthesizer according to the embodiment of the present invention.
FIG. 23 is a block diagram showing a detailed example of a waveform interface in FIG. 22 and a configuration example of a waveform buffer in a RAM.
FIG. 24 is a time chart showing an outline of a musical sound generation process executed based on MIDI performance data.
FIG. 25 is a time chart showing an outline of a rendition style performance process (articulation element musical sound synthesis process) executed based on data of a rendition style sequence (articulation element sequence AESEQ).
FIG. 26 is a flowchart showing a main routine of a tone synthesis process executed by the CPU of FIG. 22;
FIG. 27 is a flowchart showing an example of “automatic performance processing” in FIG. 26;
FIG. 28 is a flowchart illustrating an example of “sound source processing” in FIG. 26;
FIG. 29 is a flowchart showing an example of “1 frame waveform data generation processing” for “normal performance” in FIG. 28;
FIG. 30 is a flowchart illustrating an example of “1 frame of waveform data generation processing” for “performance style performance” in FIG. 28;
FIG. 31 is a diagram conceptually showing an outline of a time axis expansion / contraction process (TSC control).
FIG. 32 is a view for explaining a hierarchical structure of a rendition style sequence.
FIG. 33 is a diagram showing an example of a temporal progress state of a waveform read address when time axis compression is performed by time axis expansion / contraction control.
FIG. 34 is a diagram showing an example of a temporal progress state of a waveform read address when the time axis is expanded by time axis expansion / contraction control.
[Explanation of symbols]
ADB Articulation Database
TDB Template Database
10 CPU
11 ROM (Read Only Memory)
12 RAM (Random Access Memory)
13 Hard disk drive
14,15 Removable disk device
16 Display
17 Input operation device such as keyboard and mouse
18 Waveform interface
19 Timer
20 Network Interface
21 MIDI Interface
22 Data and address bus

Claims

A data search method for a tone database comprising: a first database unit that stores, for each of a plurality of performance phrases accompanied by musical articulation, an articulation element sequence in which one or more sounds constituting the performance phrase are divided into a plurality of partial time sections and the articulation element for each partial time section is sequentially designated; and a second database unit that stores template data representing partial sound waveforms corresponding to various articulation elements, the first database unit further storing attribute information that is attached to each articulation element sequence and indicates the characteristics of the sequence, the method comprising:
a first step of specifying attribute information according to a desired playing style; and
a second step of searching the first database unit for an articulation element sequence corresponding to the desired playing style according to the specified attribute information.

A database unit that stores, for each of a plurality of performance phrases accompanied by musical articulation, an articulation element sequence in which one or more sounds constituting the performance phrase are divided into a plurality of partial time sections and the articulation element for each partial time section is sequentially designated, and that also stores attribute information attached to each articulation element sequence and indicating its characteristics;
first means for specifying attribute information according to a desired playing style; and
second means for searching the database unit for an articulation element sequence corresponding to the desired playing style according to the specified attribute information.

A computer-readable storage medium storing a program that is executed by a computer to search for data in a database unit, the database unit storing, for each of a plurality of performance phrases accompanied by musical articulation, an articulation element sequence in which one or more sounds constituting the performance phrase are divided into a plurality of partial time sections and the articulation element for each partial time section is sequentially designated, and further storing attribute information attached to each articulation element sequence and indicating its characteristics, wherein the program causes the computer to execute:
a first step of receiving input of attribute information corresponding to a desired playing style; and
a second step of searching the database unit for an articulation element sequence corresponding to the desired playing style according to the input attribute information.