JP3979623B2 - Music synthesis system

Info

Publication number: JP3979623B2
Application number: JP2001188471A
Authority: JP
Inventors: Julius O. Smith III
Original assignee: Leland Stanford Junior University
Current assignee: Leland Stanford Junior University
Priority date: 1994-09-01
Filing date: 2001-06-21
Publication date: 2007-09-19
Anticipated expiration: 2015-09-01
Also published as: JPH08179777A; JP2001356776A; JP3226255B2

Description

【０００１】
【発明の属する技術分野】
この発明は、楽音合成技術に関し、特に、自然楽器のメカニズムに従って楽音を合成する“物理モデル式合成（physical-modeling synthesis）”として知られている楽音合成技術に関する。
【０００２】
【従来の技術】
今日、物理モデルに基づく楽音合成は、例えば“サンプリング”（または、“波形テーブル”）合成およびＦＭ合成のような現在の主流をなす楽音合成方法と並んで、一般的に利用されている。このような物理モデルに基づく楽音合成は、特に、吹奏楽器および弦楽器のシミュレーションに特に有用である。自然楽器における楽音発生上の物理的現象を正確にシミュレートすることによって、電子楽器は高品質の楽音を発生できる。
【０００３】
弦楽器の場合、楽音を合成するための構造は、典型的には、フィルタ付きの遅延ループ、すなわち、発生すべき楽音の１周期に対応する長さの遅延を実現する閉ループと、閉ループに含まれたフィルタとを備えている。前記閉ループには励振信号が入力され、該閉ループ内を循環する。こうして、該閉ループの出力信号を、楽音信号として取り出すことができる。この信号は、前記フィルタの特性に従って減衰する。また、前記フィルタは、弦における減衰、および、弦の終端部（例えば、ギターのナットおよびブリッジ）における減衰をシミュレートするものである。
【０００４】
実際の弦楽器において、弦は共鳴体すなわち共振部に音響的に結合されており、該弦の物理的振動は前記共振部を励振する。そこで、自然楽器を正確にシミュレートするためには、フィルタ付きの遅延ループの出力側にフィルタを設けることが必要であった。また、高品質の楽音を得るには、楽器本体をシミュレートする大きくて高価なフィルタによって、弦の出力を模する必要があった。一般に、前記励振信号は、ホワイトノイズまたはフィルタ処理されたホワイトノイズである。代案として、前記閉ループに対して、物理的に正確な“プラック（爪弾き）”音の波形を励振信号として与えてもよく、このようにして、より正確に弦の爪弾き音をシミュレートできる。
【０００５】
上述した従来の楽音合成システムは、図１に示されている。フィルタ付きの遅延ループは、遅延素子１０とローパスフィルタ１２とで構成されている。励振源（例えば、励振テーブル）１４は、加算器１６を介して、前記遅延ループに励振信号を与える。前記励振テーブルの内容は、例えば押鍵に応じて発生されるトリガ信号に応答して、メモリテーブルから自動的に読出し可能である。前記フィルタ付きの遅延ループに入力される励振信号は、該ループを循環し、前記フィルタ１２の動作によって時間的に変化する。こうして、前記遅延ループから信号が取り出され、本体フィルタ１８に与えられる。高品質の楽音合成を行うためには、複雑で高価な本体フィルタ（典型的には、ディジタルフィルタ）または追加のフィルタ付き遅延ループが必要である。
【０００６】
１つまたは複数のディジタル信号処理（ＤＳＰ）用のチップを使用して、楽音発生をソフトウエアによって実現することの方がより一般的であるが、図１に示した従来の楽音合成システムは、ハードウエアによっても実現可能である。
【０００７】
上記従来の楽音合成システムは、極めて高品質の楽音合成を行うことが可能であるが、楽器本体をシミュレートするために複雑で高価な本体フィルタを必要とする。
【０００８】
【発明が解決しようとする課題】
この発明は、簡単に且つ低コストで、高品質の楽音を合成できる物理モデル式の楽音合成システムを提供しようとするものである。
【０００９】
【課題を解決するための手段】
この発明に係る楽音合成システムは、第１および第２の励振信号を発生する励振手段と、前記第１の励振信号を受け取る入力部と信号を遅延する遅延部とを閉ループ接続してなり、かつ、該閉ループから出力を取り出す出力部を含み、この閉ループにおける遅延量が合成すべき楽音の音高に対応している閉ループ手段であって、該閉ループ手段のサンプリングレートは前記第２の励振信号のサンプリングレートよりも低くしてなるものと、前記閉ループ手段の前記出力部から取り出された信号のサンプリングレートを前記第２の励振信号のサンプリングレートに一致させる手段と、前記出力部から取り出されてサンプリングレートが前記第２の励振信号に一致せしめられた前記信号と前記第２の励振信号とを合成する合成手段とを具備することを特徴とするものである。
【００１０】
これにより、第１の励振信号に基づき閉ループ手段で合成された信号と第２の励振信号との合成により、楽音信号が生成されるので、閉ループ手段の構成を簡単化することができる。
【００１１】
また、前記閉ループ手段のサンプリングレートを前記第２の励振信号のサンプリングレートよりも低くし、更に、前記閉ループ手段の前記出力部から取り出された信号のサンプリングレートを前記第２の励振信号のサンプリングレートに一致させる手段を具備することを特徴とする。
【００１２】
これにより、第２の励振信号のサンプリングレートを相対的に高くして品質の良い楽音信号を得ようとする場合において、閉ループ手段のサンプリングレートを相対的に低くしても、その後のサンプリングレートを一致させる処理によって高サンプリングレートの楽音信号生成を問題なく行うことができるので、閉ループ手段のサンプリングレートを相対的に低くすることで、該閉ループ手段の構成の簡単化および低コスト化を図ることができる。
【００１３】
【発明の実施の形態】
以下、添付図面を参照してこの発明の一実施の形態を説明する。
以下に説明する本発明は、様々な遅延回路及びフィルタ等を含むハードウエアの形態、若しくは、例えばＤＳＰで実行される適当なアルゴリズムを使用するソフトウエアの形態、のいずれの形態によって実現されるようになっていてもよいものである。
【００１４】
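In a conventional system such as that of FIG. 1, which contains a filtered delay loop and a body filter, both the filtered delay loop and the body filter are linear, time-invariant (LTI) elements. Their order can therefore be exchanged: moving the body filter in front of the filtered delay loop yields an equivalent system (for example, FIG. 2). In this rearranged configuration, the output of the excitation signal generator is applied directly to the body filter. Recognizing that the excitation of a plucked or struck string generally takes the form of an impulse, it follows that the output of the body filter, that is, the excitation signal applied to the delay loop, represents the impulse response of the body filter.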
【００１５】
この点に鑑みて、以下説明する実施例においては、このインパルス応答が測定され、この測定されたインパルス応答が集合励振信号として格納される。このようにして、前記本体フィルタが除去可能になり、前記集合励振信号が前記フィルタ付きの遅延ループに直接に与えられる。このように、前記本体フィルタのインパルス応答に対応する適当な励振信号を与えることによって、高価な本体フィルタを必要とすることなく、高品質の楽音が合成されることができる。
加えて、閉ループの入力部に対して供給される励振信号は、この発明に従ってモデルしようとする共振部材若しくは共振システムにおける第１の部分的応答に対応する成分を有する信号であり、該共振部材若しくは共振システムにおける第２の部分的応答に従う共振特性は、閉ループの出力信号に対して、共振フィルタ手段若しくは共振付加手段によって付加される。このように、モデルしようとする共振部材若しくは共振システムにおける総合的な応答を部分的応答に分離し、閉ループの前後で分担させる構成であるため、励振手段と共振フィルタ手段若しくは共振付加手段の構成を簡単化することができ、その設計と製造コストも低廉にすることができる。
【００１６】
本発明の１つの実施の形態にあっては、従来必要とされた複雑で高価な本体フィルタを除去できるとともに、集合励振信号を格納するために必要なテーブルのサイズを小さくすることもできる。共振部をダンプモードとリンギィ（ringy:鳴り響く）モードとに分解することによって、かつ、ダンプモードのインパルス応答のみを使用して集合励振信号を設定することによって、励振テーブルのサイズを小さくすることができ、従って、集合励振信号を格納するために必要なメモリのサイズを小さくできる。
【００１７】
前記励振信号発生器は、固定された単一の励振信号として、または、合成励振信号を形成するために組合わせられる複数の励振信号として実施されてよい。前記複数の励振信号の各々を制御可能に重み付けすることによって、多数の異なる合成励振信号が提供可能になる。さらに、様々な励振信号は時間変化するよう制御可能であり、そうすれば、固定された１組の励振信号を使用するにも関わらず、意義ある制御性および楽音変化を実現することができる。
【００１８】
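FIG. 1 shows a conventional filtered delay loop comprising a filtered delay section, which includes a delay element 10 and a filter 12, and a digital body filter 18 that simulates the resonator (resonant body) of a natural instrument such as a guitar. An excitation source 14 supplies an excitation signal to the delay loop. The inventor recognized that the filtered delay loop and the body filter 18 are essentially linear, time-invariant systems. Consequently, as shown in FIG. 2, the order of the filtered delay loop and the body filter 18 can be reversed without changing the characteristics of the resulting musical tone.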
【００１９】
すなわち、図２において、前記本体フィルタ１８は、前記フィルタ付きの遅延ループの前に設けられている。全体的な処理要件は同じであるので、この順序の変更自体は意義ある利点をもたらすものではない。しかし、従来の技術に示されるように、弦をシミュレートするための変数が横方向の加速波となるよう選択される場合、理想的な弦の爪弾き音はインパルスとなる。この場合、弦を爪弾きするための前記励振テーブルの出力は、各爪弾きごとに、その前後がゼロで挟まれた単一のゼロではないサンプル、すなわち、インパルスである。その結果、前記フィルタ付きの遅延ループを励振するものは、前記本体フィルタ１８のインパルス応答である。前記本体フィルタ１８は１つの音の発生中に変化しないので、本体のインパルス応答は固定される。この発明は、この事実に着目して、本体フィルタを設ける必要性を完全に除去することを意図するものである。本体フィルタにインパルスを通過させる代りに、励振テーブルには、所望の本体フィルタのインパルス応答を表す集合励振信号がロードされている。このようにして、共振する楽器本体に対する弦の接続、または、その他の結合構造をシミュレートするために必要であった高価な本体フィルタ（または、ＤＳＰシステムにおけるフィルタ処理）を不要にすることができる。
【００２０】
図３は、この発明に係る楽音合成システムの一構成例を示す図である。この楽音合成システムは、図３に示す例では、トリガ信号（例えば、キーオン信号）２２に応答して集合励振信号ｅ（ｎ）を供給するテーブル２０からなる励振源を備えている。前記集合励振信号は、加算器２４を介してフィルタ付きの遅延ループに与えられる。該遅延ループは、長さＮが可変の遅延ライン２６と、ループフィルタ２８とを備えている。前記遅延ライン２６の出力は、楽音合成出力ｘ（ｎ）として取り出されるとともに、ループフィルタ２８に戻される（当該技術において知られているように、多数の出力を取り出すことができる）。前記ループフィルタ２８の出力ｙ（ｎ）は、前記加算器２４にフィードバックされる。前記遅延ライン２６の長さＮは、粗い音高制御を実現する。
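The structure of FIG. 3 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name, the two-point averaging loop filter, and the `loop_gain` value are assumptions chosen for brevity. The text itself only requires an adder 24, a delay line 26 of length N, and a loop filter 28.

```python
def filtered_delay_loop(excitation, delay_len, num_samples, loop_gain=0.996):
    """Feed an excitation e(n) into a delay line of length N closed by a loop filter.

    The loop filter here is a two-point average scaled by loop_gain
    (an assumed, very simple low-pass), so the tone decays over time.
    """
    delay = [0.0] * delay_len      # circular buffer: the delay line (26)
    idx = 0                        # read/write position in the buffer
    prev_x = 0.0                   # previous delay-line output, for the averager
    out = []
    for n in range(num_samples):
        x = delay[idx]                          # delay-line output x(n)
        y = loop_gain * 0.5 * (x + prev_x)      # loop filter (28) output y(n)
        prev_x = x
        e = excitation[n] if n < len(excitation) else 0.0
        delay[idx] = e + y                      # adder (24) writes back into the line
        idx = (idx + 1) % delay_len
        out.append(x)
    return out
```

With a single-sample excitation and `delay_len=4`, the impulse reappears at the output every four samples, each time slightly smaller and smoother, which is the coarse pitch control by N and the decay set by the loop filter described above.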
【００２１】
前記ループフィルタ２８は、きめ細かな音高制御を実現し、１つの演奏音の変化を決定する。このフィルタは、通常、１つの音のデュレーション（継続時間）の間固定されるが、１つの音の発生中において、演奏者の手によってダンピング、２段の振幅エンべロープ減衰（例えば、ピアノ音）、他の弦とのカプリングによる振幅エンべロープのうなり、前記音の名目上のカットオフ時間の後に小さな減衰振幅エンベロープが持続する擬似リバーブレーション、その他の時間変化する効果のような各種効果を発生するために変化可能である。前記励振信号は、ギターの爪弾きをシミュレートする場合にピックが位置する箇所における物理的励振および本体フィルタに起因する細部を含む、楽音の初期的なスペクトル成分を決定する。
【００２２】
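The impulse response (IR) of the loop filter, denoted f(n) for n = 0, 1, 2, ..., Nf-1, is determined by the losses in the vibrating string due to bending and air resistance, and by the losses due to the coupling of the string to the instrument body. Methods for determining specific loop filter characteristics are known and are not described in detail here. The impulse response f(n) can be obtained from the basic physical equations for the theoretical losses caused by the connection or coupling between the string and the body. The loss per unit length of the string can be estimated theoretically from the string's material, tension, and diameter. The loss at a body connection point, such as a guitar bridge, can be estimated from the shape of the bridge and the resonances of the instrument body. The impulse response f(n) can also be derived from physical measurements of the strings of an actual stringed instrument. A combination of estimates based on equations and on actual physical measurements may also be used.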
【００２３】
多くの異なる方法を用いて、前記励振信号ｅ（ｎ）を設定することができる。この励振信号ｅ（ｎ）は、弦の物理的な励振の性質、および、弦による励振点に対する当該楽器の応答の両方によって設定される。例えば、ギターの場合、楽器本体に対する励振は、ギターのブリッジにおいて発生する。図４はギターの物理的ブロック図であり、この図において、弦３２に励振信号３０が加えられることによって、該弦３２が共振部（ギター本体）３４を励振するようになっている。物理的システムにおいて、共振部は出力信号を選び出すことによって設定される。典型的な例では、ギター本体の表板から数フィート離れた箇所の出力信号を選び出す。実際、このような信号は、所望の出力ポイントで保持されたマイクロホンを使用し、フォースハンマーによってギターのブリッジを叩く操作に対するその出力ポイントでの応答を記録することによって測定可能である。なお、前記共振部は、ギター本体自体の共振特性のみならず、空気の伝送特性をも含むものである。反響ルームにおいてギターから遠く離れた出力ポイントが選択される場合には、測定がなされる前記ルームの共振特性も含まれる。図５には、このような共振部の集合的な特性が示されている。
【００２４】
全体的な共振部３４は、ブリッジ接続 (bridge coupling) ３６、ギター本体３８、空気吸収４０およびルーム応答４２とを含む。一般的に、前記共振部のインパルス応答をできるだけ短くできるよう、前記ギターに比較的近い出力を選択するのが好ましい。しかしながら、すべての下流側のフィルタ処理を単一の共振部に組み込むことができることによって提供される普遍性は、この発明の重要な特徴である。これは、共振板と囲いとが１つの共振部として組み合わされる図６のピアノモデルの場合、より明白である。この場合、全体的な共振部３４は、ブリッジ接続４４、ピアノの共鳴板４６、ピアノの囲い４８および空気／ルーム応答５０で構成される。
【００２５】
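The only technical requirement on the components of the resonator is that they be linear and time-invariant. As noted above, these two properties mean that the components may be arranged in any order. If the string is also linear and time-invariant, the resonator and the string may be exchanged as shown in FIG. 7. In practice, the string is the least linear element in almost every stringed instrument, but the main effect of its nonlinearity is a slight rise in the fundamental vibration frequency with amplitude. For the purpose of exchanging the order, the string can be regarded as sufficiently linear. The string is also time-varying when vibrato is present, but this too is a second-order effect. Exchanging a slowly time-varying string and the resonator does not give a mathematically identical result, but the generated tone sounds essentially the same.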
【００２６】
図７に示すように前記弦および共振部の配置順序を逆にした後、次のステップとして、図８に示すように励振信号と共振部とを組合わせることによって、集合励振信号５２を得る。この集合励振信号５２は、図７に示した共振部の出力と基本的に同じ出力ａ（ｎ）を提供するよう設定される。このためには、先ず、励振特性を特定しなければならない。最も単純な例は、インパルス応答である。物理的には、これは、加速波をモデルするために弦が使用される場合、最も適当な選択であろう。この例の場合、理想的な爪弾きは、前記弦に入力される加速インパルスを発生する。この単純な例において、前記集合励振信号５２は、単に、選択された共振部のサンプルされたインパルス応答である。
【００２７】
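In a more complex example, with the excitation signal denoted e(n) and the impulse response of the resonator denoted r(n), the equivalent aggregate excitation signal a(n) is given by the convolution of e(n) and r(n) shown in Equation (1) below.
[Equation 1]
$$a(n) = (e * r)(n) = \sum_{m=0}^{n} e(m)\, r(n-m) \qquad (1)$$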
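As a small illustration of how an aggregate excitation table could be precomputed, the following sketch convolves an excitation e(n) with a resonator impulse response r(n). The function name and the sample values are hypothetical; only the convolution itself comes from the text.

```python
def convolve(e, r):
    """Return a(n), the linear convolution of excitation e(n) with impulse response r(n)."""
    if not e or not r:
        return []
    a = [0.0] * (len(e) + len(r) - 1)
    for i, ev in enumerate(e):
        for j, rv in enumerate(r):
            a[i + j] += ev * rv     # each e(i) launches a scaled copy of r
    return a
```

When e(n) is a unit impulse, the result is r(n) itself, which matches the simplest case in which the table just holds the sampled impulse response of the resonator.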

【００２８】
前記集合励振信号が長い場合、なんらかの技術によってこれを短くすることが望ましい。このためには、信号処理に関する様々な参考文献に記載されているように、先ず、前記信号ａ（ｎ）を最小の位相に変換することが有用である。こうして、オリジナルのマグニチュード・スペクトルに合致した最大の短縮化が実現される。そして、前記信号ａ（ｎ）は、例えばスペクトル分析に使用される様々なウィンドウ関数のいずれかの適当な部分を使用することによって、ウィンドウ処理可能である。有用なウィンドウの一例は指数関数ウィンドウである。というのは、指数関数ウィンドウは、共振部のダンピング率を均等に増加できるという効果を有するからである。
【００２９】
図９に示すように、励振信号は、楽器から発生される音（例えば、弦の爪弾き音）を記録し、弦のループによる成分を除去するために逆フィルタ処理を行うことによって、設定してもよい。図９において、弦ループフィルタは、様々な方法の１つによって設定され、逆フィルタ内に含まれている。その結果としての出力は、爪弾きおよび本体フィルタに対応する成分を含んでおり、励振信号として（または、変更された励振信号を得るための基準信号として）使用可能である。
【００３０】
図１０は、自然楽器の典型的な本体フィルタのインパルス応答を示す図である。基本的に、このインパルス応答は、ダンプ振動波形である。励振信号がインパルスである最も単純な例において集合励振信号として格納されるのは、このような応答である。励振信号がインパルス以外である他の例において、前記集合励振信号は、上述したような畳み込み結果であろう。この畳み込みはインパルス応答によるものであるので、いずれの場合も、畳み込み結果はダンプ振動波形で終わる。しかし、様々な短縮技術によって、ダンプ振動波形以外の波形を有する励振信号が提供される。このような短縮化された励振信号は、オリジナルのインパルス応答から得られる（且つ、同様な結果を前記オリジナルのインパルス応答に提供する）。
【００３１】
前記楽音合成システムは、異なるピック位置、すなわち、弦に沿った異なる位置での励振信号の入力をシミュレートするために使用可能である。前記遅延ラインに沿った２つの異なる位置において弦を同時に励振し、遅延ループのそのポイントに存在している成分に加算することによって、弦上の特定のピック位置がシミュレートされる。これは図１３に示されており、ここにおいて、遅延回路は２つの遅延回路５４、５６に分割されており、これら遅延回路５４、５６の間に加算器５８が挿入されている。一般的に、ループ全体での遅延時間に対するピック位置での遅延時間の比率は、弦の長さに対するピック位置の比率に等しい。遅延回路５４，５６による合計遅延長さＮは、選択された音高に対応する所望の楽音周期に対応している（ただし、Ｎ＝「音高に対応する遅延量」−「ループフィルタの遅延量」）。ここで、所望のピック位置に対応して遅延回路５４の遅延量Ｐが可変でき、これに伴い遅延回路５６における残余の遅延量Ｎ−Ｐを可変する。
【００３２】
図１４は、上記に関連した技術として、励振信号を遅延し、遅延されていない励振信号と加算することによって、図１３と基本的に同じ効果を実現するようにした例を示す。図１４においては、図１３とは異なり、遅延ループとは別に、ピック位置遅延回路６０と加算器６２とが設けられている。上記と同様に、前記ピック位置遅延回路６０は、弦における実際の爪弾きポイントを制御するために変化可能である。
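The arrangement of FIG. 14 amounts to adding a delayed copy of the excitation to the undelayed excitation before it enters the loop. The sketch below assumes a plain sum with an optional `gain` on the delayed copy; the helper names and the rounding of the pick delay to a whole number of samples are illustrative choices, not details taken from the patent.

```python
def pick_delay_for(pick_fraction, loop_delay):
    """Pick-position delay in samples: same ratio to the loop delay as
    the pick position has to the string length."""
    return round(pick_fraction * loop_delay)


def apply_pick_position(excitation, pick_delay, gain=1.0):
    """Add the excitation to a copy of itself delayed by pick_delay samples."""
    out = [0.0] * (len(excitation) + pick_delay)
    for n, v in enumerate(excitation):
        out[n] += v                       # undelayed path
        out[n + pick_delay] += gain * v   # path through the pick-position delay (60)
    return out
```

For example, `pick_delay_for(0.25, 200)` gives a delay of 50 samples for a pluck one quarter of the way along a string whose loop delay is 200 samples.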
【００３３】
この発明に係る楽音合成システムは、自然楽器における多くの音放出ポイントの効果を実現するために、多くの励振信号を提供できるよう変更されてもよい。木製および金属製の楽器に耳を傾ける人（リスナー）は、前記楽器上の多くの音放出面からの信号を受け取る。従って、両方の耳に異なる信号が到達する。さらに、演奏者が前記楽器を動かすか、または、リスナーが頭を動かすと、前記楽器から放出される混合音が動的に変化する。このような自然現象を扱うためには、自然環境における異なる出力信号に対応する多くの出力信号を生成できるようにすることが有用である。この発明によると、これは、図１５に示すように、各々が異なる本体フィルタまたは異なる全体的な共振システムを反映した異なる成分を有する、多くの集合励振信号を供給することによって、簡単にシミュレート可能である。図１５において、集合励振信号６４、６６が供給され、単一の弦遅延ループ６８に与えられる（個別の出力が所望の場合、個別の弦ループを設けてよい）。２つの集合励振信号のみが図示されているが、複数の異なる出力ポイントでのクロスフェードをシミュレートするために、任意数の励振信号が提供されてもよい。２つまたは３つ以上のテーブルの間の補間が使用されてもよい。２つまたは３つ以上の集合励振信号６４，６６を適宜補間して単一の弦遅延ループ６８に与えるようにしてもよい。
【００３４】
前記楽音合成システムにおける重要な変更点は、励振テーブルを準定期的に読み出すことである。弦の爪弾き音を開始するために単一のトリガ信号を与えることに代えて、トリガ信号は定期的（または、ビブラートを考慮して、略定期的に）に与えられる。この例においては、適当な出力レベルを提供できるよう、（例えば、テーブルの出力値を右シフトすることにより、または、テーブルの出力値に振幅エンべロープを付与することによって）励振信号の振幅を小さくできる。この技術は、極めて高い品質の弓弾き弦をシミュレートすることができる。
【００３５】
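If a trigger signal occurs while the excitation table is still being read out, two variants are possible. In the first, the excitation table is restarted from the beginning, interrupting the playback in progress; this is shown in FIG. 12. In the second, the new playback of the excitation table overlaps the playback in progress, as shown in FIG. 11. This variant is more complex, because it requires a separate read pointer and adder for each playback of the excitation table, but it is preferable in terms of quality.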
【００３６】
有用な変更例では、図１５のような混合された励振信号を提供することに加えて、図１６に示すように、複数の励振信号（テーブルまたはその他）を提供し、時間的に変化可能な各励振信号ごとにゲイン制御を提供する。図１６において、励振信号発生器７０は、Ｍ個の励振信号を発生する。各励振出力は、時間的に変化されることが可能なゲイン制御要素７２を有する。集合励振信号ａ（ｎ）を提供するため、これらゲイン制御要素７２の出力は、加算器７４によって組合わせられる。この信号は、加算器８０を介して、遅延ライン７６およびループフィルタ７８を含む遅延ループに与えられる。このように各励振信号ごとにゲイン制御要素を設けることによって、広い範囲の励振信号を、固定された励振信号についての時間変化するリニアな組合わせとして合成する手段が提供される。すなわち、各励振信号は固定されているが、各励振信号の相対的なゲインを制御することによって、前記遅延ループに与えられる合計励振信号に対する相対的な寄与率を制御可能である。前記ゲインは、特定の値に設定され、１つの音の継続時間にわたって保持されてよいし、または、前記フィルタ付きの遅延ループ自体による変化に加えてさらに発生中の楽音の特性を変化させるために、時間的に変化されてもよい。
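The gain-controlled mixing of FIG. 16 can be sketched as a time-varying weighted sum of fixed tables. In this sketch each gain g_i(n) is passed as a Python callable taking the sample index; that interface, and the function name, are assumptions made for illustration.

```python
def mix_excitations(tables, gains, num_samples):
    """Form a(n) = sum_i g_i(n) * e_i(n) from M fixed excitation tables.

    tables: list of M lists of samples (the fixed excitation signals)
    gains:  list of M callables, gains[i](n) -> gain of table i at sample n
    """
    a = []
    for n in range(num_samples):
        total = 0.0
        for table, g in zip(tables, gains):
            sample = table[n] if n < len(table) else 0.0   # table has ended
            total += g(n) * sample
        a.append(total)
    return a
```

Holding every gain constant gives the fixed linear combination used for plucked tones, while slowly varying gains correspond to the driven (bowed) case discussed in the following paragraph.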
【００３７】
例えば爪弾き音のような自由な振動において、前記励振信号の１つのリニアな組合わせのみが使用されるよう、典型的には、前記ゲインｇｉ（ｎ）が固定される。一方、例えば弓弾きされる弦のような駆動振動においては、楽音の特性を変化させるために、前記ゲインｇｉ（ｎ）を時間的に変化することができる。これは、各励振信号ごとに滑らかに変化するエンベロープを供給して、複数の異なる励振信号の相対的な寄与率を制御することによって実現可能である。前記励振信号を変化させることによって実現される時間変化は、前記フィルタ付きの遅延ループ内で実現される時間変化に付加される。
【００３８】
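The characteristics of the various excitation tables can be chosen to maximize the number of useful variations obtainable from a fixed set of tables. For example, a set of excitation tables may include a filtered noise generator as well as a number of waveform tables stored in ROM. The waveform tables may provide various aggregate excitation signals that account for different body filters, or they may be based on a principal-components analysis, in which the main components of the overall desired excitation signal (for example, frequency components) are supplied separately by different waveform tables and combined with variable weights. This is analogous to the well-known Fourier synthesis used in standard tone generation (but not in generating excitation signals for delay-loop tone synthesis).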
【００３９】
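The tone synthesis system of FIG. 16 is useful for simulating the sound of a bowed string. In general, an accurate simulation of such a sound requires a delay loop with a nonlinear junction that takes in the excitation signal and the signal circulating in the loop and feeds the signal back according to a nonlinear function. The system of FIG. 16 does not require this nonlinear junction, yet it achieves a high-quality simulation of a bowed string using only a filtered delay loop and a time-varying excitation signal. Note that each individual excitation signal varies in time but has a relatively short, fixed duration. To generate a sustained tone such as a simulated bowed string, each excitation signal is repeated a number of times, and the time variation of the relative strengths of the excitation signals produces the desired tonal variation.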
【００４０】
図１７には、演算上の重要な利点をもたらす変更例が示されている。一般的に、楽音の初期的なアタック部は、意義ある高周波数情報を含んでいる。通常のフィルタ付きの遅延ループにおいて前記アタック部を適切に合成するためには、前記ループフィルタのサンプリングレートは、比較的高いレートに維持されなければならない。これは、前記合成楽音における高周波成分がより少ないその他の部分には当てはまらない。
【００４１】
図１７に示すように、この発明は、励振信号の１つとして別個のアタック信号を供給し、フィルタ付きの遅延ループの周囲に迂回させることによって、演算上の要件を軽減するものである。前記アタック信号は、トリガ信号に応答して他の励振信号と並列的に読み出される継続時間が短い（例えば、100msの）高周波信号を含む。図１７において、前記アタック信号は符号８２の箇所において供給され、増幅器８４によってゲイン制御され、出力合算ジャンクション８６に与えられる。付加的な励振テーブル８８は、符号９０の箇所において適当に重み付けされ、９２において合算されることによって、合成励振信号ａ（ｎ）を提供する。この合成励振信号ａ（ｎ）は、遅延ライン９４、ループフィルタ９６および加算器９８を含むフィルタ付き遅延ループに入力される。
【００４２】
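Because no high-frequency components need to be processed, the sampling rate of the loop filter 96 can be very low. For example, to generate a low-pitched note such as the low E of a guitar at low cost, the excitation signal entering the string loop may be limited to 1.5 kHz, and the first 100 msec of a recorded note, high-pass filtered at 1.5 kHz, may be used as the attack signal. A sampling rate of 3 kHz may be used for the delay loop. The output signal of the loop may then be upsampled to 22 kHz by the interpolation circuit 100 and added to the attack signal, which is likewise supplied at a 22 kHz sampling rate. The resulting output signal z(n) contains both the desired high-frequency and low-frequency components, yet the processing in the delay loop is greatly simplified. The sampling rate of the string loop may be controlled as a function of pitch.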
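The two-rate arrangement of FIG. 17 can be sketched as follows. Linear interpolation stands in for the interpolation circuit 100 purely for brevity; a practical design would more likely use a proper band-limited interpolator. The function names and default rates are illustrative assumptions.

```python
def upsample_linear(x, rate_in, rate_out):
    """Resample x from rate_in to rate_out by linear interpolation."""
    n_out = int(len(x) * rate_out / rate_in)
    out = []
    for m in range(n_out):
        pos = m * rate_in / rate_out          # position in input samples
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]   # hold the last sample at the end
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out


def mix_with_attack(loop_out_low, attack_high, low_rate=3000, high_rate=22000,
                    attack_gain=1.0):
    """Upsample the low-rate loop output and add the high-rate attack signal."""
    up = upsample_linear(loop_out_low, low_rate, high_rate)
    z = []
    for k in range(max(len(up), len(attack_high))):
        low = up[k] if k < len(up) else 0.0
        high = attack_high[k] if k < len(attack_high) else 0.0
        z.append(low + attack_gain * high)    # output summing junction (86)
    return z
```

The delay loop itself then only has to run at the low rate, which is where the saving described above comes from.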
【００４３】
この発明の合成技術は、少数の指数関数的減衰共振モードを有するビブラホン、および、タムタム、マリンバ、鉄琴などその他の打楽器の楽音合成にも適用可能である。これらの場合、複数のフィルタ付き遅延ループの出力を合算し、これにより、一連の略調和振動のモードの合算値として最も重要な共振モードを模することができる。この技術は、吹奏楽器にも適用可能である。この場合の励振テーブルは、前記吹奏楽器の管の内部からのインパルス応答を、音孔および朝顔部分の外部に供給する。楽音波形と励振信号との間の相互作用を与える（典型的には、吹奏楽器の物理的シミュレーションに使用される）非リニア性のジャンクションが存在しないので、自然なアーティキュレーションを得るのは難しい。しかし、この技術は、簡単に且つ低コストで実施可能である。
【００４４】
図１８〜図３０には、この発明の他の実施の形態が示されている。この実施の形態は、共振部を“ダンプ”モードと“リンギィ”モードとに分けることによって、励振テーブルのサイズを小さくでき、従って、コストを軽減できるものである。この場合、最も少なくダンプされた共振部分が抽出され、残存するより多くダンプされた共振部分のみが弦との間で配置順序の入れ替えが行われる。
【００４５】
図７および図８に関して上述したように、最も簡単な例では、集合励振テーブル５２は、基本的に、選択された共振部（例えば、ギター本体）のサンプル化されたインパルス応答になる。より複雑な例において、集合励振信号は、共振部のインパルス応答が励振信号ｅ（ｎ）によって畳み込まれる前記数式（１）に示した畳み込みを実行することによって与えられる。
【００４６】
前記畳み込み結果の長さ、従って、前記テーブルに格納された集合励振信号に影響を与える前記共振部のインパルス応答ｒ（ｎ）の長さは、その最も少なくダンプされた共振によって決定されるようにしている。この発明者は、より多くダンプされた共振から、最も少なくダンプされた共振、すなわち、長く鳴響く（すなわちリングする）モード（すなわちリンギィモード）を除去することによって、弦との間で配置順序が入れ替えられる前記共振部分は、より多くダンプされた部分のみを有することになる、ことを発見した。この共振部分は、より短いインパルス応答を有する。
【００４７】
配置順序が入れ替えられない長くリングする部分は、少数の２極フィルタ部またはその他の巡回型フィルタ構造によってシミュレートされることができる。なお、この発明は、決して、ディジタルフィルタによる実施に限定されるものではなく、任意の適当なディジタルフィルタまたはアナログフィルタを使用するものであってもよい。今日のほとんどのシンセサイザは、発生された楽音信号について処理後の効果を付与するために多数の“付加的な”フィルタを使用しているので、この発明は、前記共振部の“リンギィ”部分として作用するこれらのフィルタを利用することによって、前記集合励振信号を格納するために必要な励振テーブルを大幅に簡略化できる。
【００４８】
図１８の（Ａ）には、例えばギターの楽音発生メカニズムがブロック図で示されており、この例において、トリガ信号が励振源３０に供給されると、該励振源３０は、弦部３２を励振するための励振信号ｅ（ｎ）を発生する。前記弦部３２は、最終出力信号ｘ（ｎ）を発生する共振部３４を励振する出力信号ｓ（ｎ）を発生する。前記共振部３４の特性は、図４〜図６に関して上述したものと同じである。
【００４９】
これまで説明した実施の形態のように共振部３４と弦３２とを即時に入れ替える代りに、前記共振部３４の特性は、図１８の（Ｂ）に示すように、先ず、“ダンプ”共振部１０２と“リンギィ”共振部１０４とに分けられる。典型的には、前記共振部３４は、先ず、測定されたインパルス応答の形で研究される。この測定されたインパルス応答は、例えば、フォースハンマー、２チャンネルのＡ／Ｄ変換、および、当業者に知られているMatLab（商標）プログラミング環境に利用可能なシステム同定ソフトウエアを使用することによって得てよい。
【００５０】
前記２つのＡ／Ｄ変換チャンネルの一方は、フォースハンマーによる打力に比例したフォースハンマー出力を記録するものである。また、他方のＡ／Ｄ変換チャンネルは、例えば、前記ハンマーによる操作に対する前記共振部の応答を測定するマイクロホン出力を記録するものである。前記システム同定ソフトウエアは、基本的に、測定されたマイクロホン“出力”信号の中からフォースハンマー“入力”信号を逆畳み込み（デコンボルブ：deconvolve）することによって、測定されたインパルス応答を推定する。
【００５１】
前記逆畳み込み（デコンボルブ）機能を実現するための単純な技術は、前記“入力”のフーリエ変換によって前記“出力”のフーリエ変換を分割することによって、前記共振部の測定された周波数応答を得ることである。代案として、市販のソフトウエアパッケージを使用して、より高度のデコンボルブ処理を行ってもよい。いずれかの上記技術を使用し、周波数応答の逆フーリエ変換を行うことによって、前記インパルス応答が得られる。
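The simple deconvolution described above, dividing the Fourier transform of the "output" by that of the "input" and transforming back, can be sketched with a naive DFT. This is an illustration only: the helper names are hypothetical, the O(N^2) DFT is used to keep the example self-contained, and no safeguard is included for input spectra that are close to zero at some frequency.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a short sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def deconvolve(output, hammer):
    """Estimate an impulse response by dividing output spectrum by input spectrum."""
    Y = dft(output)                 # spectrum of the microphone "output"
    X = dft(hammer)                 # spectrum of the force-hammer "input"
    H = [y / x for y, x in zip(Y, X)]           # measured frequency response
    return [v.real for v in dft(H, inverse=True)]   # back to an impulse response
```

Both sequences must have the same length, and they should be zero-padded enough that the circular convolution implied by the DFT matches the linear one.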
【００５２】
前記共振部３４のインパルス応答が決定された後、前記インパルス応答の最もリングするモードが“パラメトリックな形態”に変換される。すなわち、前記共振部の周波数応答における最も狭い“ピーク”の各々に対応する精確な共振周波数および共振帯域幅が、確認され、“リンギィ部分”１０４に移される。最長のリンギィモードは、最も狭い帯域幅に対応する。また、この最長のリンギィモードは、典型的には、前記周波数応答における最高のピークを含む。
【００５３】
従って、最長のリング時間を有する共振を測定するための効果的な技術は、前記共振部３４の測定された周波数応答における最も狭く、最も高いスペクトルピークの精確な位置および帯域幅を求めることである。狭い周波数応答ピークの中心周波数および帯域幅は、前記共振部のリンギィ部分１０４における２つの極を決定する。フィルタをその極および零点に関して表すことは、インパルス応答または周波数応答のような“非パラメトリック”表現とは異なり、１種の“パラメトリック”なフィルタ表記である。
【００５４】
当業者に知られているように、測定された周波数応答ピークをパラメトリックな形態に変換するためのソフトウエアを含む市販のシステム同定ソフトウエア製品が利用可能である。このような市販のシステム同定ソフトウエア製品は、フォースハンマーと完全なデータ収集手段とを含むものである。さらに、当業者は、この問題について書いた信号処理文献に精通している。一例として、“Prony's method”は、指数関数的に減衰するシヌソイド（２極共振部のインパルス応答）の和についての周波数および帯域幅を推定するための古典的な技術である。より高度な最近の技術は、“matrix pencil method（マトリックスペン方式）”と呼ばれている。
【００５５】
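FIG. 19 illustrates a method, carried out with a small Matlab program, of converting the ringy part 104 of the resonator 34 into parametric form. For simplicity, only one frequency-response peak is shown in this example. First, the center frequency of the peak is measured with a quadratically interpolating peak finder operating on the spectral magnitude in decibels. Next, the general-purpose filter design function "invfreqz()" is called to design a two-pole filter whose frequency response matches the measured data as closely as possible.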
【００５６】
（図１９の例で実行されたように）パラメトリックなフィルタ係数を求めるには、ディジタルフィルタを設計するための既知の“equation−error method（エクエーションエラー方式）”を使用可能である。前記フィルタ設計プログラムがスペクトルピークに焦点を当てるよう、同じく図１９に示したように、（図上にオーバーレイされるよう再正規化された後）重み付け関数が使用される。この例に使用される重み付け関数は、0Hzから900Hzの範囲では“1”であり、900Hzから1100Hzの範囲では“100”であり、その後“1”に戻る。図１９の重み付け関数は、1000Hzのスペクトルピークを中心とする矩形関数として現れる。
【００５７】
さらに、図１９は、前記エクエーションエラー方式によって設計された２極フィルタの大きさ−周波数−応答のオーバーレイを示している。図示のように、非パラメトリックな周波数応答とパラメトリックな周波数応答との間の適合度は、前記ピーク近くにおいて極めて高い。初期に測定された補間済ピーク周波数を使用することによって、所望のフィルタの極角度を細かく調整でき、こうして、前記エクエーションエラー方式をこの場合のピーク帯域幅のみを測定するための技術とすることができる。当業者に知られているように、信号処理分野には、スペクトルピークを測定する多数の技術が存在しており、この発明は、図示された技術に使用される場合に限定されるものではない。
【００５８】
前記共振部３４のリンギィ部分１０４をパラメトリック形態に変換する他の方法は、前記共振部の極を求めるために、周知の線形予測符号化（ＬＰＣ）技術を使用した後に多項因数分解を行うものである。前記ＬＰＣは、スペクトルピークを模するには特に優れている。ｚ平面の単位円に最も近い極が、前記共振部３４のリンギィ部分１０４について選択可能である。
【００５９】
前記リンギィ部分１０４を実現するために前記ＬＰＣまたはその他任意の“最小位相”のパラメトリック形態を使用する場合、これに対応する“ダンプ”部分１０２は、特に線形予測符号化およびシステム識別に関して周知の処理である“逆フィルタ処理”と呼ばれている処理を利用することによって、前記共振部３４の完全インパルス応答、および、パラメトリックなすなわちリンギィ部分１０４から算出可能である。
【００６０】
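The inverse filter is constructed as an all-zero filter whose zeros are equal to the poles of the ringy part 104. If the ringy part 104 has zeros of its own, they become poles of the inverse filter and must therefore be stable. For a digital filter, these zeros must have magnitude less than 1 in the z-plane; for an analog filter, they must lie in the left half of the s-plane. Such a filter is called a "minimum-phase" filter. To reduce the likelihood of obtaining a non-minimum-phase estimate for the parametric form of the ringy part 104, it is useful to first convert the initial impulse response of the resonator to a minimum-phase impulse response using a nonparametric method such as the known cepstral "folding" technique.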
【００６１】
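In both the digital and analog cases, any non-minimum-phase zeros must be reflected about the appropriate frequency axis (the unit circle in the z-plane or the imaginary axis in the s-plane) to obtain a minimum-phase filter with the same frequency-response magnitude. The inverse filter is then applied to the complete impulse response of the resonator to obtain a "residual" signal. This residual signal is the impulse response of the "damped part" 102, and it is suitable for exchanging with the string and for convolution with a string excitation signal such as a "pluck" signal. When the residual signal is fed to the ringy part 104, that is, the parametric resonator (a minimum-phase filter in this case), the original impulse response of the resonator 34 is reproduced with high accuracy. In general, the accuracy is limited only by the numerical round-off error that arises in the inverse and forward filter computations.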
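The inverse-filtering step can be illustrated for a single two-pole ringy mode. In this sketch the ringy part is an all-pole section with feedback coefficients a1 and a2, and the inverse filter is the matching all-zero section; the function names and the numeric values are hypothetical and chosen only so that the poles lie inside the unit circle.

```python
def ringy_part(x, a1, a2):
    """All-pole two-pole resonator: y(n) = x(n) - a1*y(n-1) - a2*y(n-2)."""
    y1 = y2 = 0.0
    out = []
    for v in x:
        y = v - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def inverse_filter(h, a1, a2):
    """All-zero inverse of ringy_part: d(n) = h(n) + a1*h(n-1) + a2*h(n-2)."""
    out = []
    for n, v in enumerate(h):
        h1 = h[n - 1] if n >= 1 else 0.0
        h2 = h[n - 2] if n >= 2 else 0.0
        out.append(v + a1 * h1 + a2 * h2)
    return out


# Hypothetical body response: a short damped part ringing through one mode.
damped = [1.0, 0.5, 0.25] + [0.0] * 61
a1, a2 = -1.8, 0.9
body = ringy_part(damped, a1, a2)           # long-ringing "measured" response
residual = inverse_filter(body, a1, a2)     # short residual for the excitation table
```

Here `residual` recovers the three-sample damped part up to round-off, and passing it back through `ringy_part` rebuilds `body`, which mirrors the statement that accuracy is limited only by numerical rounding.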
【００６２】
この発明者は、オールポールフィルタが便利で操作が容易であると判断した。オールポールフィルタは常に最小位相であり、前記ＬＰＣ技術はこれらを容易に算出する。当業者が理解するように、所定数の極および零点を有するパラメトリック部分を発生可能なフィルタ設計技術は多く存在しており、重み付け関数を使用して、前記方法をインパルス応答３４の最長にリングする成分に導くことができる。図１９に図示したエクエーションエラー方式は、極のみならずパラメトリックすなわちリンギィ部分の零点をも算出可能な方法の一例である。こうして、前記パラメトリック部分１０４は、任意数の極および零点を有してよく、任意の既知のフィルタ実施技術を使用して実施してよい。
【００６３】
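Known digital filter implementation techniques include series and parallel connections of second-order filter sections. As is known in the art, the transfer function of any linear time-invariant (LTI) filter can be factored into a series connection of elementary second-order sections. As is likewise known, every LTI filter can be split into a sum of parallel second-order sections by a "partial fraction expansion". Each second-order section either resonates at a single frequency or does not resonate at all.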
【００６４】
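FIG. 20 shows the implementation known as "Direct Form I" of a general second-order filter section (up to two poles, up to two zeros, and one gain coefficient). There are several alternative implementations, but Direct Form I is numerically a good choice because the outputs of all multipliers go to a common adder, which typically uses two's-complement arithmetic. As a result, overflow can occur at only one place, the output. For the highest quality, the feedback signals (in the figure, everything to the right of the scaling coefficients b0 to b2) can be implemented in double precision. If the excitation table is scaled accordingly, the coefficient b0 in FIG. 20 can be eliminated. In the figure,
[Equation 2]
$$z^{-1}$$
denotes a unit delay element, which implements a delay of one unit (one sampling period).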
【００６５】
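If the second-order filter section is resonant, its resonance frequency and resonance bandwidth are determined by the feedback coefficients a1 and a2 according to
$$a_1 = -2R\cos(2\pi F_r / F_s) \qquad (2)$$
$$a_2 = R^2 \qquad (3)$$
where Fr is the resonance frequency in cycles per second, that is, hertz (Hz), Fs is the digital audio sampling rate in Hz, π is 3.141..., and R is the pole radius, which is related to the resonance bandwidth Br by
$$R = \exp(-\pi B_r / F_s) \qquad (4)$$
【００６６】
The time constant of decay Tr of the second-order resonator is related to the bandwidth by
$$T_r = \frac{1}{\pi B_r} \qquad (5)$$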
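Equations (2) to (5) translate directly into code. The sketch below computes the feedback coefficients from a resonance frequency and bandwidth and runs the resulting two-pole section on an impulse; the function names and the example values (100 Hz, 10 Hz bandwidth, 22050 Hz sampling rate) are illustrative assumptions.

```python
import math


def resonator_coeffs(fr, br, fs):
    """Feedback coefficients (a1, a2) from Equations (2)-(4)."""
    r = math.exp(-math.pi * br / fs)                 # pole radius, Eq. (4)
    a1 = -2.0 * r * math.cos(2.0 * math.pi * fr / fs)  # Eq. (2)
    a2 = r * r                                       # Eq. (3)
    return a1, a2


def decay_time_constant(br):
    """Time constant Tr in seconds from Equation (5)."""
    return 1.0 / (math.pi * br)


def resonator_impulse_response(a1, a2, n):
    """First n samples of y(n) = x(n) - a1*y(n-1) - a2*y(n-2) for a unit impulse."""
    y1 = y2 = 0.0
    out = []
    for k in range(n):
        x = 1.0 if k == 0 else 0.0
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out
```

As a consistency check, the envelope R^n of the impulse response falls to 1/e after Tr·Fs samples, since R^(Tr·Fs) = exp(-π·Br/Fs · Fs/(π·Br)) = exp(-1).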
【００６７】
前記時定数は、前記共振部のインパルス応答が、1/e = exp(-1) で表せる率で減衰する時間（秒）として定義される。この実施の形態において、“最もリングする”二次部、すなわち、最長の減衰時間Tr（または、最小の帯域幅Br）を有する部分を識別する必要がある。これらの部分は、明確に、前記本体共振部３４のパラメトリック部分１０４の二次部として実施してよい。
【００６８】
図２１は、２つの極のみを有する上記ほど一般的ではない二次共振部を示す図である。前記パラメトリック部分１０４がオールポール型として選択され、二次部が直列接続される場合に、この形態を使用できる。
また、図２２は、２つの２極部の直列接続を示す図である。おそらくこれは最も便利な選択であろうが、この発明者は、これが２つの二次部が並列接続された図２３の例ほど数値的な性能において優れたものではない、ということを発見した。当業者によって認識されるように、適当な伝達関数の部分分数拡張は、各々が多くても１つの零点と２つまでの極とを有する並列的な二次部をもたらす。
【００６９】
一般的に、ディジタルフィルタが互いに素な共振を発生する（すなわち、共振が周波数領域において重なり合わない）場合、直列の二次部より並列の二次部を使用することが数値的に好ましい。これは、直列の二次部を使用した場合、ある周波数で共振ピークを得るためには、共振していない他のすべてのフィルタ部によって信号減衰を補償することも必要である、ということを考慮することによって理解することができる。一方、並列の二次部を使用した場合には、共振している二次部が基本的に単独で共振動作を行う。このように、概して、図２３に示した並列の二次部は、図２２に示した直列の二次部より数値的に優れている。しかし、並列の二次部は、直列の二次部に比べて算出するのに便利ではなく、各々の出力の位相を適切に合せるためには、各二次部ごとに零点を必要とする。
【００７０】
ほとんどの市販されている楽音シンセサイザに設けられた効果プロセッサは、普通、“パラメトリック・イコライザ部”を備えている。典型的には、各前記パラメトリック・イコライザ部は、ｂ0、ｂ1、ｂ2が１つのゲイン制御を実現するよう制限されている図２０に示したような二次共振部である。通常、前記イコライザ部のパラメータは、各イコライザ部ごとの中心周波数、帯域幅およびゲインである。このため、普通、合成された楽音における様々な周波数帯域の混合を調整するために使用されるパラメトリック・イコライザ部は、所望の本体共振部のリンギィモードを実現するためにも使用可能である。
【００７１】
前記共振部の分解が完了し、パラメトリック部すなわちリンギィモード部１０４が実現されると、上記実施の形態において図１８の（Ｃ）の如くなされたように、前記共振部３４の最も多くダンプされた共振が弦３２と配置替えされる。次に、図１８の（Ｄ）に示したように集合励振信号１０６を発生するために、このダンプ共振部１０２は、例えば上記数式（１）を使用して、励振信号３０と畳み込まれる。
【００７２】
このように、この実施の形態にあっては、トリガ信号が集合励振信号１０６に与えられ、該集合励振信号１０６は、励振信号ａ（ｎ）を介して弦３２を励振する。こうして、前記弦３２は、入力信号を処理し、前記共振部３４の長くリングする成分を含まない出力信号ｒ（ｎ）を発生する。一方、これらの長くリングする成分は、弦３２の出力側において、合成中の信号の性質に応じて直列または並列接続された、例えば多数の２極フィルタ部である共振フィルタで構成されていてよいリンギィモード共振部１０４を介して供給される。その結果として該共振部１０４から出力される信号は、楽音信号となる。
【００７３】
パラメトリックな共振部すなわちリンギィモード共振部１０４を別途有することによる付加的な利点は、分解されていない共振部３４ではただ１つの出力信号のみが容易に利用可能であるのに対して、複数の出力信号が利用可能になる、ということである。これら複数の出力信号は、合成される楽音の品質を様々に向上させるために使用可能である。一例として、前記並列接続された二次共振部の出力は、様々異なる位置にステレオ的に“パン（pan）”されることができる。このパンニングは、シミュレートされる楽器の共振モードの空間的な分布を模するよう選択可能である。このステレオ配置をわずかに変更することによって、空間的に移動する楽器をシミュレートできる。
【００７４】
前記パラメトリックな共振部すなわちリンギィモード共振部１０４の個々の共振部の様々なステレオ配置を実現するために、図２３に示した２つの共振部出力を取り込む１つの加算器は、一方が“左チャンネル”用、他方が“右チャンネル”用である２つの加算器によって置き換えられる。また、各前記共振部ごとに、例えば乗算器である２つのスケーリング手段が使用され、一方のスケーリング手段は左チャンネルに合算される前の出力信号をスケーリングし、他方のスケーリング手段は右チャンネルに合算される前の出力信号をスケーリングする。それぞれのスケーリング係数を調節することによって、前記左チャンネルおよび右チャンネルに送られる前記出力信号の量が、前記信号のステレオ配置を決定することになる。
【００７５】
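When two scaling means are used as described above, one of the "b" coefficients of each resonator (for example, b0a and b0b in FIG. 23) can be omitted, so that stereo placement costs only one additional multiplier per resonator. Moreover, because the exact angle of the stereo placement is often not critical, the two scaling coefficients may be specially quantized numbers that require no multiplier, for example numbers restricted to the values 0, 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, 7/8, and 1. Multiplication by these numbers can be carried out (in binary fixed point) with one or two shifts and zero or one fixed-point addition or subtraction.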
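One way to realize these quantized gains is a small table of shift-and-add recipes. The sketch below works on integer (fixed-point) samples; the function names and the complementary left/right gain rule in `pan` are assumptions for illustration. For samples that are not multiples of 8, the right shifts truncate, so the result is an approximation of k/8 times the input.

```python
def scale_eighths(x, k):
    """Scale integer sample x by k/8 (k = 0..8) using shifts and at most one add/subtract."""
    recipes = {
        0: lambda v: 0,
        1: lambda v: v >> 3,                 # 1/8
        2: lambda v: v >> 2,                 # 1/4
        3: lambda v: (v >> 2) + (v >> 3),    # 3/8 = 1/4 + 1/8
        4: lambda v: v >> 1,                 # 1/2
        5: lambda v: (v >> 1) + (v >> 3),    # 5/8 = 1/2 + 1/8
        6: lambda v: v - (v >> 2),           # 3/4 = 1 - 1/4
        7: lambda v: v - (v >> 3),           # 7/8 = 1 - 1/8
        8: lambda v: v,
    }
    return recipes[k](x)


def pan(x, k_left):
    """Split one resonator output into (left, right) with gains k/8 and (8-k)/8."""
    return scale_eighths(x, k_left), scale_eighths(x, 8 - k_left)
```

Each recipe uses at most two shifts and one addition or subtraction, in line with the operation count stated above.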
【００７６】
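The reduction in excitation-table size achievable with the technique of this embodiment can be illustrated by examining the synthetic impulse response of an idealized guitar body before and after its single least-damped mode is removed.
FIG. 24 shows the initial impulse response of a simulated guitar body resonating at 100 Hz. As is known, in a guitar the main body resonance generally provides the longest-ringing resonance and therefore produces the least-damped ringing component of the body's impulse response, as shown in FIG. 25. As can be seen from FIG. 26, removing this single, second-order, least-damped 100 Hz resonance component shortens the excitation table by an order of magnitude. In this example, the removed component can be reproduced by, for example, the single second-order resonant filter 104 of FIG. 21 with a resonance frequency of 100 Hz.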
【００７７】
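FIGS. 27 to 29 show a similar example using data measured from an actual classical guitar. FIG. 27 shows the estimated impulse response of the resonator 34 after conversion to minimum phase using the known cepstral "folding" technique. In this case there are two long-ringing low-frequency resonances, one at 110 Hz and the other at 220 Hz. Because there are two ringing resonances, each producing a spectral peak in the frequency response, the parametric resonator 104 must have at least four poles. The impulse response of this four-pole parametric resonator 104, computed with the equation-error method, is shown in FIG. 28. FIG. 29 shows the residual impulse response 102 obtained after inverse filtering. The small noise bursts that appear at intervals of about 12 msec are related to the pitch of the guitar string that was excited to make the measurement shown and are not relevant to this example.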
【００７８】
この実施の形態に従って実現可能な励振テーブルのサイズ縮小は、この楽音シンセサイザ全体のコストを相当節約する。さらに、この実施の形態は比較的簡単な共振フィルタを使用して“リンギィ”モード共振部を実現し、且つ、前記フィルタが現在製造中のほとんどのシンセサイザに既に存在するものであるので、付加的なコストが必要でない。
【００７９】
前記共振部３４のインパルス応答の最長リンギング共振モードをパラメトリック部１０４に抽出することによって実現されるコスト節約は、とりわけ、（１）前記インパルス応答の継続時間、（２）前記最長リンギィ共振モードを抽出した後に残る前記インパルス応答の継続時間、（３）メモリのコスト、および、（４）二次フィルタ部を実現するコストによって左右される。現在のハードウエアの傾向によれば、ローカルには少量のメモリのみが利用可能な次第によりコンパクト化している構成において、より高速の処理を行うプロセッサが提供されている。より多くのプロセッサ利用を犠牲にしたメモリ使用量の減少は、歓迎すべきトレードオフ（引き換え）である。
【００８０】
このようにして、この実施の形態は、上記第１の実施の形態の全体的な効果と同等の効果、すなわち、従来必要であった高価で複雑な本体フィルタを不要にできる。しかし、この実施の形態において、共振部をそのダンプ成分に基づいた成分に抽出することによって、共振フィルタを使用して簡単に模することができる共振部を維持可能であり、かなりより複雑なダンプ部を、下流側の共振特性を提供する集合励振信号を作り出す励振と畳み込み可能である。
【００８１】
さらに、この技術は、既存の共振フィルタを使用してリンギィモード共振部を実現することによってシンセサイザの能力を利用しながら、サイズが大きな共振部の大部分を除去し、励振テーブルのサイズを小さくすることによって、楽音合成処理を簡略化することができる。この技術は、図１５〜図１７に示したような複数の励振テーブルの使用を含む他の実施の形態と共に使用可能である。
【００８２】
図１８〜図２９に関して上述した実施の形態は図３、図１２等に示した上記実施の形態の遅延ループと共に使用されることもできるが、図３０には、自己フランジ弦すなわち仮想的にデチューンされた弦をシミュレート可能な改良型のフィルタ付き遅延ループが示されている。図３０に示すように、入力信号（例えば、ａ（ｎ））は、加算器１０８および１周期遅延要素１１０に与えられる。
【００８３】
前記遅延要素１１０は２つの出力を発生する。第１の出力は、移動式の被補間タップ１１２、すなわち、前記遅延要素１１０のラインに沿って連続的に変化する位置から取り出されることによって該取り出しポイントに比例した量だけ遅延された出力である。この出力は、時間変化する値であってよいスケーリング係数ｇによってスケーリングされることができる。前記遅延要素１１０の出力は加算器１１４にも与えられ、こうして、前記加算器１１４は、ローパスフィルタ１１６およびその次の加算器１０８にフィードバックする。前記移動式の被補間タップ１１２の出力は、前記加算器１１４の下流側であって前記遅延ループの外部に設けられた加算器１１８に与えられる。
【００８４】
上記構成によって、弦シンセサイザでは高いコストを要する効果を奏するいくつかの特徴が実現される。先ず、低速で前後に移動式の被補間タップを使用してフランジ弦が実現可能である。理想的には、多数の独立に移動するタップは、（例えば、図３０において破線で示すように）最良のフランジ効果を奏する。各前記タップは、前記出力に対して移動式の櫛形フィルタ処理を付加する。
単一の非移動式のタップは、前記弦に沿う爪弾き、叩き弾きまたはその他の励振操作の位置をシミュレートするために必要な固定された櫛形フィルタ処理を実現することができる。この場合、前記物理的な励振の正確な位置は十分には聞き取れないので、非移動式のタップは補間を必要としない。
【００８５】
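In addition to simulating a flanged string, a faster, unidirectional tap can be used to simulate a detuned "second string". In this case, the speed of the tap corresponds to a Doppler shift. The faster the moving tap travels toward the right in FIG. 30, the further all frequencies in the input signal are Doppler-shifted downward; the faster it travels toward the left, the further they are shifted upward. In this embodiment, when the moving tap reaches the end of the delay line, it must somehow "wrap around" to the other side.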
【００８６】
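In the simplest case, a simple wrap-around can be used. A better sound is obtained by cross-fading, so that the output of one tap entering at the input end of the delay line fades in while the output of the other tap, which has reached the output end, fades to zero. In this case, two moving interpolated taps are active during the cross-fade. A more elaborate wrap-around searches for suitable jump points along the delay line: for example, the wrap-around may be made at zero crossings where possible, or the correlation at various delay points may be computed. All of these techniques are known to some extent from "harmonizer" and "pitch-shift" algorithms. Further detuned strings can be simulated by adding taps with different tap speeds. Creating many virtual strings with slightly different, time-varying tunings in this way gives a pleasing "chorus effect".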
【００８７】
フランジングおよびドップラー偏移を使用して、振動弦のカプラー効果を模することができる。このようなカプリングは、リングする音の倍音の振幅エンベロープにおいて、ゆっくりしたうなりを発生する。（あまり深くないノッチを有する）移動式の櫛形フィルタは、フランジングによって、質的に同様な効果を奏することができる。代案として、弦の出力を仮想的にデチューンされた弦と加算することによって、同等な効果が達成される。特定の例として、図３０のスケーリング係数ｇを0.25に設定し、前記タップ速度を0.25%のドップラー周波数偏移を発生するよう設定すると、わずかにミス同調されている２つの弦の間の和は、電気ギターに見られるものと同様なうなりを発生する。
【００８８】
ほとんどすべての弦楽器音合成において必要とされる複数弦のカプラー音のシミュレーションは、この発明を使用する場合、コスト効率が良くなる。従来においては、同様な複数弦のカプラー音をシミュレートするためには、複数の弦シミュレータが必要であった。しかし、この発明では、ただ１つの弦シミュレータを使用して、複数弦のカプラー効果は１つまたは複数の移動式タップによって付加されるようになっている。
【００８９】
上記の弦カプラーのシミュレーションの変更例として、各々が図３１に示す弦をシミュレートするものである２つのフィルタ付き遅延ループを接続したものを用いてもよい。この変更例において、各フィルタ付き遅延ループの出力が合算され、その合算信号が、好ましくは約0.01以下の大きさを有する負の係数を使用してスケーリングされる。図３１に示すように、より精確なシミュレーションを行うために、前記負の係数は、フィルタによって、測定されたまたは理論的に予測された接続特性から算出可能な伝達関数-Hb(z)と置き換えられる。このようにスケーリングされたまたはフィルタされた信号は、フィードバック路を介して各前記フィルタ付き遅延ループにフィードバック加算され、好ましくは、前記出力が前記ループから取り出された位置の直後の位置において前記ループに導入される。カプラー音のために使用される出力は、前記ループの任意の位置から取り出されてよい。
【００９０】
このような真正のカプラー法は、Ｎ個の弦につき、次のように行われる。（Ｎ個の弦に対応する）Ｎ個のフィルタ付き遅延ループの出力が合算され、この合算された信号が「−イプシロン」によってスケーリングされ（または、-Hb(z)によってフィルタされ）、このスケーリングされたまたはフィルタされた信号は、好ましくは、フィードバック路を介して各前記ループに導入される。前記スケーリングされた（またはフィルタされた）信号は、“ブリッジ”の出力の物理的な解釈を表す。前記スケーリングされた信号、または、ほとんどの場合にはスケーリング前の合算信号は、接続された弦のアセンブリの集合出力に関する優れた選択を提供する。
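The N-string coupling procedure can be sketched with a bank of simple delay loops that share one bridge signal. This is a simplified illustration under stated assumptions: each loop uses a plain gain `loop_gain` in place of a loop filter, the coupling is the scalar "minus epsilon" variant rather than a filter -Hb(z), and the function name and return format are hypothetical. Stability for large N or large epsilon is not examined here.

```python
def coupled_strings(excitations, delays, num_samples, epsilon=0.01, loop_gain=0.995):
    """Simulate N delay loops coupled through a shared bridge signal.

    Returns (summed_output, per_string_outputs).
    """
    n_strings = len(delays)
    lines = [[0.0] * d for d in delays]     # one circular delay line per string
    idx = [0] * n_strings
    summed = []
    per_string = [[] for _ in range(n_strings)]
    for n in range(num_samples):
        xs = [lines[k][idx[k]] for k in range(n_strings)]   # loop outputs
        total = sum(xs)                     # sum of all string outputs
        bridge = -epsilon * total           # scaled "bridge" signal
        for k in range(n_strings):
            e = excitations[k][n] if n < len(excitations[k]) else 0.0
            # feed back own output, the shared bridge signal, and the excitation
            lines[k][idx[k]] = e + loop_gain * xs[k] + bridge
            idx[k] = (idx[k] + 1) % delays[k]
            per_string[k].append(xs[k])
        summed.append(total)
    return summed, per_string
```

Exciting only the first string makes the second string start to sound through the bridge signal, while setting `epsilon=0.0` leaves the second string silent.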
【００９１】
真正カプラーを実施するために負の係数が使用される場合とは異なり、-Hb(z)のフィルタが使用される場合には、相互に接続された（カップルされた）すべてのフィルタ付き遅延ループにおけるループフィルタを除去することができる。すなわち、接続されたすべてのフィルタ付き遅延ループに必要なすべてのフィルタ処理を提供するために、カプリング・フィルタ-Hb(z)を使用することができる。個別のループフィルタが使用されない場合、前記カプリング・フィルタは共用ループフィルタとみなすことができる。
【００９２】
さらに、これらの効果（すなわち、フランジ、デチューンされた弦、コーラス、仮想カプラー、真正カプラーなど）は、フィルタ付き遅延ループを使用するいずれの合成技術にも利用できる。その例としては、吹奏楽器、金管楽器および打楽器のウエーブガイド合成がある。
【００９３】
要約すると、この発明に従うと、トリガされた（任意に処理されていてもよい）インパルス応答に対応する励振信号を供給することによって、高価で複雑な本体フィルタを必要とすることなく、高品質の“爪弾かれた”、“叩き弾きされた”および“弓で弾かれた”楽音を合成できる。弦のような振動要素の下流側に設けられた共振システムの特性は、該下流側の共振システムのインパルス応答を考慮した励振信号を適切に発生することによって提供される。
【００９４】
この発明に係る楽音合成技術は、複雑な本体フィルタを必要とする従来のシステムに比べて、その構成が大幅に簡略化される。さらに、共振部をダンプモードとリンギィモードとに分解し、しかる後、前記ダンプモードを弦と入れ替えて励振信号と畳み込むことによって、従来使用されていた複雑で高価な本体フィルタを除去できると共に、励振テーブルのサイズを小さくでき、該励振テーブルに関わるコストを軽減できる。フランジ効果およびコーラス効果、ならびに、仮想的にデチューンされた弦が、フィルタ付き遅延ループの遅延ラインに沿って移動し、出力と合算される被補間タップを付加するだけで、実現可能になる。
【００９５】
【発明の効果】
以上のように、この発明は、簡単な構成且つ低コストで高品質の楽音を合成できる、という優れた効果を奏する。
【図面の簡単な説明】
【図１】本体フィルタを使用した従来のフィルタ付きの遅延ループに基づく楽音合成システムを示すブロック図。
【図２】フィルタ付きの遅延ループと本体フィルタとの配置順序が入れ替えられた楽音合成システムを示すブロック図。
【図３】本体フィルタのインパルス応答に対応する集合励振信号が与えられる、この発明の一実施の形態に従う楽音合成システムを示すブロック図。
【図４】ギターの楽音発生メカニズムを示すブロック図。
【図５】ギターおよび周囲空間による楽音発生を示すブロック図。
【図６】前記周囲空間を含むピアノの楽音発生メカニズムを示すブロック図。
【図７】弦の前に共振部が配置された等価の楽音発生メカニズムを示すブロック図。
【図８】特定の励振に対する前記共振部の応答が集合励振信号として使用される楽音発生メカニズムを示すブロック図。
【図９】励振信号を設定する逆フィルタ方式を示すブロック図。
【図１０】本体フィルタのインパルス応答に対応する励振信号の一例を示す図。
【図１１】持続した楽音発生を実現するために反復的に供給される励振信号の一例を示す図。
【図１２】持続した楽音発生を実現するために反復的に供給される励振信号の他の例を示す図。
【図１３】ピックの位置の変化をシミュレート可能な楽音発生システムを示すブロック図。
【図１４】ピックの位置を変化させるための等価システムを示すブロック図。
【図１５】最終的な励振信号を発生するためにスケーリングされて、加算される２つの励振テーブルを使用したシステムを示すブロック図。
【図１６】時間変化する混合された励振信号発生器を組込んだ楽音合成システムを示すブロック図。
【図１７】遅延ループの外部でアタック成分を発生するための励振信号発生器を組込んだ楽音合成システムを示すブロック図。
【図１８】ギターの楽音発生メカニズムを示すブロック図であって、（Ａ）は該楽音発生メカニズムの一般例を示し、（Ｂ）は共振部がダンプモード部とリンギィモード部とに分解された例を示し、（Ｃ）は前記ダンプモード部が弦の前に設けられた例を示し、（Ｄ）は励振信号が前記ダンプモード部と畳み込まれた例を示す図。
【図１９】非パラメトリックな周波数応答、パラメトリックな周波数応答フィットおよび重み付け関数のオーバーレイを示す図。
【図２０】ディジタルの共振部を実現するための構成例を示す図。
【図２１】ディジタルの共振部を実現するための他の構成例を示す図。
【図２２】ディジタルの共振部を実現するための他の構成例を示す図。
【図２３】ディジタルの共振部を実現するための他の構成例を示す図。
【図２４】 100Hzで共振するギター本体のシミュレートされたインパルス応答を示す図。
【図２５】図２４に示したインパルス応答の最長のリンギングモード（最も少なくダンプされた成分）を示す図。
【図２６】図２５の最も少なくダンプされた成分が除去された図２４に示した初期インパルス応答を示す図。
【図２７】実際のギターから測定されたデータから算出されたギター本体のインパルス応答を示す図。
【図２８】図２７に示したインパルス応答における２つの最長リンギングモードのパラメトリック推定値を示す図。
【図２９】逆フィルタ処理を使用して図２８のパラメトリック成分が除去された図２７のインパルス応答を示す図。
【図３０】自己フランジ弦または仮想的にデチューンされた弦を合成するために、フィルタ付きの遅延ループが移動式の補間タップによって構成された楽音合成システムを示すブロック図。
【図３１】弦カプラーのシミュレーション技術の一実施の形態を示すブロック図。
[Explanation of reference numerals]
20 Table
26 Delay line
28 Loop filter
32 String
34 Resonator

[0001]
[Technical field to which the invention belongs]
The present invention relates to a musical tone synthesis technique, and more particularly to a musical tone synthesis technique known as “physical-modeling synthesis” for synthesizing musical sounds according to the mechanism of a natural musical instrument.
[0002]
[Prior art]
Today, tone synthesis based on physical models is commonly used alongside current mainstream tone synthesis methods such as “sampling” (or “waveform table”) synthesis and FM synthesis. Musical tone synthesis based on such a physical model is particularly useful for simulation of wind instruments and stringed instruments. By accurately simulating the physical phenomenon of musical sound generation in natural musical instruments, electronic musical instruments can generate high-quality musical sounds.
[0003]
In the case of stringed instruments, the structure for synthesizing musical sounds typically comprises a filtered delay loop, that is, a closed loop that realizes a delay whose length corresponds to one period of the musical sound to be generated, together with a filter included in the closed loop. An excitation signal is input to the closed loop and circulates in it, and the output signal of the closed loop can be extracted as a musical sound signal. This signal decays according to the characteristics of the filter, which simulates the attenuation in the string and at the ends of the string (for example, the nut and bridge of a guitar).
[0004]
In an actual stringed instrument, the string is acoustically coupled to a resonator, and the physical vibration of the string excites the resonator. To simulate a natural instrument accurately, it has therefore been necessary to place a filter at the output of the filtered delay loop; to obtain high-quality musical sounds, the string output had to be passed through a large and expensive filter that simulates the instrument body. The excitation signal is generally white noise or filtered white noise. Alternatively, a physically accurate "pluck" waveform may be applied to the closed loop as the excitation signal, which simulates the sound of a plucked string more accurately.
[0005]
The conventional musical tone synthesis system described above is shown in FIG. The delay loop with a filter includes a delay element 10 and a low-pass filter 12. An excitation source (for example, an excitation table) 14 provides an excitation signal to the delay loop via an adder 16. The contents of the excitation table can be automatically read from the memory table in response to, for example, a trigger signal generated in response to a key depression. The excitation signal input to the delay loop with the filter circulates through the loop and changes with time according to the operation of the filter 12. In this way, a signal is extracted from the delay loop and applied to the main body filter 18. In order to perform high quality tone synthesis, a complex and expensive body filter (typically a digital filter) or an additional filtered delay loop is required.
[0006]
Although it is more common to implement musical tone generation in software using one or more digital signal processing (DSP) chips, the conventional musical tone synthesis system shown in FIG. 1 can also be realized in hardware.
[0007]
The conventional musical tone synthesis system can synthesize extremely high-quality musical tone, but requires a complicated and expensive main body filter to simulate the musical instrument main body.
[0008]
[Problems to be solved by the invention]
An object of the present invention is to provide a physical model type musical sound synthesis system that can synthesize high-quality musical sounds easily and at low cost.
[0009]
[Means for Solving the Problems]
The musical tone synthesis system according to the present invention comprises: excitation means for generating first and second excitation signals; closed loop means formed by connecting, in a closed loop, an input unit that receives the first excitation signal and a delay unit that delays a signal, and including an output unit for extracting an output from the closed loop, wherein the delay amount in the closed loop corresponds to the pitch of the musical tone to be synthesized and the sampling rate of the closed loop means is lower than the sampling rate of the second excitation signal; means for matching the sampling rate of the signal extracted from the output unit of the closed loop means to the sampling rate of the second excitation signal; and combining means for combining the signal extracted from the output unit, whose sampling rate has been matched to that of the second excitation signal, with the second excitation signal.
[0010]
As a result, a musical sound signal is generated by combining the signal synthesized by the closed loop means based on the first excitation signal and the second excitation signal, so that the configuration of the closed loop means can be simplified.
[0011]
In addition, the sampling rate of the closed loop means is lower than the sampling rate of the second excitation signal, and the system is characterized by comprising means for matching the sampling rate of the signal extracted from the output unit of the closed loop means to the sampling rate of the second excitation signal.
[0012]
As a result, even when the sampling rate of the second excitation signal is made relatively high in order to obtain a high-quality musical tone signal while the sampling rate of the closed loop means is made relatively low, the subsequent sampling-rate matching process allows a musical tone signal to be generated at the high sampling rate without any problem. Lowering the sampling rate of the closed loop means therefore simplifies its configuration and reduces its cost.
[0013]
[Detailed Description of the Invention]
An embodiment of the present invention will be described below with reference to the accompanying drawings.
The present invention described below can be realized in any form, whether as hardware including various delay circuits and filters or as software using an appropriate algorithm executed, for example, by a DSP.
[0014]
In a system including a conventional filtered delay loop and body filter as shown in FIG. 1, both the filtered delay loop and the body filter are linear, time-invariant components. The order in which these elements are arranged can therefore be reversed, and an equivalent system can be realized by moving the body filter in front of the filtered delay loop (for example, FIG. 2). In such a rearranged configuration, the output of the excitation signal generator is supplied directly to the body filter. Recognizing that the excitation of a plucked or struck string generally takes the form of an impulse, it follows that the output signal of the body filter, that is, the excitation signal supplied to the delay loop, represents the impulse response of the body filter.
[0015]
In view of this point, in the embodiment described below, this impulse response is measured, and this measured impulse response is stored as a collective excitation signal. In this way, the body filter can be removed and the collective excitation signal is directly applied to the filtered delay loop. Thus, by providing an appropriate excitation signal corresponding to the impulse response of the main body filter, a high-quality musical tone can be synthesized without requiring an expensive main body filter.
In addition, according to the present invention, the excitation signal supplied to the closed-loop input is a signal having a component corresponding to a first partial response of the resonant member or resonant system to be modeled. A resonance characteristic according to a second partial response of the resonant system is then imparted to the closed-loop output signal by resonance filter means or resonance applying means. Because the overall response of the resonant member or resonant system to be modeled is thus separated into partial responses that are shared between the stages before and after the closed loop, the configuration of the excitation means and of the resonance filter means or resonance applying means can be simplified, and their design and manufacturing costs can be reduced.
[0016]
In one embodiment of the present invention, the complicated and expensive body filter conventionally required can be removed, and the size of the table required for storing the collective excitation signal can also be reduced. The excitation table is made smaller by decomposing the resonating part into damped modes and “ringy” modes and by forming the collective excitation signal from the impulse response of the damped modes only. The memory required for storing the collective excitation signal is reduced accordingly.
[0017]
The excitation signal generator may be implemented as a single fixed excitation signal or as multiple excitation signals that are combined to form a composite excitation signal. By controllably weighting each of the plurality of excitation signals, a number of different combined excitation signals can be provided. Furthermore, the various excitation signals can be controlled to change with time, so that significant controllability and musical tone changes can be realized despite the use of a fixed set of excitation signals.
[0018]
FIG. 1 is a diagram showing a conventional system comprising a filtered delay loop, which includes a delay element 10 and a filter 12, and a digital body filter 18 for simulating the resonator of a natural musical instrument such as a guitar. The excitation source 14 supplies an excitation signal to the delay loop. The inventors of the present invention have recognized that the filtered delay loop and the body filter 18 are basically linear, time-invariant systems. It is therefore possible to reverse the order of the filtered delay loop and the body filter 18, as shown in FIG. 2, without changing the character of the resulting musical tone.
[0019]
That is, in FIG. 2 the body filter 18 is placed before the filtered delay loop. Because the overall processing requirements are unchanged, this reordering by itself offers no significant advantage. However, as is known in the prior art, if transverse acceleration waves are chosen as the variable for simulating the string, the ideal pluck is an impulse. In this case, the output of the excitation table for a pluck is a single non-zero sample, that is, an impulse, preceded and followed by zeros. As a result, what excites the filtered delay loop is the impulse response of the body filter 18. Since the body filter 18 does not change during the generation of a single note, the impulse response of the body is fixed. The present invention exploits this fact to eliminate the body filter entirely. Instead of passing an impulse through the body filter, the excitation table is loaded with a collective excitation signal representing the impulse response of the desired body filter. This eliminates the expensive body filter (or the filtering computation in a DSP system) that was previously needed to simulate the coupling of the string to the resonating instrument body or other coupling structures.
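The equivalence of the two orderings can be checked numerically. The following is only a minimal sketch: the loop filter (a two-point average scaled by a gain), the delay length, and the five-sample body impulse response are all illustrative assumptions and do not come from the specification.

```python
def convolve(x, h):
    """Direct linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def string_loop(excitation, delay, num_samples, loop_gain=0.98):
    """Feedback loop: y(n) = e(n) + g * 0.5 * (y(n - N) + y(n - N - 1))."""
    y = [0.0] * num_samples
    for n in range(num_samples):
        e = excitation[n] if n < len(excitation) else 0.0
        fb1 = y[n - delay] if n >= delay else 0.0
        fb2 = y[n - delay - 1] if n >= delay + 1 else 0.0
        y[n] = e + loop_gain * 0.5 * (fb1 + fb2)
    return y


body_ir = [1.0, 0.6, -0.3, 0.2, -0.1]  # hypothetical body impulse response
impulse = [1.0]                         # ideal pluck (acceleration impulse)
n_out = 200

# Conventional order (FIG. 1): impulse -> string loop -> body filter.
conventional = convolve(string_loop(impulse, 20, n_out), body_ir)[:n_out]

# Reordered: the body impulse response itself is used as the excitation table.
commuted = string_loop(body_ir, 20, n_out)

max_err = max(abs(a - b) for a, b in zip(conventional, commuted))
```

Under these assumptions `max_err` is at the level of floating-point rounding, which is the sense in which the stored impulse response replaces the body filter.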
[0020]
FIG. 3 is a diagram showing a configuration example of a musical tone synthesis system according to the present invention. In the example shown in FIG. 3, the musical tone synthesis system includes an excitation source including a table 20 that supplies a collective excitation signal e (n) in response to a trigger signal (for example, a key-on signal) 22. The collective excitation signal is given to a delay loop with a filter through an adder 24. The delay loop includes a delay line 26 having a variable length N and a loop filter 28. The output of the delay line 26 is extracted as a tone synthesis output x (n) and returned to the loop filter 28 (multiple outputs can be extracted as is known in the art). The output y (n) of the loop filter 28 is fed back to the adder 24. The length N of the delay line 26 realizes rough pitch control.
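The signal flow of FIG. 3 can be sketched sample by sample as follows. This is a hedged illustration only: the loop filter is assumed here to be a two-point average with gain `g`, which is a common textbook choice and not a filter specified in this document, and the function and variable names are hypothetical.

```python
from collections import deque


def filtered_delay_loop(table, N, num_samples, g=0.99):
    """Sketch of FIG. 3: adder 24 -> delay line 26 (length N) -> loop filter 28."""
    delay_line = deque([0.0] * N, maxlen=N)  # delay line 26
    prev_x = 0.0                             # one-sample state of loop filter 28
    out = []
    for n in range(num_samples):
        e = table[n] if n < len(table) else 0.0  # e(n) read from table 20
        x = delay_line[0]                        # x(n): sample written N steps ago
        y = g * 0.5 * (x + prev_x)               # y(n): assumed loop filter output
        prev_x = x
        delay_line.append(e + y)                 # adder 24 output enters the delay line
        out.append(x)                            # x(n) is taken as the synthesis output
    return out


out = filtered_delay_loop([1.0], N=10, num_samples=50)
```

With a single-sample excitation, the first non-zero output appears after N samples and each later pass around the loop is attenuated and smoothed by the loop filter.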
[0021]
The loop filter 28 realizes fine pitch control and determines how the tone evolves over the course of a note. This filter is usually fixed for the duration of a single note, but it may be changed during a note to produce effects such as damping by the player's hand, a two-stage amplitude-envelope decay (e.g., a piano tone), amplitude envelopes due to coupling with other strings, pseudo-reverberation in which a small decaying amplitude envelope persists after the nominal cutoff time of the note, and other time-varying effects. The excitation signal determines the initial spectral content of the tone, including, when a guitar pluck is simulated, the details due to the physical excitation at the pick position and due to the body filter.
[0022]
The impulse response (IR) of the loop filter, written f(n) (where n = 0, 1, 2, ..., Nf−1), is determined by the losses in the vibrating string due to bending and air resistance and by the losses due to the coupling of the string to the instrument body. The determination of specific loop filter characteristics is known and will not be described in detail here. The impulse response f(n) can be obtained from basic physical equations for the theoretical losses due to the connection or coupling between the string and the body. The string material, tension and diameter can be used to estimate theoretically the loss per unit length of the string. The loss at the connection to the body, such as a guitar bridge, can be estimated, for example, from the shape of the bridge and the resonances of the instrument body. The impulse response f(n) can also be obtained from physical measurements on a string of an actual stringed instrument. A combination of formula-based estimates and actual physical measurements may also be used.
[0023]
Many different methods can be used to set the excitation signal e(n). This excitation signal is determined both by the nature of the physical excitation of the string and by the response of the instrument at the point where the string excites it. In the case of a guitar, for example, the instrument body is excited at the bridge. FIG. 4 is a physical block diagram of a guitar. In this figure, an excitation signal 30 is applied to the string 32, and the string 32 excites a resonating part (guitar body) 34. In a physical system, the resonator is defined by the choice of output signal. In a typical example, the output signal is taken at a point several feet away from the surface of the guitar body. In practice, such a signal can be measured by holding a microphone at the desired output point and recording the response at that point when the guitar bridge is struck with a force hammer. The resonating part then includes not only the resonance characteristics of the guitar body itself but also the transmission characteristics of the air. If an output point far from the guitar is chosen in a reverberant room, the resonance characteristics of the room in which the measurement is made are also included. FIG. 5 shows the collective characteristics of such a resonating part.
[0024]
The overall resonating section 34 includes a bridge coupling 36, a guitar body 38, an air absorption 40 and a room response 42. In general, it is preferable to select an output that is relatively close to the guitar so that the impulse response of the resonating portion can be as short as possible. However, the universality provided by the ability to incorporate all downstream filtering into a single resonator is an important feature of the present invention. This is more obvious in the case of the piano model of FIG. 6 in which the resonance plate and the enclosure are combined as one resonance part. In this case, the overall resonating section 34 comprises a bridge connection 44, a piano resonance plate 46, a piano enclosure 48 and an air / room response 50.
[0025]
The only technical requirement for the components of the resonator is that they be linear and time-invariant. As mentioned above, these two properties mean that the elements may be arranged in any order. If the string is also linear and time-invariant, the order of the resonating part and the string may be reversed as shown in FIG. 7. In fact, the string is the least linear element in almost all stringed instruments, but the main effect of its non-linearity is that the fundamental vibration frequency rises slightly with amplitude. For the purpose of reversing the order, the string can therefore be regarded as sufficiently linear. The string is also time-varying in the presence of vibrato, but this too is a second-order effect. The result of exchanging the order of a slowly time-varying string and the resonator is not mathematically identical, but the generated tones sound essentially the same.
[0026]
After the order of the string and the resonating part has been reversed as shown in FIG. 7, the next step is to combine the excitation signal and the resonating part into the collective excitation signal 52, as shown in FIG. 8. The collective excitation signal 52 is set so as to provide basically the same output a(n) as the output of the resonance unit shown in FIG. For this purpose, the excitation characteristics must first be specified. The simplest example is an impulse. Physically, this is the most appropriate choice when the string is modeled using acceleration waves. In this example, an ideal pluck generates an acceleration impulse that is input to the string. In this simple example, the collective excitation signal 52 is simply the sampled impulse response of the selected resonator.
[0027]
In a more complicated example, when the excitation signal is e(n) and the impulse response of the resonating part is r(n), the equivalent collective excitation signal a(n) is given by the convolution of e(n) and r(n), as in the following equation (Equation 1), written here for causal signals.
[Expression 1]
$$a(n) = (e * r)(n) = \sum_{m=0}^{n} e(m)\, r(n-m)$$
[0028]
If the collective excitation signal is long, it is desirable to shorten it by some technique. For this purpose, as described in various references on signal processing, it is useful first to convert the signal a(n) to minimum phase. This achieves the greatest shortening consistent with the original magnitude spectrum. The signal a(n) can then be windowed using, for example, any of the various window functions used in spectral analysis. One useful window is the exponential window, because it has the effect of uniformly increasing the damping of the resonating part.
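The convolution of Equation 1 followed by exponential windowing can be sketched as below. This is an illustration under stated assumptions: the resonator impulse response `r`, the excitation `e`, the window time constant `tau`, and the truncation threshold are invented values, the helper names are hypothetical, and the minimum-phase conversion mentioned above is deliberately omitted.

```python
import math


def convolve(e, r):
    """a(n) = sum over m of e(m) * r(n - m), for finite causal sequences."""
    a = [0.0] * (len(e) + len(r) - 1)
    for m, em in enumerate(e):
        for k, rk in enumerate(r):
            a[m + k] += em * rk
    return a


def exponential_window(a, tau):
    """Multiply by exp(-n / tau), which adds the same extra damping to every mode."""
    return [v * math.exp(-n / tau) for n, v in enumerate(a)]


def trim_tail(a, rel_threshold=1e-3):
    """Drop trailing samples that stay below rel_threshold times the peak."""
    peak = max(abs(v) for v in a)
    last = max(i for i, v in enumerate(a) if abs(v) >= rel_threshold * peak)
    return a[: last + 1]


# Hypothetical slowly decaying resonator response and a two-sample excitation.
r = [0.99 ** n * math.cos(0.3 * n) for n in range(400)]
e = [1.0, 0.5]

a = convolve(e, r)                                   # collective excitation signal
shortened = trim_tail(exponential_window(a, tau=50.0))
```

The windowed table is considerably shorter than `a`, at the cost of a faster overall decay of the resonator modes.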
[0029]
As shown in FIG. 9, the excitation signal may also be set by recording a sound produced by a musical instrument (e.g., a plucked-string sound) and applying inverse filtering to remove the component due to the string loop. In FIG. 9, the string loop filter is determined by one of various methods and is incorporated in the inverse filter. The resulting output contains the components corresponding to the pluck and to the body filter, and can be used as an excitation signal (or as a reference signal from which a modified excitation signal is derived).
[0030]
FIG. 10 is a diagram showing the impulse response of a typical body filter of a natural musical instrument. Basically, this impulse response is a damped oscillation waveform. In the simplest case, where the excitation signal is an impulse, it is this response that is stored as the collective excitation signal. In other cases, where the excitation signal is not an impulse, the collective excitation signal is the result of the convolution described above. Because this convolution involves an impulse response, its result ends in a damped oscillation waveform in either case. The various shortening techniques, however, yield excitation signals whose waveforms are not damped oscillations. Such a shortened excitation signal is derived from the original impulse response (and gives a result similar to that of the original impulse response).
[0031]
The musical tone synthesis system can be used to simulate the input of excitation signals at different pick positions, i.e., different positions along the string. A particular pick position on the string is simulated by exciting the string simultaneously at two different positions along the delay line and adding the excitation to the component already present at that point in the delay loop. This is shown in FIG. 13, where the delay circuit is divided into two delay circuits 54, 56 and an adder 58 is inserted between them. In general, the ratio of the delay time at the pick position to the delay time around the whole loop is equal to the ratio of the pick position to the string length. The total delay length N of the delay circuits 54 and 56 corresponds to the desired tone period for the selected pitch (where N = “delay amount corresponding to the pitch” − “delay amount of the loop filter”). The delay amount P of the delay circuit 54 can be varied according to the desired pick position, and the remaining delay amount N − P of the delay circuit 56 varies accordingly.
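As a rough numerical illustration of this delay bookkeeping, the split of the loop delay can be computed as follows. The sampling rate, the half-sample loop filter delay (that of a two-point average), the rounding to whole samples, and the function name are all assumptions made for the sketch and are not values given in the specification.

```python
def loop_delays(fs, f0, loop_filter_delay, pick_ratio):
    """Return (N, P, N - P) in samples.

    N is the pitch period minus the loop filter delay, rounded to an integer.
    P is the part of N placed before the adder, set by the pick position
    expressed as a fraction of the string length.
    """
    period = fs / f0                          # delay corresponding to the pitch
    N = int(round(period - loop_filter_delay))
    P = int(round(pick_ratio * N))
    return N, P, N - P


# Example: 110 Hz tone at 22050 Hz, picked a quarter of the way along the string.
N, P, rest = loop_delays(fs=22050, f0=110.0, loop_filter_delay=0.5, pick_ratio=0.25)
```

Any fractional part of the period lost by rounding would, as noted earlier, be left to the fine pitch control of the loop filter.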
[0032]
FIG. 14 shows an example in which basically the same effect as in FIG. 13 is realized by delaying an excitation signal and adding it to an undelayed excitation signal as a technique related to the above. In FIG. 14, unlike FIG. 13, a pick position delay circuit 60 and an adder 62 are provided separately from the delay loop. Similar to the above, the pick position delay circuit 60 can be varied to control the actual nail play point on the string.
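The arrangement of FIG. 14 amounts to adding a delayed copy of the excitation to the undelayed one before it enters the loop. A minimal sketch follows; the function name and the optional `gain` parameter on the delayed path are hypothetical additions for illustration.

```python
def pick_position_excitation(e, P, gain=1.0):
    """Add a copy of e delayed by P samples (pick position delay 60, adder 62)."""
    out = [0.0] * (len(e) + P)
    for n, v in enumerate(e):
        out[n] += v             # undelayed path
        out[n + P] += gain * v  # path through the pick position delay
    return out


shaped = pick_position_excitation([1.0, 0.5], P=3)
```

Changing `P` moves the simulated pluck point without touching the delay loop itself.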
[0033]
The musical tone synthesis system according to the present invention may be modified to provide many excitation signals in order to reproduce the effect of the many sound-radiating points of a natural musical instrument. A listener hearing a wooden or metal instrument receives signals from many radiating surfaces on the instrument, so that different signals reach the two ears. Furthermore, when the performer moves the instrument or the listener moves his or her head, the mixture of sound radiated from the instrument changes dynamically. To handle such natural phenomena, it is useful to be able to generate many output signals corresponding to different output signals in the natural environment. According to the present invention, this can easily be simulated by providing a number of collective excitation signals, each having different components reflecting a different body filter or a different overall resonant system, as shown in FIG. 15. In FIG. 15, collective excitation signals 64, 66 are provided and supplied to a single string delay loop 68 (individual string loops may be provided if separate outputs are desired). Although only two collective excitation signals are shown, any number of excitation signals may be provided to simulate crossfading among multiple different output points. Interpolation between two or more tables may also be used; that is, two or more collective excitation signals 64, 66 may be interpolated as appropriate and supplied to the single string delay loop 68.
[0034]
An important variation of the tone synthesis system is to read out the excitation table quasi-periodically. Instead of supplying a single trigger signal to start a plucked-string sound, the trigger signal is supplied periodically (or nearly periodically, allowing for vibrato). In this case, the amplitude of the excitation signal can be reduced to give an appropriate output level (e.g., by right-shifting the table output values or by applying an amplitude envelope to them). This technique can simulate bowed strings with extremely high quality.
[0035]
If a trigger signal is generated while the excitation table is still being read out, two variants are possible. In the first, the excitation table is restarted from the beginning, interrupting the playback in progress. This is illustrated in FIG. In the second, playback of the new pass through the excitation table is overlapped with the playback in progress as shown in FIG. This variant is more complicated because it requires a separate read pointer and adder for each overlapping playback of the excitation table. However, it is preferable from the viewpoint of quality.
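The two retriggering variants can be compared with a small sketch. The table contents, trigger times and function name are illustrative assumptions; in the overlapping variant each trigger effectively has its own read pointer, and the passes are summed.

```python
def play_triggers(table, trigger_times, num_samples, overlap=True):
    """Read out an excitation table at each trigger time.

    overlap=False: a new trigger restarts the table and cuts off the old pass.
    overlap=True:  each trigger starts an independent pass; passes are added.
    """
    out = [0.0] * num_samples
    times = sorted(trigger_times)
    for i, t in enumerate(times):
        # In restart mode the current pass ends where the next trigger begins.
        end = num_samples
        if not overlap and i + 1 < len(times):
            end = min(end, times[i + 1])
        for k, v in enumerate(table):
            if t + k < end:
                out[t + k] += v
    return out


table = [1.0, 2.0, 3.0, 4.0]
restarted = play_triggers(table, [0, 2], 8, overlap=False)
overlapped = play_triggers(table, [0, 2], 8, overlap=True)
```

Here `restarted` loses the tail of the first pass, while `overlapped` keeps it and sums it with the second pass.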
[0036]
In a useful variation, in addition to providing mixed excitation signals as in FIG. 15, a plurality of excitation signals (tables or otherwise) are provided, and a time-variable gain control is provided for each excitation signal, as shown in FIG. 16. In FIG. 16, an excitation signal generator 70 generates M excitation signals. Each excitation output has a gain control element 72 that can be varied in time. The outputs of these gain control elements 72 are combined by an adder 74 to provide a collective excitation signal a(n). This signal is applied via an adder 80 to a delay loop including a delay line 76 and a loop filter 78. Providing a gain control element for each excitation signal thus gives a means of synthesizing a wide range of excitation signals as time-varying linear combinations of fixed excitation signals. That is, each excitation signal is fixed, but its relative contribution to the total excitation signal supplied to the delay loop can be controlled through the relative gain of each excitation signal. The gains may be set to specific values and held for the duration of a note, or they may be changed over time in order to vary the character of the sounding tone further, in addition to the changes produced by the filtered delay loop itself.
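The time-varying linear combination of FIG. 16 can be written compactly as a(n) = Σ g_i(n) e_i(n). The sketch below assumes, purely for illustration, that each gain is supplied as a Python function of the sample index; the two tables and the linear crossfade are invented values.

```python
def mix_excitations(excitations, gains):
    """a(n) = sum over i of g_i(n) * e_i(n).

    excitations: list of fixed excitation tables (lists of samples)
    gains:       list of callables, gains[i](n) giving g_i at sample n
    """
    length = max(len(e) for e in excitations)
    a = [0.0] * length
    for e, g in zip(excitations, gains):
        for n, v in enumerate(e):
            a[n] += g(n) * v
    return a


e1 = [1.0, 1.0, 1.0, 1.0]
e2 = [0.0, 2.0, 0.0, 2.0]
# Linear crossfade from e1 to e2 over four samples.
a = mix_excitations([e1, e2], [lambda n: 1.0 - n / 4.0, lambda n: n / 4.0])
```

Holding each gain constant reproduces the fixed mixture of FIG. 15, while slowly varying gains change the excitation over the course of a note.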
[0037]
For free vibration, such as a plucked tone, the gains g_i(n) are typically fixed so that only one linear combination of the excitation signals is used. For driven vibration, such as a bowed string, the gains g_i(n) can be changed over time in order to vary the character of the tone. This can be achieved by supplying a smoothly varying envelope for each excitation signal and thereby controlling the relative contributions of the different excitation signals. The time variation obtained by varying the excitation signal is added to the time variation produced in the filtered delay loop.
[0038]
The characteristics of the various excitation tables can be chosen to maximize the number of useful variations obtainable from a fixed set of tables. For example, a set of excitation tables may include a number of waveform tables stored in a ROM in addition to a filtered noise generator. The waveform tables may provide various collective excitation signals reflecting different body filters. Alternatively, the tables may be based on an analysis of the overall desired excitation signal into main elements (e.g., frequency components), with the main elements provided separately in different waveform tables and variably combined. This is similar to the well-known Fourier synthesis used for standard tone generation (but not for generating excitation signals for delay-loop tone synthesis).
[0039]
The musical tone synthesis system shown in FIG. 16 is useful for simulating bowed-string sounds. In general, accurate simulation of such sounds requires a delay loop with a non-linear junction that takes in the excitation signal and the signal circulating in the loop and feeds the signal back according to a non-linear function. The tone synthesis system of FIG. 16 does not require such a non-linear junction, yet it can achieve a high-quality simulation using only a filtered delay loop and a time-varying excitation signal. In this regard, each excitation signal is itself time-varying but has a relatively short, fixed duration. To generate a sustained tone, such as a simulated bowed string, each excitation signal is repeated many times, and the variation over time of the relative strength of each excitation signal produces the desired change in the tone.
[0040]
FIG. 17 shows a modification that provides significant operational advantages. In general, the initial attack portion of a musical tone contains meaningful high frequency information. In order to properly synthesize the attack part in a normal delay loop with a filter, the sampling rate of the loop filter must be maintained at a relatively high rate. This is not the case for other parts of the synthesized musical sound that have less high frequency components.
[0041]
As shown in FIG. 17, the present invention reduces the computational requirements by providing a separate attack signal as one of the excitation signals and bypassing it around a delay loop with a filter. The attack signal includes a high-frequency signal having a short duration (for example, 100 ms) that is read in parallel with another excitation signal in response to the trigger signal. In FIG. 17, the attack signal is supplied at a point 82, gain-controlled by an amplifier 84, and given to an output summing junction 86. The additional excitation table 88 is appropriately weighted at 90 and summed at 92 to provide a composite excitation signal a (n). This combined excitation signal a (n) is input to a delay loop with a filter including a delay line 94, a loop filter 96 and an adder 98.
[0042]
Since the loop no longer needs to process high-frequency components, the sampling rate of the loop filter 96 may be very low. For example, when low-pitched tones such as the low E of a guitar are to be generated at low cost, the excitation signal input to the string loop may be band-limited to 1.5 kHz, and the first 100 msec of a recorded sound that has been high-pass filtered at 1.5 kHz may be used as the attack signal. A sampling rate of 3 kHz may then be used for the delay loop. The output signal of the loop may be upsampled to 22 kHz by the interpolation circuit 100 and added to the attack signal, which is likewise supplied at a sampling rate of 22 kHz. The resulting combined signal z(n) contains both the desired high-frequency and low-frequency components, yet the processing in the delay loop is greatly simplified. The sampling rate of the string loop may be controlled as a function of pitch.
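The rate matching and final summation can be sketched as follows. This is only an illustration: linear interpolation stands in for the interpolation circuit 100 (a practical design would more likely use a proper band-limited interpolation filter), the function names are hypothetical, and the placeholder signals are not real loop or attack data.

```python
def upsample_linear(x, fs_in, fs_out):
    """Resample x from fs_in to fs_out by linear interpolation (crude stand-in)."""
    n_out = int(len(x) * fs_out / fs_in)
    out = []
    for m in range(n_out):
        pos = m * fs_in / fs_out          # position in input samples
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]  # hold last sample at the end
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out


def combine(loop_out_low, attack_high, fs_loop=3000, fs_out=22000, attack_gain=1.0):
    """z(n): upsampled low-rate loop output plus the high-rate attack signal."""
    z = upsample_linear(loop_out_low, fs_loop, fs_out)
    for n, v in enumerate(attack_high):
        if n < len(z):
            z[n] += attack_gain * v       # attack bypasses the delay loop
    return z


demo = upsample_linear([0.0, 1.0, 2.0], 1, 2)
z = combine([0.0] * 30, [1.0] * 5)        # placeholder loop output and attack
```

Only the short attack table needs to be stored and played at the high rate; everything inside the loop runs at the low rate.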
[0043]
The synthesis technique of the present invention can be applied to synthesizing the tones of a vibraphone, which has a small number of exponentially damped resonance modes, and of other percussion instruments such as tamtam, marimba, and koto. In these cases, the outputs of a plurality of filtered delay loops are added together, so that the most important resonance modes can be imitated as the sum of series of nearly harmonic vibration modes. This technique can also be applied to wind instruments. The excitation table in this case supplies the impulse response from the inside of the tube of the wind instrument to the outside of the tone holes and the bell. Natural articulation is difficult to obtain because there is no non-linear junction (typically used in physical simulation of wind instruments) to provide an interaction between the tone waveform and the excitation signal. However, this technique can be implemented easily and at low cost.
[0044]
FIGS. 18 to 30 show another embodiment of the present invention. In this embodiment, dividing the resonating part into “damped” modes and “ringy” modes allows the size of the excitation table, and hence the cost, to be reduced. In this case, the least damped resonances are extracted, and only the remaining, more heavily damped part of the resonator is exchanged in order with the string.
[0045]
As described above with respect to FIGS. 7 and 8, in the simplest example, the collective excitation table 52 is essentially the sampled impulse response of the selected resonator (eg, guitar body). In a more complex example, the collective excitation signal is given by performing the convolution shown in equation (1) above where the impulse response of the resonating part is convolved with the excitation signal e (n).
[0046]
The length of the convolution result, and hence of the collective excitation signal stored in the table, is governed by the length of the impulse response r(n) of the resonator, which in turn is determined by its least damped resonances. The inventor has discovered that if the least damped resonances, that is, the long-ringing (“ringy”) modes, are separated from the more heavily damped resonances, the resonating part whose order is exchanged with the string contains only the more heavily damped portion. This resonating part has a shorter impulse response.
[0047]
The long-ringing portions that are not reordered can be simulated by a small number of two-pole filter sections or other recursive filter structures. It should be noted that the present invention is by no means limited to implementation with digital filters; any suitable digital or analog filter may be used. Because most synthesizers today already provide a number of “extra” filters for applying post-processing effects to the generated tone signal, using these existing filters as the “ringy” portion of the resonating part allows the excitation table required to store the collective excitation signal to be greatly simplified.
[0048]
FIG. 18A is a block diagram showing the tone generation mechanism of, for example, a guitar. In this example, when a trigger signal is supplied to the excitation source 30, the excitation source 30 generates an excitation signal e(n) for exciting the string portion 32. The string portion 32 generates an output signal s(n) that excites the resonating part 34, which in turn generates the final output signal x(n). The characteristics of the resonating part 34 are the same as those described above with reference to FIGS.
[0049]
Instead of immediately exchanging the order of the resonating part 34 and the string 32 as in the embodiments described so far, the resonating part 34 is first divided, as shown in FIG., into a “damped” resonating section 102 and a “ringy” resonating section 104. Typically, the resonating part 34 is first studied in the form of a measured impulse response. This measured impulse response may be obtained, for example, by using a force hammer, two-channel A/D conversion, and system identification software available for the MATLAB™ programming environment, which is known to those skilled in the art.
[0050]
One of the two A / D conversion channels records a force hammer output proportional to the striking force of the force hammer. The other A / D conversion channel records, for example, a microphone output that measures a response of the resonance unit to an operation by the hammer. The system identification software basically estimates the measured impulse response by deconvolving the force hammer “input” signal from the measured microphone “output” signal.
[0051]
A simple technique for realizing the deconvolution is to obtain the measured frequency response of the resonator by dividing the Fourier transform of the “output” by the Fourier transform of the “input”. As an alternative, a more sophisticated deconvolution may be performed using a commercially available software package. Whichever technique is used, the impulse response is then obtained as the inverse Fourier transform of the frequency response.
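The spectral division described above can be sketched as follows. This is a minimal illustration only, not the measurement software referred to in the text: the function names (dft, deconvolve, circular_convolve), the naive O(N^2) transform, the guard value eps, and the synthetic "hammer" and response data are all assumptions made for the example.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def deconvolve(output, force, eps=1e-12):
    """Estimate an impulse response as IDFT(DFT(output) / DFT(force))."""
    Y, X = dft(output), dft(force)
    # eps (an assumed guard) avoids dividing by a near-zero input spectrum bin
    H = [y / x if abs(x) > eps else 0j for y, x in zip(Y, X)]
    return [v.real for v in dft(H, inverse=True)]

def circular_convolve(x, h):
    """Circular convolution, used here only to fabricate test data."""
    n = len(x)
    return [sum(x[m] * h[(k - m) % n] for m in range(n)) for k in range(n)]

# Synthetic check: a known decaying response excited by a short hammer pulse.
true_h = [0.9 ** k for k in range(16)]
hammer = [1.0, 0.5, 0.25] + [0.0] * 13
mic = circular_convolve(hammer, true_h)
est_h = deconvolve(mic, hammer)
```

With real measurements the records would be much longer and noisy, so some regularization of the division would normally be needed; that is outside the scope of this sketch.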
[0052]
After the impulse response of the resonator 34 has been determined, the longest-ringing modes of the impulse response are converted into “parametric form”. That is, the exact resonant frequency and resonant bandwidth corresponding to each of the narrowest “peaks” in the frequency response of the resonator are identified and assigned to the “ringy portion” 104. The longest-ringing mode corresponds to the narrowest bandwidth. This longest-ringing mode typically also forms the highest peak in the frequency response.
[0053]
Thus, an effective technique for measuring the resonance with the longest ring time is to determine the precise position and bandwidth of the narrowest and highest spectral peak in the measured frequency response of the resonator 34. The center frequency and bandwidth of a narrow frequency response peak determine two poles in the ringy portion 104 of the resonator. Representing a filter in terms of its poles and zeros is a kind of “parametric” filter description, as opposed to “nonparametric” representations such as the impulse response or the frequency response.
[0054]
As known to those skilled in the art, commercially available system identification software products include software for converting measured frequency response peaks to parametric form. Such products are also available with force hammers and complete data acquisition means. Furthermore, those skilled in the art are familiar with the signal processing literature on this subject. As an example, “Prony's method” is a classic technique for estimating the frequencies and bandwidths of a sum of exponentially decaying sinusoids (the impulse response of two-pole resonators). A more advanced, more recent technique is called the “matrix pencil method”.
[0055]
FIG. 19 illustrates a method for converting the ringy portion 104 of the resonator 34 into parametric form, carried out with a small MATLAB program. To simplify the illustration, only one frequency response peak is shown in this example. First, the center frequency of the peak is measured using a quadratically interpolating peak finder that operates on the spectral magnitude in decibels. The general-purpose filter design function “invfreqz()” is then called to design a two-pole filter whose frequency response matches the measured data as closely as possible.
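A quadratically interpolating peak finder of the kind mentioned above can be sketched as follows. This is an assumed, minimal form (a parabola fitted through the three decibel samples around a local maximum), not the program used to produce FIG. 19; the names parabolic_peak and peak_frequency_hz, and the synthetic spectrum, are hypothetical.

```python
def parabolic_peak(db, k):
    """Refine a local maximum at bin k by fitting a parabola to bins k-1, k, k+1.
    Returns (fractional bin position, interpolated peak level in dB)."""
    a, b, c = db[k - 1], db[k], db[k + 1]
    p = 0.5 * (a - c) / (a - 2.0 * b + c)   # offset in bins, within [-0.5, 0.5]
    return k + p, b - 0.25 * (a - c) * p

def peak_frequency_hz(db, fs, nfft):
    """Locate the largest interior bin and convert the refined position to Hz."""
    k = max(range(1, len(db) - 1), key=lambda i: db[i])
    pos, level = parabolic_peak(db, k)
    return pos * fs / nfft, level

# Synthetic dB spectrum: an exact parabola peaking between bins 10 and 11.
db = [-2.0 * (i - 10.3) ** 2 for i in range(21)]
freq, level = peak_frequency_hz(db, fs=8000.0, nfft=80)
```

On a real log-magnitude spectrum the peak is only approximately parabolic, so the refined frequency is an estimate whose accuracy depends on the analysis window and zero-padding.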
[0056]
To determine the parametric filter coefficients (as in the example of FIG. 19), the known “equation-error method” for designing digital filters can be used. To make the filter design program concentrate on the spectral peak, a weighting function is used, which is also shown in FIG. 19 (renormalized so that it can be overlaid on the plot). The weighting function used in this example is “1” in the range of 0 Hz to 900 Hz, “100” in the range of 900 Hz to 1100 Hz, and then returns to “1”. The weighting function of FIG. 19 therefore appears as a rectangular function centered on the 1000 Hz spectral peak.
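The weighted equation-error idea can be sketched for a two-pole filter as follows. This is not the invfreqz() routine itself; it is an assumed, simplified least-squares formulation that minimizes the weighted error A(e^jw) H(w) - b0 with A(z) = 1 + a1 z^-1 + a2 z^-2. The function names and the sign convention of a1 and a2 are assumptions for illustration.

```python
import cmath
import math

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x

def equation_error_two_pole(freqs_hz, H, weights, fs):
    """Weighted least squares on the equation error; returns (b0, a1, a2)."""
    m = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for f, h, w in zip(freqs_hz, H, weights):
        z1 = cmath.exp(-2j * math.pi * f / fs)
        # Linear model with real unknowns:  b0 - a1*h*z1 - a2*h*z1^2 = h
        row = [1.0 + 0j, -h * z1, -h * z1 * z1]
        for i in range(3):
            for j in range(3):
                m[i][j] += w * (row[i].conjugate() * row[j]).real
            v[i] += w * (row[i].conjugate() * h).real
    return solve3(m, v)

def peak_weights(freqs_hz, lo=900.0, hi=1100.0, emphasis=100.0):
    """Rectangular weighting as in the text: 1 outside the band, 100 inside."""
    return [emphasis if lo <= f <= hi else 1.0 for f in freqs_hz]
```

Because the error is linear in the coefficients, the fit reduces to a small set of normal equations; the weighting simply makes the samples between 900 Hz and 1100 Hz count one hundred times more than the rest.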
[0057]
FIG. 19 also shows an overlay of the magnitude frequency response of the two-pole filter designed by the above-mentioned equation-error method. As shown, the fit between the nonparametric frequency response and the parametric frequency response is very good near the peak. The interpolated peak frequency measured initially can be used to fine-tune the pole angle of the designed filter, so that in this case the equation-error method serves only to measure the peak bandwidth. As known to those skilled in the art, the signal processing field offers a number of techniques for measuring spectral peaks, and the invention is not limited to the technique illustrated.
[0058]
Another method for converting the ringy portion 104 of the resonator 34 to parametric form is to use the well-known linear predictive coding (LPC) technique followed by polynomial factorization to determine the poles of the resonator. LPC is particularly good at modeling spectral peaks. The poles closest to the unit circle in the z plane can be selected for the ringy portion 104 of the resonator 34.
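For a single resonance, the LPC approach reduces to an order-2 predictor whose polynomial can be factored with the quadratic formula. The following is a minimal sketch under that assumption (autocorrelation method, one mode only); the function names are hypothetical, and a practical implementation would use a higher order and select the poles nearest the unit circle.

```python
import cmath
import math

def resonator_impulse_response(a1, a2, n):
    """Impulse response of 1 / (1 + a1 z^-1 + a2 z^-2), used as test data."""
    y1 = y2 = 0.0
    out = []
    for i in range(n):
        y = (1.0 if i == 0 else 0.0) - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def autocorr(x, lag):
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def lpc2(x):
    """Order-2 linear prediction (autocorrelation method).
    Solves [r0 r1; r1 r0] [a1 a2]^T = -[r1 r2]^T for A(z) = 1 + a1 z^-1 + a2 z^-2."""
    r0, r1, r2 = autocorr(x, 0), autocorr(x, 1), autocorr(x, 2)
    det = r0 * r0 - r1 * r1
    a1 = (-r1 * r0 + r1 * r2) / det
    a2 = (-r0 * r2 + r1 * r1) / det
    return a1, a2

def pole_to_hz(a1, a2, fs):
    """Factor z^2 + a1 z + a2 and convert the upper pole to (frequency, bandwidth) in Hz."""
    z = (-a1 + cmath.sqrt(a1 * a1 - 4.0 * a2)) / 2.0
    fr = cmath.phase(z) * fs / (2.0 * math.pi)
    br = -math.log(abs(z)) * fs / math.pi
    return fr, br
```

The bandwidth conversion uses the pole-radius relation of equation (4), so a pole radius close to 1 corresponds to a narrow bandwidth and hence a long-ringing mode.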
[0059]
When LPC or any other “minimum-phase” parametric form is used to implement the ringy portion 104, the corresponding “damped” portion 102 can be calculated from the complete impulse response of the resonator 34 and the parametric (ringy) portion 104 by a process called “inverse filtering”, which is well known, particularly in connection with linear predictive coding and system identification.
[0060]
The inverse filter is constructed by providing an all-zero filter whose zeros are equal to the poles of the ringy portion 104. If the ringy portion 104 has zeros, these zeros must be stable, because they become the poles of the inverse filter. In the case of a digital filter, the zeros must have a magnitude of less than 1 in the z plane. In the case of an analog filter, the zeros must lie in the left half of the s plane. Such a filter is called a “minimum-phase” filter. To reduce the likelihood of obtaining a non-minimum-phase result in the estimated parametric form of the ringy portion 104, it is useful to first convert the measured impulse response of the resonator to a minimum-phase impulse response using a nonparametric method such as the known cepstral “folding” technique.
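One common form of the cepstral folding technique can be sketched as follows. This is an assumed textbook formulation (real cepstrum folded onto positive quefrencies), not necessarily the exact procedure used for the figures; the function names and the transform size are hypothetical, and the sketch requires that the spectrum have no exact zeros.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def minimum_phase(h, nfft):
    """Return a minimum-phase impulse response with the same spectral magnitude as h."""
    x = list(h) + [0.0] * (nfft - len(h))
    log_mag = [math.log(abs(v)) for v in dft(x)]        # needs |H| > 0 in every bin
    c = [v.real for v in dft(log_mag, inverse=True)]    # real cepstrum (even symmetric)
    half = nfft // 2
    folded = [0.0] * nfft
    folded[0] = c[0]
    for n in range(1, half):
        folded[n] = 2.0 * c[n]   # fold negative quefrencies onto positive ones
    folded[half] = c[half]
    h_min_spec = [cmath.exp(v) for v in dft(folded)]
    return [v.real for v in dft(h_min_spec, inverse=True)]

# A maximum-phase pair (zero at z = -2) becomes its minimum-phase mirror image.
h_min = minimum_phase([0.5, 1.0], 64)
```

The transform length must be large enough that the cepstrum has decayed before it wraps around; 64 points is ample for this two-sample example but would be far too short for a measured body response.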
[0061]
In both digital and analog filters, if any zeros are non-minimum-phase, they must be reflected about the appropriate frequency axis so as to obtain a minimum-phase filter having the same frequency response magnitude. The inverse filter is then applied to the complete impulse response of the resonator to obtain a “residual” signal. This residual signal is the impulse response of the “damped portion” 102 and is suitable for commuting with the string and convolving with a string excitation signal such as a “pluck” signal. When the residual signal is supplied to the ringy portion 104, that is, to the parametric resonator (in this case a minimum-phase filter), the original impulse response of the resonator 34 is reproduced with high accuracy. In general, the accuracy in this case is affected only by the numerical round-off error that occurs during the inverse and forward filtering operations.
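The inverse-filtering step and the subsequent reconstruction can be sketched for an all-pole, two-pole ringy portion as follows. The function names, the synthetic "damped" sequence, and the chosen resonance (100 Hz, 5 Hz bandwidth, 8 kHz sampling rate) are assumptions for illustration only.

```python
import math

def two_pole_coeffs(fr, br, fs):
    """Feedback coefficients of a two-pole resonator, per equations (2) to (4)."""
    r = math.exp(-math.pi * br / fs)
    return -2.0 * r * math.cos(2.0 * math.pi * fr / fs), r * r

def resonator(x, a1, a2):
    """All-pole forward filter: y[n] = x[n] - a1*y[n-1] - a2*y[n-2]."""
    y1 = y2 = 0.0
    out = []
    for v in x:
        y = v - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def inverse_filter(h, a1, a2):
    """All-zero inverse filter A(z) = 1 + a1 z^-1 + a2 z^-2 (zeros = resonator poles)."""
    out = []
    for n in range(len(h)):
        h1 = h[n - 1] if n >= 1 else 0.0
        h2 = h[n - 2] if n >= 2 else 0.0
        out.append(h[n] + a1 * h1 + a2 * h2)
    return out

# Synthetic body response: a short damped part ringing through one long mode.
a1, a2 = two_pole_coeffs(fr=100.0, br=5.0, fs=8000.0)
damped = [1.0, -0.6, 0.3, -0.1] + [0.0] * 196
body = resonator(damped, a1, a2)

residual = inverse_filter(body, a1, a2)   # short: only the damped part remains
rebuilt = resonator(residual, a1, a2)     # forward filtering restores the body response
```

In this idealized case the residual is only four samples long, whereas the body response rings for the full 200 samples; this is the reduction in excitation-table length that the decomposition aims at.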
[0062]
The inventor has found all-pole filters to be convenient and easy to work with. All-pole filters are always minimum-phase, and the LPC technique computes them readily. As those skilled in the art will appreciate, there are many filter design techniques that can produce a parametric portion having a prescribed number of poles and zeros, and that can use a weighting function to steer the method toward the longest-ringing components of the impulse response of the resonator 34. The method illustrated in FIG. 19 is an example of a method that can calculate not only the poles but also the zeros of the parametric (ringy) portion. Thus, the parametric portion 104 may have any number of poles and zeros and may be implemented using any known filter implementation technique.
[0063]
Known digital filter implementation techniques include series and parallel connections of second-order filter sections. As is known in the art, the transfer function of any linear time-invariant (LTI) filter can be factored into a series connection of elementary second-order sections. Similarly, as is known in the art, any LTI filter can be split into a parallel sum of second-order sections by a “partial fraction expansion”. Each such second-order section can resonate at only one frequency, or not at all.
[0064]
FIG. 20 shows the implementation known as “Direct Form I” of a general second-order filter section (up to two poles, up to two zeros, and one gain factor). Although several other implementations exist, Direct Form I is often the numerically preferred choice, because the outputs of all multipliers feed a common adder, which typically uses two's complement arithmetic. As a result, overflow can occur at only one point, the output. To obtain the best quality, the feedback signals (everything to the right of the scaling factors b0 to b2 in the figure) can be implemented in double precision. If the excitation table is scaled accordingly, the coefficient b0 in FIG. 20 can be eliminated. In the figure, z^-1 denotes a unit delay element that realizes a delay of one unit (one sampling period).
[0065]
When a second-order filter section resonates, its resonance frequency and resonance bandwidth are determined by the feedback coefficients a1 and a2 according to the following equations.
a1 = -2 R cos (2 Pi Fr / Fs) (2)
a2 = R^2 (3)
Here, Fr is the resonance frequency in cycles per second, i.e., hertz (Hz), Fs is the digital audio sampling rate in Hz, Pi is 3.141..., and R is the pole radius, which is related to the resonance bandwidth Br (in Hz) by
R = exp (-Pi Br / Fs) (4)
[0066]
The decay time constant Tr of a second-order resonator is related to its bandwidth by
Tr = 1 / (Pi Br) (5)
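Equations (2) to (5) can be evaluated directly, as in the following minimal sketch. The function names are hypothetical, and the sign convention assumed here is a denominator of the form 1 + a1 z^-1 + a2 z^-2, which places the poles at radius R and angle 2 Pi Fr / Fs.

```python
import math

def two_pole_coeffs(fr, br, fs):
    """Return (a1, a2, R) for resonance frequency fr and bandwidth br, both in Hz."""
    r = math.exp(-math.pi * br / fs)                   # equation (4)
    a1 = -2.0 * r * math.cos(2.0 * math.pi * fr / fs)  # equation (2)
    a2 = r * r                                         # equation (3)
    return a1, a2, r

def decay_time_constant(br):
    """Time constant Tr in seconds for bandwidth br in Hz, equation (5)."""
    return 1.0 / (math.pi * br)

# Example: a 100 Hz body mode with a 10 Hz bandwidth at a 44.1 kHz sampling rate.
fs = 44100.0
a1, a2, r = two_pole_coeffs(fr=100.0, br=10.0, fs=fs)
tr = decay_time_constant(10.0)     # about 0.0318 s
# After Tr seconds (Tr * fs samples) the envelope R^n has fallen to 1/e.
envelope_at_tr = r ** (tr * fs)
```

The last line illustrates why the narrowest bandwidth gives the longest ring: halving Br doubles Tr.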
[0067]
The time constant is defined as the time (in seconds) taken for the impulse response of the resonator to decay by a factor of 1/e = exp(-1). In this embodiment, it is necessary to identify the “most ringing” second-order sections, i.e., the sections with the longest decay time Tr (or the smallest bandwidth Br). These sections can be implemented explicitly as second-order sections in the parametric portion 104 of the body resonator 34.
[0068]
FIG. 21 shows a second-order resonator that has only two poles and is therefore less general than the section described above. This configuration can be used when the parametric portion 104 is chosen to be of the all-pole type and the second-order sections are connected in series.
FIG. 22 shows a series connection of two two-pole sections. This is perhaps the most convenient choice, but the inventor has found that its numerical performance is not as good as that of the example of FIG. 23, in which the two second-order sections are connected in parallel. As will be appreciated by those skilled in the art, a suitable partial fraction expansion of the transfer function yields parallel second-order sections, each having at most one zero and up to two poles.
[0069]
In general, when a digital filter produces disjoint resonances (i.e., resonances that do not overlap in the frequency domain), it is numerically preferable to use parallel second-order sections rather than series second-order sections. This can be understood by considering that, when series sections are used, obtaining a resonant peak at a given frequency also requires compensating for the attenuation of the signal by all the other, non-resonating filter sections. When parallel second-order sections are used, on the other hand, each resonating section operates essentially independently. Thus, in general, the parallel second-order sections shown in FIG. 23 are numerically superior to the series second-order sections shown in FIG. 22. However, the parallel sections are less convenient to compute than the series sections, and a zero is required in each section in order to properly align the phase of each output.
[0070]
The effects processors provided in most commercially available musical tone synthesizers usually include “parametric equalizer sections”. Typically, each parametric equalizer section is a second-order resonator as shown in FIG. 20 in which b0, b1, and b2 are constrained so as to provide a single gain control. The parameters of each equalizer section are usually its center frequency, bandwidth, and gain. For this reason, the parametric equalizer sections that are normally used to adjust the mix of various frequency bands in the synthesized tone can also be used to realize the desired ringy modes of the body resonator.
[0071]
Once the resonator has been decomposed and the parametric portion, that is, the ringy-mode portion 104, has been realized, the most damped resonator portion 102 of the resonator 34 is commuted with the string 32 as in the embodiments above, as shown in FIG. 18C. Next, to generate the collective excitation signal 106 shown in FIG. 18D, the damped resonator portion 102 is convolved with the excitation signal of the excitation source 30 using, for example, equation (1) above.
[0072]
Thus, in this embodiment, the trigger signal is applied to the collective excitation 106, and the collective excitation 106 excites the string 32 with the excitation signal a(n). The string 32 processes this input signal and generates an output signal r(n) that does not contain the long-ringing components of the resonator 34. These long-ringing components are instead supplied by the ringy-mode resonator 104 on the output side of the string 32, which may consist of resonant filters, for example a number of two-pole filter sections connected in series or in parallel, depending on the nature of the signal being synthesized. As a result, the signal output from the resonator 104 is the musical tone signal.
[0073]
An additional advantage of having a separate parametric (ringy-mode) resonator 104 is that multiple output signals become available, whereas only one output signal is readily available from the undecomposed resonator 34. The multiple output signals can be used to improve the quality of the synthesized sound in various ways. As an example, the outputs of second-order resonators connected in parallel can be “panned” to various positions in the stereo field. This panning can be chosen to mimic the spatial distribution of the resonance modes of the simulated instrument. By slightly varying the stereo placement, a spatially moving instrument can be simulated.
[0074]
To realize different stereo placements for the individual resonators of the parametric (ringy-mode) resonator 104, the single adder that sums the outputs of the two resonators shown in FIG. 23 is replaced by two adders, one for the “left channel” and the other for the “right channel”. In addition, two scaling means such as multipliers are used for each resonator: one scales the output signal before it is summed into the left channel, and the other scales the output signal before it is summed into the right channel. By adjusting the respective scaling factors, the proportion of the output signal sent to the left and right channels, and hence the stereo placement of the signal, is determined.
[0075]
When two scaling means are used as described above, one of the “b” coefficients of each resonator (e.g., b0a and b0b in FIG. 23) can be omitted. For this reason, only one additional multiplier per resonator is required for stereo placement. Also, since the angle of the stereo placement is often not critical, the two scaling factors may be specially quantized numbers that require no multiplication, such as numbers restricted to the values 0, 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, 7/8, and 1. For example, multiplication by these numbers can be performed (in binary fixed point) using one or two shifts and zero or one fixed-point additions or subtractions.
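The multiplier-free scaling by eighths can be sketched as follows for integer (fixed-point) samples. The function names are hypothetical, and the complementary left/right rule used in pan() is an assumption added for illustration; the text itself only requires that each channel gain be one of the quantized values.

```python
def scale_eighths(x, k):
    """Multiply integer sample x by k/8 (k = 0..8) using at most two shifts
    and at most one addition or subtraction."""
    if k == 0:
        return 0
    if k == 1:
        return x >> 3
    if k == 2:
        return x >> 2
    if k == 3:
        return (x >> 2) + (x >> 3)
    if k == 4:
        return x >> 1
    if k == 5:
        return (x >> 1) + (x >> 3)
    if k == 6:
        return x - (x >> 2)
    if k == 7:
        return x - (x >> 3)
    if k == 8:
        return x
    raise ValueError("k must be in 0..8")

def pan(x, k):
    """Assumed complementary pan law: left gets k/8 of x, right gets (8 - k)/8."""
    return scale_eighths(x, k), scale_eighths(x, 8 - k)

left, right = pan(800, 2)   # resonator output placed toward the right channel
```

For samples that are not multiples of 8 the shifts truncate, so the result may differ from the exact product by a least-significant bit or two, which is usually acceptable for pan gains.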
[0076]
The reduction in excitation table size that can be achieved with the technique of this embodiment can be illustrated by examining the impulse response of an idealized guitar body before and after the least-damped mode is removed.
FIG. 24 illustrates the initial impulse response of a simulated guitar body that resonates at 100 Hz. As is known, the main resonance of a guitar body generally provides the longest-ringing resonance, and therefore forms the least-damped ringing component of the body impulse response, as shown in FIG. 25. As can be seen in FIG. 26, removing this single, second-order, least-damped 100 Hz resonance component shortens the excitation table by an order of magnitude. In this example, the removed component can be produced by a single second-order resonant filter 104 as shown in FIG. 21, having a resonant frequency of, for example, 100 Hz.
[0077]
FIGS. 27 to 29 show a similar example using data measured from an actual classical guitar. FIG. 27 shows the estimated impulse response of the resonator 34 after conversion to minimum phase using the known cepstral “folding” technique. In this case, there are two long-ringing low-frequency resonances, one at 110 Hz and the other at 220 Hz. Since there are two ringing resonances, each producing a spectral peak in the frequency response, the parametric resonator 104 needs at least four poles. FIG. 28 shows the impulse response of the four-pole parametric resonator 104 calculated using the above-mentioned equation-error method. FIG. 29 shows the residual impulse response 102 obtained after inverse filtering. The small noise bursts that appear at intervals of about 12 msec are related to the pitch of the guitar string that was excited to make the measurements shown and are not relevant to this example.
[0078]
The reduction in excitation table size made possible by this embodiment considerably lowers the cost of the musical tone synthesizer as a whole. In addition, this embodiment uses relatively simple resonant filters to implement the “ringy”-mode resonator, and such filters are already present in most synthesizers currently in production, so no additional cost is incurred.
[0079]
The cost savings realized by factoring the longest-ringing resonance modes of the impulse response of the resonator 34 into the parametric portion 104 depend on, among other things, (1) the duration of the impulse response, (2) the duration of the impulse response that remains after the longest-ringing resonance modes are removed, (3) the cost of memory, and (4) the cost of implementing the second-order filter sections. According to current hardware trends, ever faster processors are being provided in ever more compact configurations in which only a small amount of memory is available locally. Reducing memory usage at the expense of more processor utilization is therefore a welcome trade-off.
[0080]
In this manner, this embodiment achieves the same overall effect as the first embodiment, namely the elimination of the expensive and complicated body filter conventionally required. In this embodiment, however, the resonator is decomposed into components according to their damping, so that a resonator portion that is easily imitated with resonant filters is retained, while the considerably more complicated damped portion is convolved with the excitation to produce a collective excitation signal that supplies the characteristics of the downstream resonator.
[0081]
Furthermore, this technique simplifies the musical tone synthesis process: it eliminates most of the large resonator and reduces the size of the excitation table, while exploiting capabilities already present in the synthesizer by realizing the ringy-mode resonator with existing resonant filters. This technique can be used with other embodiments including the use of multiple excitation tables as shown in FIGS.
[0082]
The embodiment described above with reference to FIGS. 18 to 29 can be used with the delay loops of the earlier embodiments shown in FIGS. 3, 12, etc. FIG. 30, however, shows an improved filtered delay loop capable of simulating a self-flanging string or a virtually detuned string. As shown in FIG. 30, the input signal (for example, a(n)) is supplied to the adder 108 and the one-period delay element 110.
[0083]
The delay element 110 generates two outputs. The first output is taken from a moving interpolated tap 112, i.e., from a position that varies continuously along the delay line of the delay element 110, and is therefore delayed by an amount proportional to the tap position. This output can be scaled by a scaling factor g, which may be time-varying. The output of the delay element 110 is also supplied to the adder 114, and the adder 114 in turn feeds back to the low-pass filter 116 and then to the adder 108. The output of the moving interpolated tap 112 is supplied to an adder 118 provided downstream of the adder 114 and outside the delay loop.
[0084]
With the above configuration, the string synthesizer realizes several features in a cost-effective manner. First, a flanging string can be realized using an interpolated tap that moves slowly back and forth. Ideally, a large number of independently moving taps provides the best flanging effect (e.g., as shown by the dashed line in FIG. 30). Each tap adds a moving comb filter to the output.
A single non-moving tap can provide the fixed comb filtering needed to simulate the position of a pluck, strike, or other excitation along the string. In this case, the exact position of the physical excitation is not clearly audible, so a non-moving tap does not require interpolation.
[0085]
In addition to simulating a flanging string, a detuned “second string” can be simulated by using a faster tap that moves in one direction. In this case, the speed of the tap corresponds to a Doppler shift. The faster the moving tap moves to the right in FIG. 30, the more all frequencies in the input signal are Doppler-shifted downward. Conversely, the faster the moving tap moves to the left in FIG. 30, the more the frequencies are Doppler-shifted upward. In this embodiment, when the moving tap reaches the end of the delay line, it must “wrap around” to the other end in some way.
[0086]
In the simplest case, a simple wraparound can be used. A better sound is obtained by crossfading, so that the output of one tap that has reached the inlet end of the delay line fades in while the output of the other tap that has reached the outlet end of the delay line fades to zero. In this case, therefore, two moving non-interpolating taps are active during the crossfade. A more elaborate wraparound searches for a suitable jump point along the delay line. For example, where possible, the wraparound may be performed at a zero crossing, or the correlation at various delay offsets may be computed. All of these techniques are known to some extent in the context of “harmonizer” and “pitch shift” algorithms. Further detuned strings can be simulated by adding taps with different tap speeds. Creating a large number of virtual strings with slightly different, time-varying tunings in this way produces a pleasant “chorus effect”.
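A filtered delay loop with one moving interpolated tap and simple wraparound can be sketched as follows. This is a deliberately simplified illustration and does not reproduce the exact adder topology of FIG. 30: the two-point averaging loop filter, the linear interpolation, the parameter names (g, tap_delay, tap_speed), and the convention that positive tap_speed means increasing delay are all assumptions.

```python
import math

def string_with_tap(excitation, period, n_samples, g=0.25, tap_delay=0.0, tap_speed=0.0):
    """One filtered delay loop plus a moving interpolated tap summed with the loop output."""
    line = [0.0] * period      # circular buffer holding the last `period` loop samples
    w = 0                      # index of the oldest sample (next write position)
    prev = 0.0                 # previous delay-line output, for the two-point low-pass
    out = []
    for n in range(n_samples):
        x = excitation[n] if n < len(excitation) else 0.0
        delayed = line[w]                      # full one-period delay output
        v = x + 0.5 * (delayed + prev)         # input adder plus low-pass-filtered feedback
        prev = delayed
        line[w] = v
        w = (w + 1) % period
        # Interpolated tap: read the sample written `tap_delay` samples before the newest.
        i = int(math.floor(tap_delay))
        frac = tap_delay - i
        s0 = line[(w - 1 - i) % period]
        s1 = line[(w - 2 - i) % period]
        tap = (1.0 - frac) * s0 + frac * s1
        out.append(delayed + g * tap)          # tap is summed outside the loop
        # Move the tap; simple wraparound when it runs off the end of the line.
        tap_delay = (tap_delay + tap_speed) % (period - 1)
    return out

# A slowly moving tap on a 100-sample string excited by a unit impulse.
tone = string_with_tap([1.0], period=100, n_samples=2000, g=0.25, tap_speed=0.0025)
```

The simple wraparound used here produces a discontinuity each time the tap jumps; the crossfading and jump-point search described above are the refinements that would remove it.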
[0087]
Flanging and Doppler shifting can be used to mimic the coupling effects of vibrating strings. Such coupling produces slow beats in the amplitude envelopes of the partials of the ringing tone. A moving comb filter (with notches that are not very deep) can produce a qualitatively similar effect by flanging. As an alternative, an equivalent effect is achieved by adding the string output to that of a virtually detuned string. As a specific example, if the scaling factor g in FIG. 30 is set to 0.25 and the tap speed is set to produce a Doppler frequency shift of 0.25%, the sum of the two slightly mistuned strings produces beats similar to those found in electric guitars.
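The arithmetic behind the specific example above can be sketched as follows. The function names are hypothetical; the relations used are the standard ones for a tap whose delay grows at a constant rate (frequencies scaled by one minus that rate) and for the sum of two sinusoids of amplitudes 1 and g at slightly different frequencies.

```python
def doppler_ratio(tap_speed):
    """Frequency scaling heard at a tap whose delay grows by tap_speed samples per sample."""
    return 1.0 - tap_speed

def beat_rate_hz(partial_hz, shift_fraction):
    """Beat rate between a partial and its copy shifted by the given fraction."""
    return abs(partial_hz * shift_fraction)

def envelope_range(g):
    """Minimum and maximum of the beating envelope for amplitudes 1 and g."""
    return 1.0 - abs(g), 1.0 + abs(g)

# g = 0.25 and a 0.25% Doppler shift, evaluated for a 440 Hz partial.
ratio = doppler_ratio(0.0025)          # 0.9975
beats = beat_rate_hz(440.0, 0.0025)    # about 1.1 beats per second
lo, hi = envelope_range(0.25)          # envelope swings between 0.75 and 1.25
```

Because the beat rate is proportional to frequency, higher partials beat faster than lower ones, and with g = 0.25 the notches remain shallow.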
[0088]
Simulation of coupled-string sound, which is needed in almost all string instrument synthesis, is cost-effective when the present invention is used. In the past, a plurality of string simulators was required to simulate a comparable coupled-string sound. In the present invention, however, only one string simulator is used, and the coupled-string effect is added by one or more moving taps.
[0089]
As a variation of the string coupling simulation described above, two filtered delay loops, each simulating a string, may be connected as shown in FIG. 31. In this variation, the outputs of the filtered delay loops are summed, and the sum signal is scaled by a negative coefficient, preferably having a magnitude of about 0.01 or less. For a more accurate simulation, as shown in FIG. 31, the negative coefficient is replaced by a filter with transfer function -Hb(z), which can be calculated from measured or theoretically predicted coupling characteristics. The scaled or filtered signal is fed back and added into each of the filtered delay loops via a feedback path, and is preferably introduced into each loop at a position immediately after the point where the output is taken from that loop. The output used for the coupled sound may be taken from any position in the loop.
[0090]
Such true coupling is carried out for N strings as follows. The outputs of the N filtered delay loops (corresponding to the N strings) are summed, the summed signal is scaled by “-epsilon” (or filtered by -Hb(z)), and the scaled or filtered signal is introduced into each of the loops, preferably via a feedback path. The scaled (or filtered) signal has the physical interpretation of the “bridge” output. The scaled signal, or in most cases the sum signal before scaling, is an excellent choice for the collective output of the coupled string assembly.
[0091]
Unlike the case where a negative coefficient is used to implement true coupling, when the filter -Hb(z) is used, the loop filters in all of the coupled filtered delay loops can be removed. That is, the coupling filter -Hb(z) can provide all of the filtering required by all of the coupled filtered delay loops. When no separate loop filters are used, the coupling filter can be regarded as a shared loop filter.
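True coupling of N strings through a scalar bridge coefficient can be sketched as follows. This is a minimal illustration under stated assumptions: each loop filter is reduced to a constant gain (loop_gain), the coupling uses the simple "-epsilon" scaling rather than a filter -Hb(z), and the function and parameter names are hypothetical.

```python
def coupled_strings(periods, excitations, n_samples, eps=0.01, loop_gain=0.995):
    """Simulate N delay loops whose summed outputs are scaled by -eps and fed back
    into every loop. Returns (sum signal before scaling, per-string outputs)."""
    n = len(periods)
    lines = [[0.0] * p for p in periods]   # one circular delay line per string
    pos = [0] * n
    bridge_sum = []
    string_out = [[] for _ in range(n)]
    for t in range(n_samples):
        outs = [lines[i][pos[i]] for i in range(n)]   # delay-line outputs
        total = sum(outs)
        bridge = -eps * total                         # scaled "bridge" signal
        for i in range(n):
            x = excitations[i][t] if t < len(excitations[i]) else 0.0
            # Feedback is introduced right after the point where the output is taken.
            lines[i][pos[i]] = x + loop_gain * outs[i] + bridge
            pos[i] = (pos[i] + 1) % periods[i]
            string_out[i].append(outs[i])
        bridge_sum.append(total)
    return bridge_sum, string_out

# Pluck only the first of two slightly mistuned strings; the second rings in sympathy.
total, strings = coupled_strings([50, 51], [[1.0], []], n_samples=500, eps=0.01)
```

With eps set to zero the strings are independent; with a small positive eps, energy from the excited string leaks through the bridge into the other string, which is the sympathetic behavior the coupling is meant to model.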
[0092]
In addition, these effects (i.e., flanging, detuned strings, chorus, virtual coupling, true coupling, etc.) can be used in any synthesis technique that uses a filtered delay loop. Examples include waveguide synthesis of woodwind instruments, brass instruments, and percussion instruments.
[0093]
In summary, according to the present invention, by providing an excitation signal that corresponds to a triggered (and optionally processed) impulse response, high-quality “plucked”, “struck”, and “bowed” musical tones can be synthesized without the need for an expensive and complex body filter. The characteristics of the resonant system located downstream of the vibrating element, such as a string, are provided by appropriately generating an excitation signal that takes into account the impulse response of that downstream resonant system.
[0094]
The configuration of the tone synthesis technique according to the present invention is greatly simplified compared with a conventional system that requires a complicated body filter. Furthermore, by decomposing the resonator into damped modes and ringy modes, and then commuting the damped modes with the string and convolving them with the excitation signal, the complicated and expensive body filter used conventionally can be eliminated, the excitation table can be shortened, and the cost associated with the excitation table can be reduced. Flanging and chorus effects, as well as virtually detuned strings, can be realized by adding an interpolated tap that moves along the delay line of the filtered delay loop and is summed with the output.
[0095]
[Effect of the invention]
As described above, the present invention has an excellent effect that a high-quality musical tone can be synthesized with a simple configuration and at a low cost.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a conventional tone synthesis system based on a delay loop with a filter using a main body filter.
FIG. 2 is a block diagram showing a musical tone synthesis system in which the arrangement order of a delay loop with a filter and a main body filter is changed.
FIG. 3 is a block diagram showing a tone synthesis system according to an embodiment of the present invention, in which a collective excitation signal corresponding to the impulse response of the main body filter is given.
FIG. 4 is a block diagram showing a musical tone generation mechanism of a guitar.
FIG. 5 is a block diagram showing musical sound generation by the guitar and the surrounding space.
FIG. 6 is a block diagram showing a musical tone generation mechanism of a piano including the surrounding space.
FIG. 7 is a block diagram showing an equivalent musical sound generation mechanism in which a resonance unit is arranged in front of a string.
FIG. 8 is a block diagram showing a musical tone generation mechanism in which a response of the resonance unit to a specific excitation is used as a collective excitation signal.
FIG. 9 is a block diagram showing an inverse filter method for setting an excitation signal.
FIG. 10 is a diagram illustrating an example of an excitation signal corresponding to an impulse response of a main body filter.
FIG. 11 is a diagram showing an example of an excitation signal that is repeatedly supplied to realize sustained tone generation.
FIG. 12 is a diagram showing another example of an excitation signal that is repeatedly supplied in order to realize sustained musical tone generation.
FIG. 13 is a block diagram showing a musical sound generating system capable of simulating a change in pick position.
FIG. 14 is a block diagram illustrating an equivalent system for changing the position of a pick.
FIG. 15 is a block diagram illustrating a system using two excitation tables that are scaled and summed to generate a final excitation signal.
FIG. 16 is a block diagram showing a musical sound synthesis system incorporating a time-varying mixed excitation signal generator.
FIG. 17 is a block diagram showing a musical sound synthesis system incorporating an excitation signal generator for generating an attack component outside the delay loop.
FIG. 18 is a set of block diagrams showing a musical tone generation mechanism of a guitar, in which (A) shows a general example of the musical tone generation mechanism, (C) shows an example in which the damped mode part is provided in front of the string, and (D) shows an example in which an excitation signal is convolved with the damped mode part.
FIG. 19 shows an overlay of non-parametric frequency response, parametric frequency response fit and weighting function.
FIG. 20 is a diagram showing a configuration example for realizing a digital resonance unit.
FIG. 21 is a diagram showing another configuration example for realizing a digital resonance unit.
FIG. 22 is a diagram showing another configuration example for realizing a digital resonance unit.
FIG. 23 is a diagram showing another configuration example for realizing a digital resonance unit.
FIG. 24 shows a simulated impulse response of a guitar body that resonates at 100 Hz.
FIG. 25 is a diagram showing the longest ringing mode (the least damped component) of the impulse response shown in FIG. 24.
FIG. 26 shows the initial impulse response shown in FIG. 24 with the least damped component of FIG. 25 removed.
FIG. 27 is a diagram showing an impulse response of the guitar body calculated from data measured from an actual guitar.
FIG. 28 is a diagram showing parametric estimates of two longest ringing modes in the impulse response shown in FIG. 27;
FIG. 29 shows the impulse response of FIG. 27 with the parametric component of FIG. 28 removed using inverse filtering.
FIG. 30 is a block diagram showing a tone synthesis system in which a delay loop with a filter is provided with a moving interpolated tap to synthesize a self-flanging string or a virtually detuned string.
FIG. 31 is a block diagram showing an embodiment of a string coupling simulation technique.
[Explanation of symbols]
20 Table
26 Delay line
28 Loop filter
32 String
34 Resonant part

Claims

A musical tone synthesis system comprising:
excitation means for generating first and second excitation signals;
closed loop means corresponding to the pitch of a musical tone, including an input unit that receives the first excitation signal, a delay unit that delays the signal within the closed loop, and an output unit that extracts an output from the closed loop, the sampling rate of the closed loop means being lower than the sampling rate of the second excitation signal;
means for matching the sampling rate of the signal extracted from the output unit of the closed loop means to the sampling rate of the second excitation signal; and
synthesis means for synthesizing the second excitation signal and the signal that has been extracted from the output unit and whose sampling rate has been matched to that of the second excitation signal.