JP4012410B2 - Musical sound generation apparatus and musical sound generation method

Info

Publication number: JP4012410B2
Application number: JP2002035406A
Authority: JP
Inventors: 哲夫西元
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2002-02-13
Filing date: 2002-02-13
Publication date: 2007-11-21
Anticipated expiration: 2022-02-13
Also published as: JP2003233378A

Description

【０００１】
【発明の属する技術分野】
本発明は、サンプリングして波形メモリに記憶した楽音波形データを読み出し、この楽音波形データに基づいて楽音を生成する楽音生成装置および楽音生成方法に関する。
【０００２】
【従来の技術】
楽器の音をサンプリングし、楽音波形データとして波形メモリに記憶し、この楽音波形データを読み出して楽音を生成する、いわゆる波形テーブル（ウェーブテーブル）音源は、従来から知られている。
【０００３】
【発明が解決しようとする課題】
上記従来の波形テーブル音源では、波形メモリの記憶容量を低減するために、当該音源システム本来のサンプリング周波数、具体的には、波形メモリに記憶された楽音波形データを読み出して楽音を生成するときの周波数よりも低い周波数で、楽音をサンプリングすることがある。以下、当該音源システム本来のサンプリング周波数を「サンプリング周波数」と言い、楽音をサンプリングして楽音波形データを生成するときのサンプリング周波数を「録音サンプリング周波数」という。
【０００４】
録音サンプリング周波数は、単位時間（１秒）当たりの楽音波形サンプル（楽音波形データを構成する個々の波形サンプル）の個数に相当するので、録音サンプリング周波数を低くすればするほど、各楽音波形データの容量は少なくなり、その結果、波形メモリの記憶容量は低減する。
【０００５】
しかし、サンプリングの定理によれば、録音サンプリング周波数は、再現可能な楽音の上限周波数（＝録音サンプリング周波数の半分）を決定づけるため（たとえば、録音サンプリング周波数を１６ＫＨｚとした場合には、８ＫＨｚまでの周波数の楽音しか再現できない）、換言すると、録音サンプリング周波数を低く抑えた場合には、高次高調波成分の失われた楽音しか再現できないため、この場合に再現される楽音の音質は劣化することになる。
【０００６】
本発明は、この点に着目してなされたものであり、録音サンプリング周波数をサンプリング周波数より低くして楽音波形データを生成した場合でも、その失われた高調波成分を回復することにより、より自然な楽音を生成することが可能な楽音生成装置および楽音生成方法を提供することを目的とする。
【０００７】
【課題を解決するための手段】
上記目的を達成するため、請求項２に記載の楽音生成装置は、録音サンプリング周波数で、楽器の音をサンプリングして採取した楽音波形データを音色データメモリに記憶し、該音色データメモリから楽音波形データを読み出して楽音を生成する楽音生成装置において、前記楽器の音を、前記録音サンプリング周波数より高い周波数でサンプリングして採取し、第１の楽音波形データとして第１のメモリに記憶させる第１の記憶手段と、前記第１のメモリに記憶された第１の楽音波形データを前記録音サンプリング周波数でダウンサンプリングして得られたデータを、第２の楽音波形データとして第２のメモリに記憶させる第２の記憶手段と、前記第１の楽音波形データを前記録音サンプリング周波数でダウンサンプリングすることによって失われる周波数域の周波数スペクトルに基づいたフォルマントデータを前記第１の楽音波形データから生成するフォルマントデータ生成手段と、前記第２の楽音波形データと前記フォルマントデータ生成手段によって生成されたフォルマントデータを対応付けて前記音色データメモリに記憶させる第３の記憶手段と、前記音色データメモリに記憶された第２の楽音波形データから楽音波形サンプルを生成する楽音波形サンプル生成手段と、前記音色データメモリに記憶された、前記第２の楽音波形データに対応付けられたフォルマントデータに基づいてフォルマント音を合成する合成手段と、前記生成された楽音波形サンプルに前記合成されたフォルマント音を加算して出力する加算手段とを有することを特徴とする。
【０００８】
好ましくは、前記フォルマントデータは、当該フォルマントの中心周波数とそのレベルであり、前記合成手段は、ホワイトノイズを発生するホワイトノイズ発生手段と、該ホワイトノイズ発生手段によって発生されたホワイトノイズのスペクトル特性を変更する変更手段と、前記中心周波数の正弦波を発生する正弦波発生手段と、前記変更手段からの出力に、前記正弦波発生手段によって発生された正弦波を乗算する乗算手段と、該乗算手段からの出力のレベルが前記レベルになるように調整する調整手段とからなることを特徴とする。
【０００９】
また、上記目的を達成するため、請求項１に記載の楽音生成方法は、録音サンプリング周波数で、楽器の音をサンプリングして採取した楽音波形データを音色データメモリに記憶し、該音色データメモリから楽音波形データを読み出して楽音を生成する楽音生成方法において、前記楽器の音を、前記録音サンプリング周波数より高い周波数でサンプリングして採取し、第１の楽音波形データとして第１のメモリに記憶し、前記第１のメモリに記憶された第１の楽音波形データを前記録音サンプリング周波数でダウンサンプリングして得られたデータを、第２の楽音波形データとして第２のメモリに記憶し、前記第１の楽音波形データを前記録音サンプリング周波数でダウンサンプリングすることによって失われる周波数域の周波数スペクトルに基づいたフォルマントデータを前記第１の楽音波形データから生成し、前記第２の楽音波形データと前記生成されたフォルマントデータを対応付けて前記音色データメモリに記憶し、楽音を生成する場合には、前記音色データメモリに記憶された第２の楽音波形データから楽音波形サンプルを生成し、前記音色データメモリに記憶された、前記第２の楽音波形データに対応付けられたフォルマントデータに基づいてフォルマント音を合成し、前記生成された楽音波形サンプルに前記合成されたフォルマント音を加算して出力することを特徴とする。
【００１０】
【発明の実施の形態】
以下、本発明の実施の形態を図面に基づいて詳細に説明する。
【００１１】
図１は、本発明の一実施の形態に係る楽音生成装置を適用した携帯電話機の概略構成を示すブロック図である。
【００１２】
同図に示すように、制御部１は、当該携帯電話機全体の制御を司るＣＰＵと、該ＣＰＵが実行する制御プログラムや、各種テーブルデータ等を記憶するＲＯＭと、着信メロディ等を演奏するための演奏データ（たとえば、ＭＩＤＩ（Musical Instrument Digital Interface）データからなる）、各種入力情報および演算結果等を一時的に記憶するＲＡＭとによって構成されている。
【００１３】
制御部１には、テンキーや各種情報を入力するための操作子を備えた操作入力部２と、たとえばカラー液晶ディスプレイ（ＬＣＤ）および発光ダイオード（ＬＥＤ）等を備えた表示部３と、アナログ音声信号をデジタル音声信号に変換した後圧縮し、逆に、圧縮されたデジタル音声信号を伸長した後アナログ音声信号に変換する音声ＣＯＤＥＣ５と、音声ＣＯＤＥＣ５からの出力信号を変調し、アンテナ７を介して中継局（図示せず）に伝送するとともに、アンテナ７を介して中継局から伝送されてきた信号を受信して復調し、音声ＣＯＤＥＣ５に出力する通信部４と、後述する音色データメモリ（波形メモリ）を含み、該音色データメモリから目的の楽音波形データを読み出し、各種処理を施してデジタル楽音信号を生成し、図示しないＤＡＣ（Digital-to-Analog Converter）によりアナログ楽音信号に変換して出力する波形テーブル音源６とが接続されている。
【００１４】
音声ＣＯＤＥＣ５には、該音声ＣＯＤＥＣ５から出力されたアナログ音声信号を音響に変換するためのスピーカ８と、音声をアナログ音声信号に変換するマイク９とが接続されている。
【００１５】
波形テーブル音源６には、該波形テーブル音源６から出力されたアナログ楽音信号を音響に変換するためのスピーカ１０が接続されている。
【００１６】
本発明の特徴は、波形テーブル音源６、特にその楽音信号生成処理にある。具体的には、音色データメモリには、サンプリング周波数より低い録音サンプリング周波数で採取した楽音波形サンプルが記憶されるため、この楽音波形サンプルをそのまま読み出しただけでは、録音サンプリング周波数の半分以上の周波数成分は再現できない、つまり、高次高調波成分の失われた楽音しか再現できない。この問題に対処するために、原音において、録音サンプリング周波数の半分以上の高域に出現する複数個（本実施の形態では、４個）のフォルマントを検出し、該各フォルマントのフォルマント中心周波数およびフォルマントレベルをそれぞれ抽出し、当該楽音波形サンプルに対応付けて音色データメモリに記憶しておく。そして、目的の楽音波形サンプルを読み出して、楽音を生成するときには、この楽音波形サンプルに対応付けられた各フォルマント中心周波数およびフォルマントレベルを読み出し、該各フォルマント中心周波数およびフォルマントレベルの複数フォルマントを備えたフォルマント音を近似的に生成して、目的の楽音波形サンプルに付加するようにしている。
【００１７】
このように、本発明の特徴となる楽音信号生成処理を行うべき前提として、上記複数のフォルマントを検出し、該各フォルマントのフォルマント中心周波数およびフォルマントレベルをそれぞれ抽出するとともに、サンプリング周波数よりも低い録音サンプリング周波数で楽音をサンプリングして、楽音波形サンプルを生成し、この楽音波形サンプルとこれに対応する各フォルマント中心周波数およびフォルマントレベルとからなる楽音波形データを音色データメモリに登録しておかなければならない。なお、上記複数のフォルマントを検出する処理でも、楽音波形サンプルを生成する処理でも、同一の原音をサンプリングしなければならない。ただし、両者では、サンプリングに用いるサンプリング周波数が異なり、前者は、後者の倍のサンプリング周波数でサンプリングする必要がある。このため、本実施の形態では、前者のサンプリング周波数、すなわち録音サンプリング周波数の２倍の周波数で、基となる楽音を１回のみサンプリングし、後者の処理では、それをダウンサンプリングしたものを用いるようにしている。
【００１８】
以下、このサンプリングから音色データメモリへの登録までの処理、すなわち音色データメモリ作成処理を説明する。
【００１９】
図２は、音色データメモリ作成処理の手順を示すフローチャートである。なお、本音色データメモリ作成処理は、たとえばパーソナルコンピュータ上で実行される。
【００２０】
同図において、まず、楽音波形サンプルを採取、つまり原音をサンプリングする（ステップＳ１）。
【００２１】
図３は、ステップＳ１の楽音波形サンプルの採取処理を説明するための図であり、パーソナルコンピュータ１００によってシンバル音の楽音波形サンプルを採取する方法を示している。
【００２２】
同図に示すように、まず、スティックでシンバルを叩くことにより、シンバル音を発生させ、このシンバル音をマイク１０１でアナログ楽音信号に変換する。
【００２３】
次に、録音サンプリング周波数Ｗｆｓ以下の周波数成分のみ通過させるＬＰＦ（ロウパスフィルタ）１０２により、このアナログ楽音信号に含まれる、Ｗｆｓ以上の周波数成分をカットする。このように、Ｗｆｓ以上の周波数成分をカットしたのは、基となる楽音波形サンプルを生成するためのサンプリング周波数（＝２Ｗｆｓ：録音サンプリング周波数の倍）の半分（＝Ｗｆｓ）以上の周波数成分の楽音は、前述のように再現できないので、Ｗｆｓ以上の周波数成分は必要ないからである。
【００２４】
次に、ＬＰＦ１０２からの出力信号を、サンプリング周波数２ＷｆｓのＡ／Ｄ変換器１０３によってデジタル楽音信号に変換し、このデジタル楽音信号をメモリ（たとえばＲＡＭ）１０４に記憶する。
【００２５】
以上の処理を所定時間、たとえば１秒間繰り返すと、２Ｗｆｓ個の楽音波形サンプルがメモリ１０４に記憶される。この楽音波形サンプルの時系列から１サンプルずつ間引くことによってダウンサンプリングした後の時系列サンプルが、前記音色データメモリに記憶すべき楽音波形サンプルである。
【００２６】
なお、音色データメモリには、複数の音色の楽音波形サンプルが登録されるので、シンバル音以外の楽器音についても、以上説明した方法と同様の方法によって、その基となる楽音波形サンプルを採取すればよい。
【００２７】
図２に戻り、ステップＳ１で採取された楽音波形サンプルに対して、加工・編集処理を施す（ステップＳ２）。加工・編集処理としては、たとえば、ゼロレベル（または低レベル）波形サンプルの削除処理、各楽音波形サンプルの振幅調整処理、前記ダウンサンプリング処理および基となる楽音波形サンプルの高域分析処理等を挙げることができる。
【００２８】
図４は、基となる楽音波形サンプル、ダウンサンプリング後の楽音波形サンプルおよび検出後のフォルマントの各周波数スペクトルの様子を示す図であり、（ａ）は、基となる楽音波形サンプルの周波数スペクトルの一例を示し、（ｂ）は、（ａ）の楽音波形サンプルを１／２にダウンサンプリングした後の周波数スペクトルを示し、（ｃ）は、（ａ）の楽音波形サンプルから検出した４つのフォルマントの周波数スペクトルを示している。
【００２９】
前述のように、基となる楽音波形サンプルは、原音を録音サンプリング周波数Ｗｆｓの倍の周波数でサンプリングすることによって生成されているので、その周波数スペクトルは最大、（ａ）に示すように、録音サンプリング周波数Ｗｆｓまで出現する。
【００３０】
この基となる楽音波形サンプルの時系列をダウンサンプリングフィルタによりＷｆｓ／２以上の周波数成分をカットした後に１サンプルずつ間引くことによってダウンサンプリングすると、これは、ちょうど原音を録音サンプリング周波数Ｗｆｓでサンプリングした場合に相当するので、そのダウンサンプリング後の楽音波形サンプルの周波数スペクトルは、（ｂ）に示すように、（ａ）の周波数スペクトルのうち、Ｗｆｓ／２〜Ｗｆｓの周波数成分がカットされたものとなる。このダウンサンプリング後の楽音波形サンプルを、後述するように、音色データメモリに登録する。
【００３１】
前記基となる楽音波形サンプルの時系列から選択された一部（本実施の形態では、１フレーム分）の楽音波形サンプルに対してＦＦＴ（Fast Fourier Transform）を施すと、（ａ）の周波数スペクトルに似た周波数スペクトルが得られる（（ａ）の周波数スペクトルは、上述のように、基となる楽音波形サンプルの時系列すべてから得られた周波数スペクトルとしているため）。この周波数スペクトルのうち、Ｗｆｓ／２〜Ｗｆｓの周波数域を分析することにより、４つのフォルマントを検出する。（ｃ）は、この検出されたフォルマントの周波数スペクトルを示している。そして、検出された、４つのフォルマントの各フォルマントから、その中心周波数とレベルを抽出する。具体的には、（ｃ）では、Ｆ１〜Ｆ４のフォルマント中心周波数とＬ１〜Ｌ４のフォルマントレベルが抽出される。この抽出されたフォルマント中心周波数およびフォルマントレベルが、１フレーム分の楽音波形サンプルに対応付けられて、音色データメモリに記憶される。
【００３２】
なお、本実施の形態では、フォルマント中心周波数およびフォルマントレベルは、すべての楽音波形サンプルのうち、所定の複数個の楽音波形サンプル毎に抽出するようにしたが、これに限らず、すべての楽音波形サンプルに共通のものを１組だけ抽出するようにしてもよい。
【００３３】
図２に戻り、ステップＳ２で加工・編集された楽音波形データを音色データメモリに登録する（ステップＳ３）。
【００３４】
図５は、音色データメモリのメモリマップの一例を示す図であり、本実施の形態では、１音色分の音色データは、基本音色データと楽音波形データとによって構成されている。そして、音色データメモリには、複数の音色データが登録されるため、音色データメモリの先頭アドレスには、音色番号（Ｎｏ）（たとえば、ＧＭ（General MIDI）システムフォーマットでの音色番号）を、その音色番号に対応する基本音色データが記憶されている領域の先頭アドレスに変換する変換テーブルが記憶されている。
【００３５】
１音色分の基本音色データは、その音色名、当該基本音色データのデータ長、録音サンプリング周波数Ｗｆｓ、その楽音波形データの先頭サンプルのアドレスを示す波形スタートアドレス、その楽音波形データのうち、ループ読みされる領域の先頭サンプルのアドレスを示す波形ループスタートアドレス、ループ読みされる領域の末尾サンプルのアドレスを示す波形ループエンドアドレス、その楽音波形データの末尾サンプルのアドレスを示す波形エンドアドレス、その楽音波形データのエンベロープを決定するエンベロープデータおよびその他データによって構成されている。
【００３６】
なお、本実施の形態では、基本音色データと楽音波形データは、それぞれ別の領域に記憶するようにしているため、基本音色データ中に、当該音色の楽音波形データを記憶している領域がどこからどこまでであるかを示すデータ、すなわち波形スタートアドレスおよび波形エンドアドレスを含むようにしている。しかし、基本音色データに続いて、対応する楽音波形データを記憶するように構成した場合には、基本音色データ内のデータ長から、楽音波形データの先頭アドレスは算出できるので、波形スタートアドレスを記憶しないようにしてもよい。
【００３７】
一方、１音色分の楽音波形データは、前述のように、基となる楽音波形サンプルの時系列をダウンサンプリングしたものと、１フレーム分の楽音波形サンプルに対して高域分析して抽出した、４つのフォルマントの各フォルマント中心周波数およびフォルマントレベルからなるフォルマントデータとによって構成されている。なお、１フレームとは、たとえば１０ｍｓｅｃ間の楽音波形サンプルを言い、録音サンプリング周波数Ｗｆｓであれば、Ｗｆｓ×０．０１個分の楽音波形サンプルに相当する。
【００３８】
前記ステップＳ３の音色データ登録処理では、上記図５の音色データメモリのフォーマットに適合するように、前記ダウンサンプリングされた楽音波形サンプルおよび前記抽出されたフォルマントデータからなる楽音波形データを登録する。具体的には、この楽音波形データを楽音波形データ格納領域に記憶するとともに、その基本音色データを作成して基本音色データ格納領域に記憶し、さらに、その基本音色データの先頭アドレスを、音色番号→先頭アドレス変換テーブルの、対応する位置に記憶する。
【００３９】
なお、本音色データ登録処理では、パーソナルコンピュータ１００のメモリ（前記メモリ１０４と同じであってもよい（ただし、この場合には領域は異なる必要がある）し、異なる記憶媒体であってもよい）上に、図５の音色データメモリと同容量の領域を生成し、この領域上に、上記楽音波形データを登録するようにしている。
【００４０】
続くステップＳ４では、波形採取すべき、他の音色の楽音があるか否かを判別し、あるときには、ステップＳ１に戻って、波形採取処理から繰り返す一方、ないときには、ステップＳ５に進む。
【００４１】
ステップＳ５では、上記メモリの所定領域内に登録された音色データと同じメモリマップのものを、たとえばＲＯＭからなる音色データメモリ上に生成する、すなわち音色データメモリ化する。
【００４２】
以上のように構成された携帯電話機が実行する制御処理を、まずその概要を説明し、次に図６および図７を参照して詳細に説明する。
【００４３】
本実施の形態の携帯電話機は、音色データメモリの記憶容量を低減するために、サンプリング周波数より低い録音サンプリング周波数で楽音波形サンプルを採取して、音色データメモリに記憶するとともに、これによって失われる、録音サンプリング周波数の半分から録音サンプリング周波数までの高域のフォルマントデータ（フォルマント中心周波数およびフォルマントレベル）を抽出し、当該楽音波形サンプルに対応付けて、音色データメモリに記憶する。楽音を生成するときには、音色データメモリから目的の楽音波形サンプルを読み出すとともに、この楽音波形サンプルに対応するフォルマントデータも読み出し、このフォルマントデータに基づいて近似的に生成した、複数のフォルマントを備えたフォルマント音を、先に読み出した楽音波形サンプルに付加することで、楽音信号を生成するようにしている。これにより、楽音波形サンプルを採取したときに失われた、録音サンプリング周波数の半分以上の周波数成分、つまりフォルマントが近似的に回復するので、より自然な楽音を生成することができる。
【００４４】
ここで、フォルマントデータからフォルマント音を生成する方法は、本出願人が先に出願した、特開平２−２７１３９７号公報に記載の方法をそのまま用いている。したがって、その詳細な説明は、該公報に譲るが、本発明を実施するために必要な範囲内で、後述する。
【００４５】
次に、この制御処理を詳細に説明する。なお、本発明の特徴は、上述のように、専ら楽音信号生成処理、つまり前記波形テーブル音源６での楽音信号生成処理にあるため、以下、波形テーブル音源６でなされる制御処理（楽音信号生成処理）について説明する。
【００４６】
図６は、波形テーブル音源６でなされる楽音信号生成処理の手順を示すブロック図である。なお、波形テーブル音源６は、通常ＤＳＰ（digital signal processor）によって構成されるため、その制御処理の大半は、ソフトウェアによってなされている。もちろん、波形テーブル音源６をすべてハードウェアによって構成するようにしてもよい。
【００４７】
図６において、パラメータ生成部２１には、生成すべき楽音を示すＭＩＤＩデータが入力されるとともに、音色データメモリ２５の記憶内容を読み出すメモリ読み出し部２４が接続されている。
【００４８】
パラメータ生成部２１は、ＭＩＤＩデータが入力されると、まず、そのＭＩＤＩデータを解析して、音色番号やキーコード等の情報、すなわち楽音生成用パラメータを生成するのに必要な情報を抽出する。次に、パラメータ生成部２１は、メモリ読み出し部２４を介して、音色データメモリ２５の先頭アドレスにアクセスし、その位置に記憶されている、前記音色番号→先頭アドレス変換テーブルを用いて、上記抽出した音色番号に対応する基本音色データの先頭アドレスを検索し、その位置に記憶されている、１音色分の基本音色データを読み出し、この基本音色データに基づいて楽音生成用パラメータを生成する。
【００４９】
このとき生成されるパラメータは、具体的には、波形スタートアドレス、波形エンドアドレス、波形ループアドレス（波形ループスタートアドレスおよび波形ループエンドアドレス）およびエンベロープデータである。これらのパラメータすべては、当該基本音色データに含まれるデータそのものである。
【００５０】
また、パラメータ生成部２１は、抽出した音色番号およびキーコードと、録音サンプリング周波数と、サンプリング周波数とに基づいて、１楽音波形サンプルあたりのアドレスの進み量を表すパラメータであるＦナンバ（ＦＮ）と、オクターブを示すパラメータであるＯＣＴパラメータとを生成する。
【００５１】
さらに、パラメータ生成部２１は、フォルマントデータ（フォルマント中心周波数およびフォルマントレベル）を生成する。前述のように、フォルマントデータは、１フレーム毎に異なったものが記憶されているので、１フレーム分の楽音波形サンプルの読み出しが終了すると、パラメータ生成部２１は、メモリ読み出し部２４を介して、音色データメモリ２５から、次のフレームに対応するフォルマントデータを読み出して生成すればよい。
【００５２】
各種クロック生成部２２は、供給された基本クロックを、たとえば分周することにより、複数の周波数のクロックを生成して出力する。このクロックの主なものは、波形テーブル音源６本来のサンプリング周波数Ｓｆｓのクロックであり、このクロックによって、波形テーブル音源６の各ブロック２３〜２９の動作が進行する。
【００５３】
アドレス生成部２３には、パラメータ生成部２１によって生成された、Ｆナンバ、ＯＣＴパラメータ、波形スタートアドレス、波形エンドアドレスおよび波形ループアドレスが入力され、アドレス生成部２３は、これらのパラメータに基づいて、音色データメモリ２５から読み出すべき楽音波形サンプルのアドレスを生成する。
【００５４】
なお、本発明では、録音サンプリング周波数はサンプリング周波数より低いので、アドレス生成部２３によって生成されるアドレスは、通常、小数部を含む実数値となる。すなわち、アドレス生成部２３では、音色データメモリ２５に記憶された楽音波形サンプル間の楽音波形サンプルの位置を示すアドレスが生成される。したがって、この位置の楽音波形サンプルは、後述するように、音色データメモリ２５に記憶されている楽音波形サンプルから補間して求めるようにしている。
【００５５】
アドレス生成部２３によって生成された、実数値からなるアドレスのうち、整数部は、メモリ読み出し部２４に供給され、小数部は、補間部２６に供給される。
【００５６】
メモリ読み出し部２４は、供給されたアドレスの整数部に対応する楽音波形サンプルと、この楽音波形サンプルに隣接する、所定個の楽音波形サンプルとを音色データメモリ２５から読み出して、補間部２６に出力する。ここで、所定個は、補間部２６で採用されている補間方法に応じて決まり、その補間方法が、たとえば２点による直線補間であれば、所定個とは１個であり、この場合、メモリ読み出し部２４は、供給されたアドレスの整数部に対応する楽音波形サンプルと、その次のアドレスに位置する楽音波形サンプルとを読み出して、補間部２６に出力する。
【００５７】
補間部２６は、予め設定されている補間方法により、供給されたアドレスの小数部に基づいて、音色データメモリ２５から読み出された、複数の楽音波形サンプル間を補間し、目的の楽音波形サンプルを生成する。
【００５８】
補間部２６から出力された楽音波形サンプルは、加算部２７に供給される。加算部２７には、他に、ノイズフォルマント音発生部２８から出力されたノイズフォルマント音、より詳しくは、当該楽音波形サンプルに対応する、ノイズフォルマント音の一波形サンプルが供給され、加算部２７は、補間部２６からの楽音波形サンプルと、この楽音波形サンプルに対応する、ノイズフォルマント音の波形サンプルとを加算し、その加算結果をエンベロープ発生／付与部２９に出力する。
【００５９】
エンベロープ発生／付与部２９には、前記エンベロープデータが供給され、エンベロープ発生／付与部２９は、このエンベロープデータに基づいてエンベロープを発生させ、加算部２７からの楽音波形サンプルに付与した後、前記ＤＡＣに出力する。
【００６０】
図７は、上記ノイズフォルマント音発生部２８でなされるノイズフォルマント音発生処理の手順を示すブロック図であり、本ノイズフォルマント音発生処理は、前記公報に記載の方法を適用したものである。
【００６１】
図７において、ホワイトノイズ発生部３１から発生されたホワイトノイズは、ＬＰＦ（ロウパスフィルタ）３２を通過し、これにより、そのスペクトル特性が変更された後、乗算部３３を介して、正弦波発生部３４から発生されたフォルマント中心周波数Ｆｉ（ｉは、１〜４のいずれかの整数値）の正弦波と乗算される。
【００６２】
このようにして、乗算部３３からは、フォルマント中心周波数Ｆｉを中心としたフォルマント形状のノイズ音（ノイズフォルマント音）が出力され、乗算部３５に供給される。乗算部３５には、他に、パラメータ生成部２１が生成したフォルマントレベルＬｉが供給され、乗算部３５は、乗算部３３から出力された、所定レベル、たとえば単位レベルのノイズフォルマント音を、フォルマントレベルＬｉのノイズフォルマント音に変更する。
【００６３】
乗算部３５は、このフォルマントレベルＬｉのノイズフォルマント音をアキュムレータ３６に出力する。
【００６４】
ブロック３１〜３５は、時分割動作を行い、複数（本実施の形態では、４つ）のノイズフォルマント音を発生する。つまり、本実施の形態では、１フレーム分の楽音波形サンプルに対して４つのフォルマントを検出し、この４つのフォルマントからそれぞれフォルマント中心周波数およびフォルマントレベルを抽出するようにしたので、再現するフォルマントも４つにしている。
【００６５】
アキュムレータ３６は、時分割で順次供給されてくる、４つのノイズフォルマント音を累算して、所定のタイミングで加算部２７に出力する。なお、同一フレームの楽音波形サンプルに対しては、同一のノイズフォルマント音が、アキュムレータ３６から出力される。図８は、ノイズフォルマント音のレベルの遷移を示す図であり、アキュムレータ３６からは、同図（ａ）に示すように、１フレーム毎に異なった離散値が出力される。なお、（ａ）の出力に対して、たとえばフィルタリング処理を施すことで、同図（ｂ）に示すように、連続値が出力されるようにしてもよい。また、フォルマントデータの他のデータ、すなわち、フォルマント中心周波数に対しても、（ｂ）と同様の処理を施すようにしてもよい。
【００６６】
なお、本実施の形態では、フォルマントデータは、すべての楽音波形サンプルのうち、所定の複数個の楽音波形サンプル毎に抽出して記憶し、楽音を生成するときには、当該複数個の楽音波形サンプル毎にノイズフォルマント音を生成するようにしたが、これに限らず、フォルマントデータは、すべての楽音波形サンプルに共通のものを１組だけ抽出して記憶し、楽音を生成するときには、すべての楽音波形サンプルに共通のノイズフォルマント音を生成するようにしてもよい。この場合には、ノイズフォルマント音を付加すべき楽音のレベルを検出し、この検出レベルに応じて、ノイズフォルマント音に含まれるフォルマントの中心周波数やレベルを変更するようにした方が望ましい。また、検出する対象は、楽音のレベルに限らず、他の特性であってもよい。
【００６７】
一般的に、自然音を周波数軸上で見た場合、８ＫＨｚ付近よりも下の周波数のスペクトルは、その音そのものを認識する情報、つまり、ピアノやバイオリンの音、人声など、その音が何であるかを知るための情報を含んでいる。一方、およそ８ＫＨｚ以上から１６ＫＨｚ（可聴上限）のスペクトルは、その音の自然さを醸し出す情報を含むことが多い。仮に、この領域のスペクトルが完全に失われたとしても、その音が何であるかを認識できるものの、自然さは大いに欠けてしまう。本実施の形態のようにして、自然さを醸し出す情報、つまり８ＫＨｚから１６ＫＨｚまでの周波数スペクトルを近似的に回復するようにすれば、より自然な楽音を生成することができる。
【００６８】
また、本実施の形態では、本発明の楽音生成装置を携帯電話機、特にその音源装置に適用したが、これは、電子鍵盤楽器等に設けられる通常の音源装置では、サンプリング周波数より低い録音サンプリング周波数で楽音をサンプリングして、再現される楽音を劣化させてまで、音色データメモリ（波形メモリ）の記憶容量を低減する必要はないからである。したがって、逆に言えば、サンプリング周波数より低い録音サンプリング周波数で楽音をサンプリングする必要のある装置であれば、携帯電話機に限らない。
【００６９】
このように、本実施の形態では、サンプリング周波数より低い録音サンプリング周波数でサンプリングして、楽音波形データを生成したときに失われた周波数成分、すなわち録音サンプリング周波数の半分以上の周波数成分を、当該周波数域のフォルマントに応じたフォルマントデータであって、音色データメモリに記憶されたものに基づいて、前記失われたフォルマントに近似するフォルマントを備えたフォルマント音を合成し、前記生成された楽音波形データに加算することで、近似的に回復するようにしたので、より自然な楽音を生成することができる。
【００７０】
なお、上述した実施の形態の機能を実現するソフトウェアのプログラムコードを記録した記憶媒体を、システムまたは装置に供給し、そのシステムまたは装置のコンピュータ（またはＣＰＵやＭＰＵ）が記憶媒体に格納されたプログラムコードを読出し実行することによっても、本発明の目的が達成されることは言うまでもない。
【００７１】
この場合、記憶媒体から読出されたプログラムコード自体が本発明の新規な機能を実現することになり、そのプログラムコードを記憶した記憶媒体は本発明を構成することになる。
【００７２】
プログラムコードを供給するための記憶媒体としては、たとえば、フレキシブルディスク、ハードディスク、光ディスク、光磁気ディスク、ＣＤ−ＲＯＭ、ＣＤ−Ｒ、磁気テープ、不揮発性のメモリカード、ＲＯＭなどを用いることができる。
【００７３】
また、コンピュータが読出したプログラムコードを実行することにより、上述した実施の形態の機能が実現されるだけでなく、そのプログラムコードの指示に基づき、コンピュータ上で稼働しているＯＳなどが実際の処理の一部または全部を行い、その処理によって上述した実施の形態の機能が実現される場合も含まれることは言うまでもない。
【００７４】
【発明の効果】
以上説明したように、請求項１または３に記載の発明によれば、録音サンプリング周波数で、楽器の音をサンプリングして採取した楽音波形データを音色データメモリに記憶し、該音色データメモリから楽音波形データを読み出して楽音を生成するときに、前記録音サンプリング周波数でサンプリングしたことによって失われた周波数域のフォルマントに応じたフォルマントデータであって、音色データメモリに記憶されたものに基づいて、前記失われたフォルマントに近似するフォルマントを備えたフォルマント音が合成され、前記読み出された楽音に、前記合成されたフォルマント音が加算されるので、楽器の音を録音サンプリング周波数でサンプリングしたときに失われた周波数域のフォルマントが近似的に回復され、したがって、より自然な楽音を生成することができる。
【図面の簡単な説明】
【図１】本発明の一実施の形態に係る楽音生成装置を適用した携帯電話機の概略構成を示すブロック図である。
【図２】音色データメモリ作成処理の手順を示すフローチャートである。
【図３】図２の楽音波形サンプルの採取処理を説明するための図である。
【図４】基となる楽音波形サンプル、ダウンサンプリング後の楽音波形サンプルおよび検出後のフォルマントの各周波数スペクトルの様子を示す図である。
【図５】音色データメモリのメモリマップの一例を示す図である。
【図６】図１の波形テーブル音源でなされる楽音信号生成処理の手順を示すブロック図である。
【図７】図６のノイズフォルマント音発生部でなされるノイズフォルマント音発生処理の手順を示すブロック図である。
【図８】ノイズフォルマント音のレベルの遷移を示す図である。
【符号の説明】
６波形テーブル音源（楽音波形サンプル生成手段）、２５音色データメモリ、２８ノイズフォルマント音発生部（合成手段）、３１加算部（加算手段）、１０４メモリ（第１のメモリ、第２のメモリ）

[0001]
[Technical field to which the invention belongs]
The present invention relates to a musical sound generating apparatus and a musical sound generating method for reading musical sound waveform data sampled and stored in a waveform memory and generating a musical sound based on the musical sound waveform data.
[0002]
[Prior art]
A so-called waveform table (wavetable) sound source, which samples the sound of a musical instrument, stores it in a waveform memory as musical sound waveform data, and reads out that data to generate a musical sound, is conventionally known.
[0003]
[Problems to be solved by the invention]
In the conventional waveform table sound source described above, in order to reduce the storage capacity of the waveform memory, musical sounds are sometimes sampled at a frequency lower than the original sampling frequency of the sound source system, specifically the frequency at which the musical sound waveform data stored in the waveform memory is read out to generate a musical sound. Hereinafter, the original sampling frequency of the sound source system is referred to as the "sampling frequency", and the sampling frequency used when a musical sound is sampled to generate musical sound waveform data is referred to as the "recording sampling frequency".
[0004]
Since the recording sampling frequency corresponds to the number of musical sound waveform samples (the individual waveform samples constituting the musical sound waveform data) per unit time (1 second), the lower the recording sampling frequency, the smaller each piece of musical sound waveform data becomes, and as a result the storage capacity required of the waveform memory is reduced.
[0005]
However, according to the sampling theorem, the recording sampling frequency determines the upper limit frequency of the reproducible musical sound (= half the recording sampling frequency); for example, when the recording sampling frequency is 16 kHz, only musical sound components up to 8 kHz can be reproduced. In other words, if the recording sampling frequency is kept low, only musical sounds that have lost their higher-order harmonic components can be reproduced, so the quality of the reproduced sound deteriorates.
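The trade-off described in paragraphs [0004] and [0005] can be illustrated with a little arithmetic. The following is a minimal sketch; the function names and the assumption of 16-bit (2-byte) mono samples are illustrative and do not come from the specification.

```python
def nyquist_limit_hz(recording_rate_hz: float) -> float:
    """Upper limit of the reproducible frequency: half the recording sampling frequency."""
    return recording_rate_hz / 2.0


def waveform_bytes(recording_rate_hz: int, seconds: float, bytes_per_sample: int = 2) -> int:
    """Storage for a mono waveform; 2 bytes per sample is an assumption."""
    return int(recording_rate_hz * seconds) * bytes_per_sample


if __name__ == "__main__":
    for rate in (32000, 16000):
        print(rate, nyquist_limit_hz(rate), waveform_bytes(rate, 1.0))
```

Halving the recording sampling frequency halves the storage per second of sound, and it also halves the highest frequency that can be reproduced.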
[0006]
The present invention has been made with this point in mind. Its object is to provide a musical sound generating apparatus and a musical sound generating method that can generate a more natural musical sound by restoring the lost harmonic components, even when the musical sound waveform data is generated with a recording sampling frequency lower than the sampling frequency.
[0007]
[Means for Solving the Problems]
In order to achieve the above object, the musical sound generating apparatus according to claim 2 stores, in a timbre data memory, musical sound waveform data obtained by sampling the sound of a musical instrument at a recording sampling frequency, and reads the musical sound waveform data from the timbre data memory to generate a musical sound, the apparatus comprising: first storage means for sampling the sound of the musical instrument at a frequency higher than the recording sampling frequency and storing the result in a first memory as first musical sound waveform data; second storage means for storing, in a second memory as second musical sound waveform data, data obtained by down-sampling the first musical sound waveform data stored in the first memory to the recording sampling frequency; formant data generating means for generating, from the first musical sound waveform data, formant data based on the frequency spectrum of the frequency range that is lost by down-sampling the first musical sound waveform data to the recording sampling frequency; third storage means for storing the second musical sound waveform data and the formant data generated by the formant data generating means in the timbre data memory in association with each other; musical sound waveform sample generating means for generating musical sound waveform samples from the second musical sound waveform data stored in the timbre data memory; synthesizing means for synthesizing a formant sound based on the formant data stored in the timbre data memory in association with the second musical sound waveform data; and adding means for adding the synthesized formant sound to the generated musical sound waveform samples and outputting the result.
[0008]
Preferably, the formant data is the center frequency and the level of the formant, and the synthesizing means comprises: white noise generating means for generating white noise; changing means for changing the spectral characteristics of the white noise generated by the white noise generating means; sine wave generating means for generating a sine wave of the center frequency; multiplying means for multiplying the output of the changing means by the sine wave generated by the sine wave generating means; and adjusting means for adjusting the level of the output of the multiplying means so that it becomes the stated level.
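The chain of means in paragraph [0008] (white noise, spectral shaping, multiplication by a sine wave at the center frequency, level adjustment) can be sketched as follows. This is only an illustration under stated assumptions: the one-pole low-pass filter, its coefficient `lpf_coeff`, the peak normalization used as the "level adjustment", and all function names are hypothetical choices, not details taken from the specification.

```python
import math
import random
from typing import List, Sequence, Tuple


def noise_formant(center_hz: float, level: float, sample_rate_hz: float,
                  num_samples: int, lpf_coeff: float = 0.05, seed: int = 0) -> List[float]:
    """One noise formant: low-passed white noise multiplied by a sine carrier."""
    rng = random.Random(seed)
    state = 0.0
    shaped = []
    for n in range(num_samples):
        white = rng.uniform(-1.0, 1.0)          # white noise source
        state += lpf_coeff * (white - state)    # one-pole low-pass (assumed shaping filter)
        carrier = math.sin(2.0 * math.pi * center_hz * n / sample_rate_hz)
        shaped.append(state * carrier)          # shifts the noise band to center_hz
    peak = max((abs(x) for x in shaped), default=0.0) or 1.0
    return [level * x / peak for x in shaped]   # assumed level adjustment: scale peak to level


def formant_sound(formants: Sequence[Tuple[float, float]],
                  sample_rate_hz: float, num_samples: int) -> List[float]:
    """Accumulate several noise formants given as (center_hz, level) pairs."""
    total = [0.0] * num_samples
    for i, (center_hz, level) in enumerate(formants):
        part = noise_formant(center_hz, level, sample_rate_hz, num_samples, seed=i)
        total = [a + b for a, b in zip(total, part)]
    return total
```

Multiplying band-limited noise by a sine wave moves the noise band so that it is centered on the sine frequency, which is why a low-pass filter is enough to give the formant its shape.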
[0009]
In order to achieve the above object, the musical sound generating method according to claim 1 stores, in a timbre data memory, musical sound waveform data obtained by sampling the sound of a musical instrument at a recording sampling frequency, and reads the musical sound waveform data from the timbre data memory to generate a musical sound, the method comprising: sampling the sound of the musical instrument at a frequency higher than the recording sampling frequency and storing the result in a first memory as first musical sound waveform data; storing, in a second memory as second musical sound waveform data, data obtained by down-sampling the first musical sound waveform data stored in the first memory to the recording sampling frequency; generating, from the first musical sound waveform data, formant data based on the frequency spectrum of the frequency range that is lost by down-sampling the first musical sound waveform data to the recording sampling frequency; storing the second musical sound waveform data and the generated formant data in the timbre data memory in association with each other; and, when generating a musical sound, generating musical sound waveform samples from the second musical sound waveform data stored in the timbre data memory, synthesizing a formant sound based on the formant data stored in the timbre data memory in association with the second musical sound waveform data, and adding the synthesized formant sound to the generated musical sound waveform samples and outputting the result.
[0010]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0011]
FIG. 1 is a block diagram showing a schematic configuration of a mobile phone to which a musical sound generating device according to an embodiment of the present invention is applied.
[0012]
As shown in the figure, the control unit 1 comprises a CPU that controls the entire mobile phone, a ROM that stores the control program executed by the CPU, various table data, and the like, and a RAM that temporarily stores performance data for playing ringtone melodies and the like (for example, MIDI (Musical Instrument Digital Interface) data), various input information, calculation results, and the like.
[0013]
Connected to the control unit 1 are: an operation input unit 2 having a numeric keypad and operators for inputting various information; a display unit 3 having, for example, a color liquid crystal display (LCD) and light emitting diodes (LEDs); an audio CODEC 5 that converts an analog audio signal into a digital audio signal and then compresses it and, conversely, expands a compressed digital audio signal and then converts it into an analog audio signal; a communication unit 4 that modulates the output signal from the audio CODEC 5 and transmits it to a relay station (not shown) via an antenna 7, and that receives and demodulates a signal transmitted from the relay station via the antenna 7 and outputs it to the audio CODEC 5; and a waveform table sound source 6 that includes a timbre data memory (waveform memory), described later, reads the target musical sound waveform data from the timbre data memory, applies various kinds of processing to generate a digital musical sound signal, and converts it into an analog musical sound signal with a DAC (Digital-to-Analog Converter, not shown) for output.
[0014]
Connected to the audio CODEC 5 are a speaker 8 for converting an analog audio signal output from the audio CODEC 5 into sound and a microphone 9 for converting audio into an analog audio signal.
[0015]
The waveform table sound source 6 is connected to a speaker 10 for converting an analog musical sound signal output from the waveform table sound source 6 into sound.
[0016]
The feature of the present invention lies in the waveform table sound source 6, and in particular in its musical sound signal generation processing. Specifically, the timbre data memory stores musical sound waveform samples collected at a recording sampling frequency lower than the sampling frequency, so simply reading out these samples as they are cannot reproduce frequency components at or above half the recording sampling frequency; that is, only a musical sound that has lost its higher-order harmonic components can be reproduced. To deal with this problem, a plurality of formants (four in this embodiment) appearing in the original sound in the high range at or above half the recording sampling frequency are detected, and the formant center frequency and formant level of each formant are extracted and stored in the timbre data memory in association with the musical sound waveform samples. Then, when the target musical sound waveform samples are read out to generate a musical sound, the formant center frequencies and formant levels associated with those samples are also read out, and a formant sound having a plurality of formants with those center frequencies and levels is generated approximately and added to the target musical sound waveform samples.
[0017]
Thus, as a precondition for the musical sound signal generation processing that characterizes the present invention, the plurality of formants must be detected, the formant center frequency and formant level of each formant must be extracted, the musical sound must be sampled at a recording sampling frequency lower than the sampling frequency to generate musical sound waveform samples, and musical sound waveform data consisting of these samples and the corresponding formant center frequencies and formant levels must be registered in the timbre data memory. The same original sound must be sampled both for the processing that detects the formants and for the processing that generates the musical sound waveform samples. However, the two use different sampling frequencies: the former must sample at twice the sampling frequency of the latter. For this reason, in this embodiment, the base musical sound is sampled only once, at the former sampling frequency (twice the recording sampling frequency), and the latter processing uses a down-sampled version of it.
[0018]
Hereinafter, processing from sampling to registration in the timbre data memory, that is, timbre data memory creation processing will be described.
[0019]
FIG. 2 is a flowchart showing the procedure of the timbre data memory creation process. This timbre data memory creation process is executed, for example, on a personal computer.
[0020]
In the figure, first, a musical sound waveform sample is collected, that is, the original sound is sampled (step S1).
[0021]
FIG. 3 is a diagram for explaining the processing for collecting musical sound waveform samples in step S1, and shows a method for collecting musical sound waveform samples of a cymbal sound with the personal computer 100.
[0022]
As shown in the figure, first, a cymbal sound is generated by hitting a cymbal with a stick, and this cymbal sound is converted into an analog musical sound signal by a microphone 101.
[0023]
Next, an LPF (low-pass filter) 102, which passes only frequency components at or below the recording sampling frequency Wfs, cuts the frequency components at or above Wfs contained in this analog musical sound signal. Components at or above Wfs are cut because, as described above, sound at frequencies at or above half (= Wfs) of the sampling frequency used to generate the base musical sound waveform samples (= 2Wfs, twice the recording sampling frequency) cannot be reproduced, so those components are unnecessary.
[0024]
Next, the output signal from the LPF 102 is converted into a digital musical sound signal by the A/D converter 103, whose sampling frequency is 2Wfs, and this digital musical sound signal is stored in a memory (for example, RAM) 104.
[0025]
When the above processing is repeated for a predetermined time, for example 1 second, 2Wfs musical sound waveform samples are stored in the memory 104. The time series obtained by down-sampling this time series of musical sound waveform samples, that is, by discarding every other sample, constitutes the musical sound waveform samples to be stored in the timbre data memory.
[0026]
Since musical sound waveform samples of a plurality of timbres are registered in the timbre data memory, the base musical sound waveform samples for instrument sounds other than the cymbal may be collected by the same method as described above.
[0027]
Returning to FIG. 2, processing and editing are applied to the musical sound waveform samples collected in step S1 (step S2). Examples of the processing and editing include deletion of zero-level (or low-level) waveform samples, amplitude adjustment of each musical sound waveform sample, the down-sampling described above, and high-frequency analysis of the base musical sound waveform samples.
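Two of the editing operations named in step S2, deleting low-level samples and adjusting amplitude, might look like the sketch below. The specification does not say which low-level samples are deleted or how the amplitude is adjusted; trimming only the leading and trailing samples, the threshold value, and peak normalization are assumptions made for illustration.

```python
from typing import List, Sequence


def trim_low_level(samples: Sequence[float], threshold: float = 1e-3) -> List[float]:
    """Drop leading and trailing samples whose magnitude is at or below threshold."""
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return list(samples[start:end])


def normalize(samples: Sequence[float], target_peak: float = 1.0) -> List[float]:
    """Scale the samples so that the largest magnitude equals target_peak."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty input is returned unchanged
    return [x * target_peak / peak for x in samples]
```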
[0028]
FIG. 4 is a diagram showing the frequency spectra of the base musical sound waveform samples, the down-sampled musical sound waveform samples, and the detected formants. (a) shows an example of the frequency spectrum of the base musical sound waveform samples, (b) shows the frequency spectrum after the samples of (a) are down-sampled by 1/2, and (c) shows the frequency spectrum of the four formants detected from the samples of (a).
[0029]
As described above, the base musical sound waveform samples are generated by sampling the original sound at twice the recording sampling frequency Wfs, so their frequency spectrum extends, at most, up to the recording sampling frequency Wfs, as shown in (a).
[0030]
When the time series of the base musical sound waveform samples is down-sampled by first cutting frequency components at or above Wfs/2 with a down-sampling filter and then discarding every other sample, the result corresponds exactly to sampling the original sound at the recording sampling frequency Wfs. Therefore, as shown in (b), the frequency spectrum of the down-sampled musical sound waveform samples is the spectrum of (a) with the components from Wfs/2 to Wfs removed. The down-sampled musical sound waveform samples are registered in the timbre data memory, as described later.
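The filter-then-thin procedure of paragraph [0030] can be sketched as below. The three-tap kernel (0.25, 0.5, 0.25) is a deliberately crude stand-in chosen for brevity: it nulls the component at the original Nyquist frequency but only attenuates the rest of the upper band, so a practical down-sampling filter would be much longer. The kernel, the edge handling, and the function name are assumptions, not details from the specification.

```python
from typing import List, Sequence


def downsample_by_two(samples: Sequence[float]) -> List[float]:
    """Smooth with a short FIR low-pass, then keep every second sample."""
    if not samples:
        return []
    kernel = (0.25, 0.5, 0.25)  # crude low-pass, illustrative only
    # Repeat the edge samples so the filter has neighbours at both ends.
    padded = [samples[0]] + list(samples) + [samples[-1]]
    filtered = [
        kernel[0] * padded[i] + kernel[1] * padded[i + 1] + kernel[2] * padded[i + 2]
        for i in range(len(samples))
    ]
    return filtered[::2]  # thinning: discard every other sample
```

A constant input passes through unchanged, while an input alternating between +1 and -1 (a component at the original Nyquist frequency) is cancelled everywhere except at the padded edge.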
[0031]
When an FFT (Fast Fourier Transform) is applied to a part (one frame in this embodiment) of the musical sound waveform samples selected from the time series of the base samples, a frequency spectrum similar to that of (a) is obtained (similar rather than identical, because the spectrum of (a) is obtained from the entire time series of the base musical sound waveform samples, as described above). Four formants are detected by analyzing the frequency range from Wfs/2 to Wfs of this spectrum; (c) shows the frequency spectrum of the detected formants. The center frequency and level are then extracted from each of the four detected formants. Specifically, in (c), the formant center frequencies F1 to F4 and the formant levels L1 to L4 are extracted. The extracted formant center frequencies and formant levels are stored in the timbre data memory in association with the one frame of musical sound waveform samples.
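The high-frequency analysis of paragraph [0031] can be approximated by picking the strongest spectral peaks between Wfs/2 and Wfs. The specification does not state how formants are detected, so the local-maximum peak picking below is an assumed stand-in, as are the function names; a plain DFT is used in place of an FFT only to keep the sketch free of dependencies.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def magnitude_spectrum(frame: Sequence[float]) -> List[float]:
    """Magnitudes of DFT bins 0..N/2, normalized by the frame length."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))) / n
        for k in range(n // 2 + 1)
    ]


def detect_formants(frame: Sequence[float], base_rate_hz: float, wfs_hz: float,
                    count: int = 4) -> List[Tuple[float, float]]:
    """Return up to `count` (center_hz, level) pairs found between Wfs/2 and Wfs."""
    mags = magnitude_spectrum(frame)
    bin_hz = base_rate_hz / len(frame)
    lo = max(1, math.ceil((wfs_hz / 2.0) / bin_hz))
    hi = min(len(mags) - 1, math.floor(wfs_hz / bin_hz))
    peaks = []
    for k in range(lo, hi):  # local maxima only (assumed detection rule)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]:
            peaks.append((mags[k], k * bin_hz))
    peaks.sort(reverse=True)                        # strongest first
    strongest = sorted(peaks[:count], key=lambda p: p[1])
    return [(freq, level) for level, freq in strongest]
```

With a base sampling rate of 2Wfs, the band from Wfs/2 to Wfs occupies the upper half of the computed bins, which is the range lost when the samples are later down-sampled to Wfs.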
[0032]
In this embodiment, the formant center frequencies and formant levels are extracted for each predetermined group of musical sound waveform samples among all the samples; however, the invention is not limited to this, and only one set common to all the musical sound waveform samples may be extracted.
[0033]
Returning to FIG. 2, the musical sound waveform data processed and edited in step S2 is registered in the timbre data memory (step S3).
[0034]
FIG. 5 is a diagram showing an example of a memory map of the timbre data memory. In the present embodiment, the timbre data for one timbre is composed of basic timbre data and musical tone waveform data. Since a plurality of timbre data are registered in the timbre data memory, a conversion table is stored at the head address of the timbre data memory; this table converts a timbre number (No) (for example, a timbre number in the GM (General MIDI) system format) into the start address of the area in which the basic timbre data corresponding to that timbre number is stored.
[0035]
The basic timbre data for one timbre consists of: the timbre name; the data length of the basic timbre data; the recording sampling frequency Wfs; the waveform start address, indicating the address of the first sample of the musical tone waveform data; the waveform loop start address, indicating the address of the first sample of the region of the musical tone waveform data that is read in a loop; the waveform loop end address, indicating the address of the last sample of that region; the waveform end address, indicating the address of the last sample of the musical tone waveform data; envelope data that determines the envelope of the musical tone waveform data; and other data.
[0036]
In the present embodiment, the basic timbre data and the musical tone waveform data are stored in different areas, so the basic timbre data includes data indicating where the musical tone waveform data of the timbre begins and ends, namely the waveform start address and the waveform end address. However, if the musical tone waveform data is stored immediately after the corresponding basic timbre data, the start address of the musical tone waveform data can be calculated from the data length in the basic timbre data, so the waveform start address need not be stored.
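The memory map of FIG. 5 can be pictured with a small data-structure sketch. The class and field names below are hypothetical, and Python dictionaries stand in for the address-based layout of the actual memory; only the set of fields and the timbre number → start address conversion table follow the description.

```python
from dataclasses import dataclass, field

@dataclass
class BasicTimbreData:
    """One basic timbre data record (field names are illustrative)."""
    name: str
    data_length: int          # length of this basic timbre data record
    recording_rate_hz: int    # recording sampling frequency Wfs
    waveform_start: int       # address of the first waveform sample
    loop_start: int           # first sample of the looped region
    loop_end: int             # last sample of the looped region
    waveform_end: int         # address of the last waveform sample
    envelope: dict = field(default_factory=dict)

class TimbreDataMemory:
    """Conversion table at the head, followed by the basic timbre data records."""

    def __init__(self):
        self.table = {}    # timbre number -> start address of the basic timbre data
        self.records = {}  # start address -> BasicTimbreData

    def register(self, timbre_no, address, record):
        self.records[address] = record
        self.table[timbre_no] = address

    def lookup(self, timbre_no):
        """Resolve a timbre number through the conversion table."""
        return self.records[self.table[timbre_no]]
```

A caller would register one record per timbre and later resolve a timbre number taken from MIDI data with `lookup`, mirroring the two-step access (table first, record second) described for the parameter generation unit.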
[0037]
The musical tone waveform data for one timbre, on the other hand, consists of the musical sound waveform samples obtained by down-sampling the time series of the original musical sound waveform samples as described above, and formant data, consisting of the formant center frequency and formant level of each formant, extracted by the high-band analysis described above for each frame of musical sound waveform samples. One frame is, for example, 10 msec of musical sound waveform samples; with a recording sampling frequency of Wfs, this corresponds to Wfs × 0.01 musical sound waveform samples.
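The frame arithmetic above (Wfs × 0.01 samples per 10 msec frame) is simple enough to state in code. The helper names are assumptions, and the value of 16 kHz used in the note below is only an example.

```python
def samples_per_frame(wfs_hz, frame_ms=10):
    """Number of stored waveform samples covered by one set of formant data."""
    return wfs_hz * frame_ms // 1000

def frame_index(sample_address, wfs_hz, frame_ms=10):
    """Index of the frame (and hence of the formant data set) an address falls in."""
    return int(sample_address) // samples_per_frame(wfs_hz, frame_ms)
```

For example, with Wfs = 16 kHz one frame holds 160 samples, so a read address of 159.9 still belongs to frame 0 and an address of 160 belongs to frame 1.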
[0038]
In the timbre data registration process of step S3, the musical tone waveform data, comprising the down-sampled musical sound waveform samples and the extracted formant data, is registered so as to conform to the format of the timbre data memory of FIG. 5. Specifically, the musical tone waveform data is stored in the musical tone waveform data storage area, the basic timbre data is created and stored in the basic timbre data storage area, and the head address of the basic timbre data is stored at the corresponding position in the timbre number → start address conversion table.
[0039]
In the actual timbre data registration process, an area having the same capacity as the timbre data memory of FIG. 5 is allocated in the memory of the personal computer 100 (this may be the memory 104, in which case a different area must be used, or a different storage medium), and the musical tone waveform data is registered in this area.
[0040]
In the subsequent step S4, it is determined whether or not there is a tone of another tone color to be sampled. If there is, the process returns to step S1 and repeats from the waveform sampling process. If not, the process proceeds to step S5.
[0041]
In step S5, the same memory map as that of the timbre data registered in the predetermined area of the memory is written to a timbre data memory comprising, for example, a ROM; that is, the registered data is turned into the timbre data memory.
[0042]
An outline of the control process executed by the mobile phone configured as described above is given first, and the process is then described in detail with reference to the drawings.
[0043]
To reduce the storage capacity of the timbre data memory, the mobile phone of the present embodiment collects musical sound waveform samples at a recording sampling frequency lower than the sampling frequency and stores them in the timbre data memory. It also extracts formant data (formant center frequency and formant level) for the high band thereby lost, from half the recording sampling frequency up to the recording sampling frequency, and stores it in the timbre data memory in association with the musical sound waveform samples. When generating a musical sound, the target musical sound waveform sample is read from the timbre data memory together with the corresponding formant data, and a formant sound having a plurality of formants, generated approximately from the formant data, is added to the read musical sound waveform sample to generate the musical sound signal. As a result, the frequency components lost when the musical sound waveform samples were sampled, that is, the formants above half the recording sampling frequency, are approximately recovered, so that a more natural musical sound can be generated.
[0044]
Here, the method described in Japanese Patent Application Laid-Open No. Hei 2-271597, filed earlier by the present applicant, is used as it is to generate the formant sound from the formant data. The publication should therefore be consulted for a detailed description; the method is described later only to the extent necessary for carrying out the present invention.
[0045]
Next, this control process will be described in detail. Since the feature of the present invention lies exclusively in the musical sound signal generation process performed in the waveform table sound source 6, as described above, only that control process (the musical sound signal generation process) is described below.
[0046]
FIG. 6 is a block diagram showing a procedure of a musical sound signal generation process performed by the waveform table sound source 6. Since the waveform table sound source 6 is usually configured by a DSP (digital signal processor), most of the control processing is performed by software. Of course, all of the waveform table sound source 6 may be configured by hardware.
[0047]
In FIG. 6, MIDI data indicating a tone to be generated is input to the parameter generation unit 21, and a memory reading unit 24 that reads the stored contents of the timbre data memory 25 is connected to the parameter generation unit 21.
[0048]
When MIDI data is input, the parameter generation unit 21 first analyzes the MIDI data and extracts information such as the timbre number and the key code, that is, the information needed to generate the musical sound generation parameters. Next, the parameter generation unit 21 accesses the head address of the timbre data memory 25 via the memory reading unit 24, uses the timbre number → start address conversion table stored there to find the head address of the basic timbre data corresponding to the extracted timbre number, reads the basic timbre data for one timbre stored at that address, and generates the musical sound generation parameters based on it.
[0049]
The parameters generated at this time are specifically a waveform start address, a waveform end address, a waveform loop address (waveform loop start address and waveform loop end address), and envelope data. All of these parameters are data itself included in the basic tone color data.
[0050]
The parameter generation unit 21 also generates, based on the extracted timbre number and key code, the recording sampling frequency, and the sampling frequency, an F number (FN), which is a parameter representing the amount by which the address advances per musical sound waveform sample, and an OCT parameter, which is a parameter indicating the octave.
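The specification does not give the formula for the F number or the OCT parameter. Purely as an assumption for illustration, the sketch below takes the address advance per output sample to be (Wfs / Sfs) scaled by the equal-tempered pitch ratio between the requested key and a hypothetical `root_key` at which the waveform was recorded, split into a whole-octave part (OCT) and a within-octave part (FN).

```python
def pitch_parameters(key, root_key, wfs_hz, sfs_hz):
    """Return (fn, octave, step), where step = fn * 2**octave is the
    address advance per output sample (assumed relationship)."""
    semitones = key - root_key
    octave = semitones // 12  # OCT: whole octaves above or below the recorded pitch
    fn = (wfs_hz / sfs_hz) * 2.0 ** ((semitones % 12) / 12.0)  # FN: within one octave
    return fn, octave, fn * 2.0 ** octave
```

Under these assumptions, playing the recorded key with Wfs at half of Sfs gives a step of 0.5, which is why the generated address is normally a real value with a decimal part.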
[0051]
Further, the parameter generation unit 21 generates formant data (formant center frequency and formant level). As described above, different formant data is stored for each frame, so each time the reading of one frame of musical sound waveform samples is completed, the parameter generation unit 21 reads the formant data corresponding to the next frame from the timbre data memory 25 via the memory reading unit 24 and generates it.
[0052]
The various clock generators 22 generate and output clocks having a plurality of frequencies by, for example, dividing the supplied basic clock. The main clock is a clock of the sampling frequency Sfs inherent to the waveform table sound source 6, and the operation of each block 23 to 29 of the waveform table sound source 6 proceeds by this clock.
[0053]
The address generation unit 23 receives the F number, OCT parameter, waveform start address, waveform end address, and waveform loop address generated by the parameter generation unit 21 and, based on these parameters, generates the address of the musical sound waveform sample to be read from the timbre data memory 25.
[0054]
In the present invention, since the recording sampling frequency is lower than the sampling frequency, the address generated by the address generator 23 is usually a real value including a decimal part. That is, the address generator 23 generates an address indicating the position of the musical sound waveform sample between the musical sound waveform samples stored in the timbre data memory 25. Therefore, the musical sound waveform sample at this position is obtained by interpolation from the musical sound waveform samples stored in the timbre data memory 25, as will be described later.
[0055]
Of the addresses composed of real values generated by the address generation unit 23, the integer part is supplied to the memory reading unit 24 and the decimal part is supplied to the interpolation unit 26.
[0056]
The memory reading unit 24 reads the musical sound waveform sample corresponding to the integer part of the supplied address, together with a predetermined number of musical sound waveform samples adjacent to it, from the timbre data memory 25 and outputs them to the interpolation unit 26. The predetermined number is determined by the interpolation method employed in the interpolation unit 26. If the interpolation method is, for example, two-point linear interpolation, the predetermined number is one, and the memory reading unit 24 reads the musical sound waveform sample corresponding to the integer part of the supplied address and the musical sound waveform sample at the next address, and outputs them to the interpolation unit 26.
[0057]
The interpolation unit 26 interpolates between the plurality of musical sound waveform samples read from the timbre data memory 25, based on the decimal part of the supplied address and using a preset interpolation method, to generate the target musical sound waveform sample.
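The cooperation of the address generation unit, memory reading unit, and interpolation unit can be sketched as below for the two-point linear interpolation case. The function name, the use of a Python list as the waveform memory, and the loop-wrapping rule (the sample after the loop end is taken to be the loop start) are assumptions of this sketch.

```python
def generate(wave, step, count, loop_start, loop_end):
    """Advance a real-valued address by `step` per output sample and
    linearly interpolate between the two neighbouring stored samples."""
    loop_len = loop_end - loop_start + 1
    out, address = [], 0.0
    for _ in range(count):
        index = int(address)      # integer part -> memory reading unit
        frac = address - index    # decimal part -> interpolation unit
        nxt = loop_start if index == loop_end else index + 1
        out.append(wave[index] + (wave[nxt] - wave[index]) * frac)
        address += step
        while address >= loop_end + 1:   # wrap back into the looped region
            address -= loop_len
    return out
```

For instance, `generate([0.0, 10.0, 20.0, 30.0], 0.5, 4, 2, 3)` reads addresses 0, 0.5, 1 and 1.5, so every second output lies halfway between two stored samples.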
[0058]
The musical sound waveform sample output from the interpolation unit 26 is supplied to the adding unit 27. The adding unit 27 is also supplied with the noise formant sound output from the noise formant sound generation unit 28, more specifically with one waveform sample of the noise formant sound corresponding to the musical sound waveform sample. The adding unit 27 adds the musical sound waveform sample from the interpolation unit 26 and the corresponding noise formant sound waveform sample, and outputs the sum to the envelope generation / giving unit 29.
[0059]
The envelope generation / giving unit 29 is supplied with the envelope data. It generates an envelope based on the envelope data, applies the envelope to the musical sound waveform sample from the adding unit 27, and outputs the result to the DAC.
[0060]
FIG. 7 is a block diagram showing the procedure of the noise formant sound generation process performed by the noise formant sound generation unit 28. The noise formant sound generation process applies the method described in the above publication.
[0061]
In FIG. 7, the white noise generated by the white noise generation unit 31 passes through an LPF (low-pass filter) 32, which changes its spectral characteristics, and is then multiplied in the multiplier 33 by the sine wave of formant center frequency Fi (i is an integer from 1 to 4) generated by the sine wave generation unit 34.
[0062]
In this way, a formant-shaped noise sound (noise formant sound) centered on the formant center frequency Fi is output from the multiplier 33 and supplied to the multiplier 35. The multiplier 35 is also supplied with the formant level Li generated by the parameter generation unit 21, and changes the noise formant sound of a predetermined level (for example, unit level) output from the multiplier 33 into a noise formant sound of formant level Li.
[0063]
The multiplier 35 outputs the noise formant sound having the formant level Li to the accumulator 36.
[0064]
Blocks 31 to 35 operate in a time-division manner to generate a plurality (four in this embodiment) of noise formant sounds. That is, in the present embodiment, four formants are detected for each frame of musical sound waveform samples and a formant center frequency and formant level are extracted from each, so four noise formant sounds are generated accordingly.
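A rough sketch of the signal path of FIG. 7 follows: white noise is low-pass filtered, multiplied by a sine wave at each formant center frequency Fi, scaled by the formant level Li, and summed per sample. The one-pole low-pass filter, its `bandwidth_hz` parameter, the uniform noise source, and the function name are assumptions; normalisation of the filtered noise to unit level is omitted, and the per-sample sum merely stands in for the accumulator 36.

```python
import math
import random

def noise_formant_frame(formants, sample_rate, num_samples, bandwidth_hz=500.0, seed=0):
    """Sum of noise bands centred on each (center_hz, level) pair."""
    rng = random.Random(seed)
    alpha = 1.0 - math.exp(-2.0 * math.pi * bandwidth_hz / sample_rate)  # one-pole LPF coefficient
    state = [0.0] * len(formants)
    out = []
    for n in range(num_samples):
        total = 0.0                                         # stands in for accumulator 36
        for i, (center_hz, level) in enumerate(formants):   # time-division over the formants
            white = rng.uniform(-1.0, 1.0)                  # white noise generation unit 31
            state[i] += alpha * (white - state[i])          # LPF 32
            carrier = math.sin(2.0 * math.pi * center_hz * n / sample_rate)  # sine wave generation unit 34
            total += state[i] * carrier * level             # multipliers 33 and 35
        out.append(total)
    return out
```

Multiplying the low-passed noise by the sine wave shifts the noise spectrum so that it is centred on Fi, which is what gives each component its formant-like shape; the output is meant to be added sample by sample to the interpolated waveform samples.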
[0065]
The accumulator 36 accumulates the four noise formant sounds supplied sequentially in a time-division manner and outputs the accumulated noise formant sound to the adding unit 27 at a predetermined timing. Note that, for musical sound waveform samples of the same frame, the accumulator 36 outputs a noise formant sound of the same level. FIG. 8 is a diagram showing the transition of the level of the noise formant sound; as shown in FIG. 8(a), the accumulator 36 outputs a different discrete value for each frame. Alternatively, a continuous value may be output as shown in FIG. 8(b), for example by filtering the output of (a). The same processing as in (b) may also be applied to the other item of formant data, that is, the formant center frequency.
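The difference between a level that steps once per frame and a level that varies continuously can be illustrated with the sketch below. The filtering process is not specified in the description, so the one-pole low-pass filter and its coefficient are assumptions, as are the function names.

```python
def hold_levels(frame_levels, samples_per_frame):
    """Stepped output: each frame's level is held constant for the whole frame."""
    return [level for level in frame_levels for _ in range(samples_per_frame)]

def smooth_levels(stepped, coeff=0.05):
    """Continuous output: a one-pole low-pass filter rounds off the steps."""
    out = []
    y = stepped[0] if stepped else 0.0
    for x in stepped:
        y += coeff * (x - y)   # move a fixed fraction of the way toward the new level
        out.append(y)
    return out
```

Applying `smooth_levels` to the output of `hold_levels` gives a curve that approaches each new frame level gradually instead of jumping to it; the same treatment could be given to the formant center frequency.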
[0066]
In the present embodiment, formant data is extracted and stored for each predetermined group of musical sound waveform samples out of all the musical sound waveform samples, and a different noise formant sound is generated for each group when generating a musical sound. However, the present invention is not limited to this: only one set of formant data may be extracted and stored for all the musical sound waveform samples, and a noise formant sound common to all the musical sound waveform samples may be generated when generating a musical sound. In this case, it is desirable to detect the level of the musical sound to which the noise formant sound is to be added and to change the center frequency and level of the formants included in the noise formant sound according to the detected level. The characteristic to be detected is not limited to the level of the musical sound and may be another characteristic.
[0067]
In general, when natural sounds are viewed on the frequency axis, the spectrum below about 8 kHz contains the information needed to recognize the sound itself, that is, to tell what kind of sound it is, such as a piano sound, a violin sound, or a human voice. The spectrum from about 8 kHz up to 16 kHz (the audible upper limit), on the other hand, often contains information that gives the sound its naturalness. Even if the spectrum in this region is lost completely, the sound can still be recognized, but much of its naturalness is lost. If the information that gives the sound its naturalness, that is, the frequency spectrum from 8 kHz to 16 kHz, is approximately recovered as in the present embodiment, a more natural musical sound can be generated.
[0068]
Further, in the present embodiment, the musical sound generation apparatus of the present invention is applied to a mobile phone, in particular to its sound source device. This is because an ordinary sound source device, such as one provided in an electronic keyboard instrument, has no need to reduce the storage capacity of the timbre data memory (waveform memory) to the point of sampling musical sounds at a recording sampling frequency lower than the sampling frequency and degrading the reproduced musical sound. Conversely, the present invention is not limited to mobile phones and can be applied to any device that needs to sample musical sounds at a recording sampling frequency lower than the sampling frequency.
[0069]
As described above, in this embodiment, a formant sound having formants that approximate the lost formants is synthesized based on formant data that corresponds to the formants of the frequency components lost when the musical sound waveform data was generated by sampling at a recording sampling frequency lower than the sampling frequency (that is, the frequency components above half the recording sampling frequency) and that is stored in the timbre data memory, and the synthesized formant sound is added to the generated musical sound waveform data. The lost frequency components are thereby approximately recovered, so that a more natural musical sound can be generated.
[0070]
It goes without saying that the object of the present invention can also be achieved by supplying a system or apparatus with a storage medium storing software program code that realizes the functions of the above-described embodiment, and having a computer (or CPU or MPU) of the system or apparatus read and execute the program code stored in the storage medium.
[0071]
In this case, the program code itself read from the storage medium realizes the novel function of the present invention, and the storage medium storing the program code constitutes the present invention.
[0072]
As a storage medium for supplying the program code, for example, a flexible disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a nonvolatile memory card, a ROM, or the like can be used.
[0073]
Further, it goes without saying that the present invention covers not only the case where the functions of the above-described embodiment are realized by the computer executing the read program code, but also the case where an OS or the like running on the computer performs part or all of the actual processing based on the instructions of the program code and the functions of the above-described embodiment are realized by that processing.
[0074]
[Effect of the invention]
As described above, according to the invention described in claim 1 or 3, musical sound waveform data collected by sampling the sound of an instrument at the recording sampling frequency is stored in the timbre data memory, and when the musical sound waveform data is read from the timbre data memory to generate a musical sound, a formant sound having formants that approximate the lost formants is synthesized based on formant data that corresponds to the formants of the frequency range lost by sampling at the recording sampling frequency and that is stored in the timbre data memory, and the synthesized formant sound is added to the read musical sound. The formants in the frequency range lost when sampling at the recording sampling frequency are thereby approximately recovered, so that a more natural musical sound can be generated.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a schematic configuration of a mobile phone to which a musical sound generating device according to an embodiment of the present invention is applied.
FIG. 2 is a flowchart showing a procedure of timbre data memory creation processing;
FIG. 3 is a diagram for explaining a musical sound waveform sample collection process of FIG. 2;
FIG. 4 is a diagram showing a state of each frequency spectrum of a basic musical sound waveform sample, a musical sound waveform sample after downsampling, and a formant after detection;
FIG. 5 is a diagram showing an example of a memory map of a timbre data memory.
FIG. 6 is a block diagram showing a procedure of a musical tone signal generation process performed by the waveform table sound source of FIG. 1.
FIG. 7 is a block diagram showing a procedure of noise formant sound generation processing performed by the noise formant sound generator of FIG. 6.
FIG. 8 is a diagram illustrating a transition of the level of a noise formant sound.
[Explanation of symbols]
6 waveform table sound source (musical sound waveform sample generation means), 25 timbre data memory, 28 noise formant sound generator (synthesizing means), 31 adder (adding means), 104 memory (first memory, second memory)

Claims

1. A musical sound generation method in which musical sound waveform data collected by sampling the sound of an instrument at a recording sampling frequency is stored in a timbre data memory, and the musical sound waveform data is read from the timbre data memory to generate a musical sound, the method comprising:
sampling and collecting the sound of the instrument at a frequency higher than the recording sampling frequency, and storing it in a first memory as first musical sound waveform data;
storing data obtained by down-sampling the first musical sound waveform data stored in the first memory at the recording sampling frequency in a second memory as second musical sound waveform data;
generating formant data from the first musical sound waveform data based on a frequency spectrum in a frequency range lost by down-sampling the first musical sound waveform data at the recording sampling frequency;
storing the second musical sound waveform data and the generated formant data in association with each other in the timbre data memory; and,
when generating a musical sound,
generating a musical sound waveform sample from the second musical sound waveform data stored in the timbre data memory;
synthesizing a formant sound based on the formant data stored in the timbre data memory in association with the second musical sound waveform data; and
adding the synthesized formant sound to the generated musical sound waveform sample and outputting the result.

2. A musical sound generation apparatus that stores, in a timbre data memory, musical sound waveform data collected by sampling the sound of an instrument at a recording sampling frequency, and generates a musical sound by reading the musical sound waveform data from the timbre data memory, the apparatus comprising:
first storage means for sampling and collecting the sound of the instrument at a frequency higher than the recording sampling frequency, and storing it in a first memory as first musical sound waveform data;
second storage means for storing data obtained by down-sampling the first musical sound waveform data stored in the first memory at the recording sampling frequency in a second memory as second musical sound waveform data;
formant data generation means for generating formant data from the first musical sound waveform data based on a frequency spectrum in a frequency range lost by down-sampling the first musical sound waveform data at the recording sampling frequency;
third storage means for storing the second musical sound waveform data and the formant data generated by the formant data generation means in association with each other in the timbre data memory;
musical sound waveform sample generating means for generating a musical sound waveform sample from the second musical sound waveform data stored in the timbre data memory;
synthesizing means for synthesizing a formant sound based on the formant data stored in the timbre data memory in association with the second musical sound waveform data; and
adding means for adding the synthesized formant sound to the generated musical sound waveform sample and outputting the result.

3. The musical sound generation apparatus according to claim 2, wherein the formant data is the center frequency of a formant and its level, and
the synthesizing means includes:
white noise generating means for generating white noise;
changing means for changing the spectral characteristics of the white noise generated by the white noise generating means;
sine wave generating means for generating a sine wave of the center frequency;
multiplying means for multiplying the output from the changing means by the sine wave generated by the sine wave generating means; and
adjusting means for adjusting the level of the output from the multiplying means to the level.