JP4580548B2

JP4580548B2 - Frequency analysis method

Info

Publication number: JP4580548B2
Application number: JP2000397378A
Authority: JP
Inventors: 敏雄茂出木
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2000-12-27
Filing date: 2000-12-27
Publication date: 2010-11-17
Anticipated expiration: 2020-12-27
Also published as: JP2002196796A

Abstract

PROBLEM TO BE SOLVED: To provide a frequency analysis method, and an acoustic signal encoding method, with which an acoustic signal obtained from a performance recording or the like can be reproduced accurately. SOLUTION: A time-series signal is extracted as a section signal (S3), and the correlation between the section signal and each of two or more standard frequencies is determined and stored in a correlation array (S4). The largest correlation value in the correlation array is then selected and added to the intensity value of the corresponding standard frequency in an intensity array (S6). A contained signal, generated as the product of the selected maximum correlation value and the corresponding standard periodic function, is subtracted from the section signal, and the result becomes the new section signal (S7). The correlation of step S4 is then recomputed against the new section signal, and the stored correlation value of the standard frequency is updated (S8). The intensity array obtained by repeating steps S6 to S8 is taken as the intensity values of all the standard frequencies in the unit section.

Description

【０００１】
【産業上の利用分野】
本発明は、放送メディア（ラジオ、テレビ）、通信メディア（ＣＳ映像・音声配信、インターネット音楽配信、通信カラオケ）、パッケージメディア（ＣＤ、ＭＤ、カセット、ビデオ、ＬＤ、ＣＤ−ＲＯＭ、ゲームカセット、携帯音楽プレーヤ向け固体メモリ媒体）などで提供する各種オーディオコンテンツの制作、並びに、専用携帯音楽プレーヤ、携帯電話・ＰＨＳ・ポケベルなどに向けたボーカルを含む音楽コンテンツ、歌舞伎・能・読経・詩歌など文芸作品の音声素材または語学教育音声教材のＭＩＤＩ伝送に利用するのに好適な音響信号の符号化技術に関する。
【０００２】
【従来の技術】
音響信号に代表される時系列信号には、その構成要素として複数の周期信号が含まれている。このため、与えられた時系列信号にどのような周期信号が含まれているかを解析する手法は、古くから知られている。例えば、フーリエ解析は、与えられた時系列信号に含まれる周波数成分を解析するための方法として広く利用されている。
【０００３】
このような時系列信号の解析方法を利用すれば、音響信号を符号化することも可能である。コンピュータの普及により、原音となるアナログ音響信号を所定のサンプリング周波数でサンプリングし、各サンプリング時の信号強度を量子化してデジタルデータとして取り込むことが容易にできるようになってきており、こうして取り込んだデジタルデータに対してフーリエ解析などの手法を適用し、原音信号に含まれていた周波数成分を抽出すれば、各周波数成分を示す符号によって原音信号の符号化が可能になる。
【０００４】
一方、電子楽器による楽器音を符号化しようという発想から生まれたＭＩＤＩ（Musical Instrument Digital Interface）規格も、パーソナルコンピュータの普及とともに盛んに利用されるようになってきている。このＭＩＤＩ規格による符号データ（以下、ＭＩＤＩデータという）は、基本的には、楽器のどの鍵盤キーを、どの程度の強さで弾いたか、という楽器演奏の操作を記述したデータであり、このＭＩＤＩデータ自身には、実際の音の波形は含まれていない。そのため、実際の音を再生する場合には、楽器音の波形を記憶したＭＩＤＩ音源が別途必要になるが、その符号化効率の高さが注目を集めており、ＭＩＤＩ規格による符号化および復号化の技術は、現在、パーソナルコンピュータを用いて楽器演奏、楽器練習、作曲などを行うソフトウェアに広く採り入れられている。
【０００５】
そこで、音響信号に代表される時系列信号に対して、所定の手法で解析を行うことにより、その構成要素となる周期信号を抽出し、抽出した周期信号をＭＩＤＩデータを用いて符号化しようとする提案がなされている。例えば、特開平１０−２４７０９９号公報、特開平１１−７３１９９号公報、特開平１１−７３２００号公報、特開平１１−９５７５３号公報、特開２０００−９９００９号公報、特開２０００−９９０９２号公報、特開２０００−９９０９３号公報、特開２０００−２６１３２２号公報、特願平１１−１７７８７５号明細書、特願平１１−３２９２９７号明細書、特願２０００−６８５２１号明細書には、任意の時系列信号について、構成要素となる周波数を解析し、その解析結果からＭＩＤＩデータを作成することができる種々の方法が提案されている。
【０００６】
【発明が解決しようとする課題】
上記各公報または明細書において提案してきたＭＩＤＩ符号化方式により、演奏録音等から得られる音響信号の効率的な符号化が可能になった。特に、特開２０００−２６１３２２号公報に記載の符号化方法では、一般化調和解析を用いることにより、それまでの短時間フーリエ変換を用いた手法に比較して、周波数分解能を著しく向上させることに成功した。この一般化調和解析による手法は、短時間フーリエ変換で計算上発生する擬似周波数成分を抑圧する効果がある。さらに、特願２０００−６８５２１号明細書においては、短時間フーリエ変換、一般化調和解析双方の結果を比較することにより真の周波数成分を抽出する方法を提案している。
【０００７】
上記先願発明における一般化調和解析を利用した符号化方法は、与えられた音響信号から相関が最大となる標準周波数の含有信号成分を１つづつ削除していくことにより、原音響信号を近似する含有信号成分の集合を決定し、各含有信号に対応する符号データの集合により、原音響信号を表現するものである。上記先願発明においては、解析対象となる音響信号から含有信号成分を削除した後、当該含有成分が有する標準周波数は、次回以降の含有信号成分として抽出する対象から外されるため、既に抽出された含有信号成分と同一周波数である含有信号成分については、本来抽出すべきものであっても、抽出されなかった。このような状態で符号化を行なうと、本来の演奏からは、音程が数半音程度ずれてしまうこともある。
【０００８】
上記のような点に鑑み、本発明は、演奏録音等から得られる音響信号を正確に再現することが可能な周波数解析方法および音響信号の符号化方法を提供することを課題とする。
【０００９】
【課題を解決するための手段】
上記課題を解決するため、本発明では、解析すべき周波数を標準周波数として設定すると共に、各標準周波数に対応する周期関数を標準周期関数として準備し、各標準周波数に対応する相関値を格納するための相関配列、および前記各標準周波数に対応する強度値を格納するための強度配列を準備し、与えられた音響信号の時間軸上に複数の単位区間を設定し、各単位区間ごとの音響信号を区間信号として抽出し、抽出された区間信号に対して、複数の標準周期関数との相関を求めることにより、各標準周波数に対応する相関値を算出し、相関配列に各相関値を格納し、強度配列の全ての値を０に設定し、相関配列中で最大の相関値を選出し、強度配列中の対応する標準周波数の強度値に加算し、選出された最大相関値と対応する標準周期関数の積で生成される含有信号を前記区間信号から減じることにより差分信号を算出し、算出された差分信号を新たな区間信号とすることにより区間信号を更新し、区間信号と含有信号の構成要素となった標準周期関数との相関値を算出し、算出された相関値を対応する標準周波数の相関値として更新した後、当該更新された相関配列中で最大の相関値に対応する標準周期関数を選出し、区間信号と当該標準周期関数との相関値を対応する標準周波数の相関値として更新し、前記強度値の加算、前記区間信号の更新、前記相関値の更新を繰り返し実行することにより得られる強度配列を、その単位区間における全標準周波数各々に対応する強度値として決定する。さらに、本発明ではこのような強度配列の決定を、全単位区間に対して行なうことにより、音響信号の全単位区間に対する強度配列を得るようにしたことを特徴とする。本発明によれば、各標準周波数に対応した相関値、強度値を格納する相関配列、強度配列を用意しておき、相関値が最大となった標準周波数について、その相関値をそのまま、その標準周波数の強度情報とはせずに、強度値として強度配列に格納しておき、再び同一の標準周波数の相関値が最大となった場合に、その相関値を強度値として加算するようにしたので、音響信号における各周波数成分の強度情報をより正確に求めることが可能となり、複数の楽器音が混在するような音響信号に対しても、原音により忠実な再現を行なうことが可能となる。
【００１０】
【発明の実施の形態】
以下、本発明の実施形態について図面を参照して詳細に説明する。
【００１１】
（周波数解析、音響信号符号化方法の基本原理）
はじめに、本発明に係る周波数解析方法、音響信号の符号化方法の基本原理を述べておく。この基本原理は、前掲の各公報あるいは明細書に開示されているので、ここではその概要のみを簡単に述べることにする。
【００１２】
図１（ａ）に示すように、時系列信号としてアナログ音響信号が与えられたものとする。図１の例では、横軸に時間ｔ、縦軸に振幅（強度）をとって、この音響信号を示している。ここでは、まずこのアナログ音響信号を、デジタルの音響データとして取り込む処理を行う。これは、従来の一般的なＰＣＭの手法を用い、所定のサンプリング周波数でこのアナログ音響信号をサンプリングし、振幅を所定の量子化ビット数を用いてデジタルデータに変換する処理を行えば良い。ここでは、説明の便宜上、ＰＣＭの手法でデジタル化した音響データの波形も図１（ａ）のアナログ音響信号と同一の波形で示すことにする。
【００１３】
続いて、この解析対象となる音響信号の時間軸上に、複数の単位区間を設定する。図１（ａ）に示す例では、時間軸ｔ上に等間隔に６つの時刻ｔ１〜ｔ６が定義され、これら各時刻を始点および終点とする５つの単位区間ｄ１〜ｄ５が設定されている。図１の例では、全て同一の区間長をもった単位区間が設定されているが、個々の単位区間ごとに区間長を変えるようにしてもかまわない。あるいは、隣接する単位区間が時間軸上で部分的に重なり合うような区間設定を行ってもかまわない。
【００１４】
こうして単位区間が設定されたら、各単位区間ごとの音響信号（以下、区間信号と呼ぶことにする）について、それぞれ代表周波数を選出する。各区間信号には、通常、様々な周波数成分が含まれているが、例えば、その中で成分の強度割合の大きな周波数成分を代表周波数として選出すれば良い。ここで、代表周波数とはいわゆる基本周波数が一般的であるが、音声のフォルマント周波数などの倍音周波数や、ノイズ音源のピーク周波数も代表周波数として扱うことがある。代表周波数は１つだけ選出しても良いが、音響信号によっては複数の代表周波数を選出した方が、より精度の高い符号化が可能になる。図１（ｂ）には、個々の単位区間ごとにそれぞれ３つの代表周波数を選出し、１つの代表周波数を１つの代表符号（図では便宜上、音符として示してある）として符号化した例が示されている。ここでは、代表符号（音符）を収容するために３つのトラックＴ１，Ｔ２，Ｔ３が設けられているが、これは個々の単位区間ごとに選出された３つずつの代表符号を、それぞれ異なるトラックに収容するためである。
【００１５】
例えば、単位区間ｄ１について選出された代表符号ｎ（ｄ１，１），ｎ（ｄ１，２），ｎ（ｄ１，３）は、それぞれトラックＴ１，Ｔ２，Ｔ３に収容されている。ここで、各符号ｎ（ｄ１，１），ｎ（ｄ１，２），ｎ（ｄ１，３）は、ＭＩＤＩ符号におけるノートナンバーを示す符号である。ＭＩＤＩ符号におけるノートナンバーは、０〜１２７までの１２８通りの値をとり、それぞれピアノの鍵盤の１つのキーを示すことになる。具体的には、例えば、代表周波数として４４０Ｈｚが選出された場合、この周波数はノートナンバーｎ＝６９（ピアノの鍵盤中央の「ラ音（Ａ３音）」に対応）に相当するので、代表符号としては、ｎ＝６９が選出されることになる。もっとも、図１（ｂ）は、上述の方法によって得られる代表符号を音符の形式で示した概念図であり、実際には、各音符にはそれぞれ強度に関するデータも付加されている。例えば、トラックＴ１には、ノートナンバーｎ（ｄ１，１），ｎ（ｄ２，１）・・・なる音高を示すデータとともに、ｅ（ｄ１，１），ｅ（ｄ２，１）・・・なる強度を示すデータが収容されることになる。この強度を示すデータは、各代表周波数の成分が、元の区間信号にどの程度の度合いで含まれていたかによって決定される。具体的には、各代表周波数をもった周期関数の区間信号に対する相関値に基づいて強度を示すデータが決定されることになる。また、図１（ｂ）に示す概念図では、音符の横方向の位置によって、個々の単位区間の時間軸上での位置が示されているが、実際には、この時間軸上での位置を正確に数値として示すデータが各音符に付加されていることになる。
【００１６】
音響信号を符号化する形式としては、必ずしもＭＩＤＩ形式を採用する必要はないが、この種の符号化形式としてはＭＩＤＩ形式が最も普及しているため、実用上はＭＩＤＩ形式の符号データを用いるのが好ましい。ＭＩＤＩ形式では、「ノートオン」データもしくは「ノートオフ」データが、「デルタタイム」データを介在させながら存在する。「ノートオン」データは、特定のノートナンバーＮとベロシティーＶを指定して特定の音の演奏開始を指示するデータであり、「ノートオフ」データは、特定のノートナンバーＮとベロシティーＶを指定して特定の音の演奏終了を指示するデータである。また、「デルタタイム」データは、所定の時間間隔を示すデータである。ベロシティーＶは、例えば、ピアノの鍵盤などを押し下げる速度（ノートオン時のベロシティー）および鍵盤から指を離す速度（ノートオフ時のベロシティー）を示すパラメータであり、特定の音の演奏開始操作もしくは演奏終了操作の強さを示すことになる。
【００１７】
前述の方法では、第ｉ番目の単位区間ｄｉについて、代表符号としてＪ個のノートナンバーｎ（ｄｉ，１），ｎ（ｄｉ，２），・・・，ｎ（ｄｉ，Ｊ）が得られ、このそれぞれについて強度ｅ（ｄｉ，１），ｅ（ｄｉ，２），・・・，ｅ（ｄｉ，Ｊ）が得られる。そこで、次のような手法により、ＭＩＤＩ形式の符号データを作成することができる。まず、「ノートオン」データもしくは「ノートオフ」データの中で記述するノートナンバーＮとしては、得られたノートナンバーｎ（ｄｉ，１），ｎ（ｄｉ，２），・・・，ｎ（ｄｉ，Ｊ）をそのまま用いれば良い。一方、「ノートオン」データもしくは「ノートオフ」データの中で記述するベロシティーＶとしては、得られた強度ｅ（ｄｉ，１），ｅ（ｄｉ，２），・・・，ｅ（ｄｉ，Ｊ）を所定の方法で規格化した値を用いれば良い。また、「デルタタイム」データは、各単位区間の長さに応じて設定すれば良い。
【００１８】
（周期関数との相関を求める具体的な方法）
上述した基本原理の基づく方法では、区間信号に対して、１つまたは複数の代表周波数が選出され、この代表周波数をもった周期信号によって、当該区間信号が表現されることになる。ここで、選出される代表周波数は、文字どおり、当該単位区間内の信号成分を代表する周波数である。この代表周波数を選出する具体的な方法には、後述するように、短時間フーリエ変換を利用する方法と、一般化調和解析の手法を利用する方法とがある。いずれの方法も、基本的な考え方は同じであり、あらかじめ周波数の異なる複数の周期関数を用意しておき、これら複数の周期関数の中から、当該単位区間内の区間信号に対する相関が高い周期関数を見つけ出し、この相関の高い周期関数の周波数を代表周波数として選出する、という手法を採ることになる。すなわち、代表周波数を選出する際には、あらかじめ用意された複数の周期関数と、単位区間内の区間信号との相関を求める演算を行うことになる。そこで、ここでは、周期関数との相関を求める具体的な方法を述べておく。
【００１９】
複数の周期関数として、図２に示すような三角関数が用意されているものとする。これらの三角関数は、同一周波数をもった正弦関数と余弦関数との対から構成されており、１２８通りの標準周波数ｆ（０）〜ｆ（１２７）のそれぞれについて、正弦関数および余弦関数の対が定義されていることになる。ここでは、同一の周波数をもった正弦関数および余弦関数からなる一対の関数を、当該周波数についての周期関数として定義することにする。すなわち、ある特定の周波数についての周期関数は、一対の正弦関数および余弦関数によって構成されることになる。このように、一対の正弦関数と余弦関数とにより周期関数を定義するのは、信号に対する周期関数の相関値を求める際に、相関値が位相の影響を受ける事を考慮するためである。なお、図２に示す各三角関数内の変数Ｆおよびｋは、区間信号Ｘについてのサンプリング周波数Ｆおよびサンプル番号ｋに相当する変数である。例えば、周波数ｆ（０）についての正弦波は、ｓｉｎ（２πｆ（０）ｋ／Ｆ）で示され、任意のサンプル番号ｋを与えると、区間信号を構成する第ｋ番目のサンプルと同一時間位置における周期関数の振幅値が得られる。
【００２０】
ここでは、１２８通りの標準周波数ｆ（０）〜ｆ（１２７）を図３に示すような式で定義した例を示すことにする。すなわち、第ｎ番目（０≦ｎ≦１２７）の標準周波数ｆ（ｎ）は、以下の（数式１）で定義されることになる。
【００２１】
（数式１）
$f(n) = 440 \times 2^{\gamma(n)}$
$\gamma(n) = (n - 69) / 12$
【００２２】
このような式によって標準周波数を定義しておくと、最終的にＭＩＤＩデータを用いた符号化を行う際に便利である。なぜなら、このような定義によって設定される１２８通りの標準周波数ｆ（０）〜ｆ（１２７）は、等比級数をなす周波数値をとることになり、ＭＩＤＩデータで利用されるノートナンバーに対応した周波数になるからである。したがって、図２に示す１２８通りの標準周波数ｆ（０）〜ｆ（１２７）は、対数尺度で示した周波数軸上に等間隔（ＭＩＤＩにおける半音単位）に設定した周波数ということになる。このため、本願では、図に掲載するグラフにおけるノートナンバー軸を、いずれも対数尺度で示すことにする。
【００２３】
続いて、任意の区間の区間信号に対する各周期関数の相関の求め方について、具体的な説明を行う。例えば、図４に示すように、ある単位区間ｄについて区間信号Ｘが与えられていたとする。ここでは、区間長Ｌをもった単位区間ｄについて、サンプリング周波数Ｆでサンプリングが行なわれており、全部でｗ個のサンプル値が得られているものとし、サンプル番号を図示のように、０，１，２，３，・・・，ｋ，・・・，ｗ−２，ｗ−１とする（白丸で示す第ｗ番目のサンプルは、右に隣接する次の単位区間の先頭に含まれるサンプルとする）。この場合、任意のサンプル番号ｋについては、Ｘ（ｋ）なる振幅値がデジタルデータとして与えられていることになる。短時間フーリエ変換においては、Ｘ（ｋ）に対して各サンプルごとに中央の重みが１に近く、両端の重みが０に近くなるような窓関数Ｗ（ｋ）を乗ずることが通常である。すなわち、Ｘ（ｋ）×Ｗ（ｋ）をＸ（ｋ）と扱って以下のような相関計算を行うもので、窓関数の形状としては余弦波形状のハミング窓が一般に用いられている。ここで、ｗは以下の記述においても定数のような記載をしているが、一般にはｎの値に応じて変化させ、区間長Ｌを超えない範囲で最大となるＦ／ｆ（ｎ）の整数倍の値に設定することが望ましい。
【００２４】
このような区間信号Ｘに対して、第ｎ番目の標準周波数ｆ（ｎ）をもった正弦関数Ｒｎとの相関値を求める原理を示す。両者の相関値Ａ（ｎ）は、図５の第１の演算式によって定義することができる。ここで、Ｘ（ｋ）は、図４に示すように、区間信号Ｘにおけるサンプル番号ｋの振幅値であり、ｓｉｎ（２πｆ（ｎ）ｋ／Ｆ）は、時間軸上での同位置における正弦関数Ｒｎの振幅値である。この第１の演算式は、単位区間ｄ内の全サンプル番号ｋ＝０〜ｗ−１の次元について、それぞれ区間信号Ｘの振幅値と正弦関数Ｒｎの振幅ベクトルの内積を求める式ということができる。
【００２５】
同様に、図５の第２の演算式は、区間信号Ｘと、第ｎ番目の標準周波数ｆ（ｎ）をもった余弦関数との相関値を求める式であり、両者の相関値はＢ（ｎ）で与えられる。なお、相関値Ａ（ｎ）を求めるための第１の演算式も、相関値Ｂ（ｎ）を求めるための第２の演算式も、最終的に２／ｗが乗ぜられているが、これは相関値を規格化するためのものであり、前述のとおりｗはｎに依存して変化させるのが一般的であるため、この係数もｎに依存する変数である。
【００２６】
区間信号Ｘと標準周波数ｆ（ｎ）をもった標準周期関数との相関実効値は、図５の第３の演算式に示すように、正弦関数との相関値Ａ（ｎ）と余弦関数との相関値Ｂ（ｎ）との二乗和平方根値Ｅ（ｎ）によって示すことができる。この相関実効値の大きな標準周期関数の周波数を代表周波数として選出すれば、この代表周波数を用いて区間信号Ｘを符号化することができる。
【００２７】
すなわち、この相関値Ｅ（ｎ）が所定の基準以上の大きさとなる１つまたは複数の標準周波数を代表周波数として選出すれば良い。なお、ここで「相関値Ｅ（ｎ）が所定の基準以上の大きさとなる」という選出条件は、例えば、何らかの閾値を設定しておき、相関値Ｅ（ｎ）がこの閾値を超えるような標準周波数ｆ（ｎ）をすべて代表周波数として選出する、という絶対的な選出条件を設定しても良いが、例えば、相関値Ｅ（ｎ）の大きさの順にＱ番目までを選出する、というような相対的な選出条件を設定しても良い。
【００２８】
（一般化調和解析の手法）
ここでは、本発明に係る音響信号の符号化を行う際に有用な一般化調和解析の手法について説明する。既に説明したように、音響信号を符号化する場合、個々の単位区間内の区間信号について、相関値の高いいくつかの代表周波数を選出することになる。一般化調和解析は、より高い精度で代表周波数の選出を可能にする手法であり、その基本原理は次の通りである。
【００２９】
図６（ａ）に示すような単位区間ｄについて、信号Ｓ（ｊ）なるものが存在するとする。ここで、ｊは後述するように、繰り返し処理のためのパラメータである（ｊ＝１〜Ｊ）。まず、この信号Ｓ（ｊ）に対して、図２に示すような１２８通りの周期関数すべてについての相関値を求める。そして、最大の相関値が得られた１つの周期関数の周波数を代表周波数として選出し、当該代表周波数をもった周期関数を要素関数として抽出する。続いて、図６（ｂ）に示すような含有信号Ｇ（ｊ）を定義する。この含有信号Ｇ（ｊ）は、抽出された要素関数に、その振幅として、当該要素関数の信号Ｓ（ｊ）に対する相関値を乗じることにより得られる信号である。例えば、周期関数として図２に示すように、一対の正弦関数と余弦関数とを用い、周波数ｆ（ｎ）が代表周波数として選出された場合、振幅Ａ（ｎ）をもった正弦関数Ａ（ｎ）ｓｉｎ（２πｆ（ｎ）ｋ／Ｆ）と、振幅Ｂ（ｎ）をもった余弦関数Ｂ（ｎ）ｃｏｓ（２πｆ（ｎ）ｋ／Ｆ）との和からなる信号が含有信号Ｇ（ｊ）ということになる（図６（ｂ）では、図示の便宜上、一方の関数しか示していない）。ここで、Ａ（ｎ），Ｂ（ｎ）は、図５の式で得られる規格化された相関値であるから、結局、含有信号Ｇ（ｊ）は、信号Ｓ（ｊ）内に含まれている周波数ｆ（ｎ）をもった信号成分ということができる。
【００３０】
こうして、含有信号Ｇ（ｊ）が求まったら、信号Ｓ（ｊ）から含有信号Ｇ（ｊ）を減じることにより、差分信号Ｓ（ｊ＋１）を求める。図６（ｃ）は、このようにして求まった差分信号Ｓ（ｊ＋１）を示している。この差分信号Ｓ（ｊ＋１）は、もとの信号Ｓ（ｊ）の中から、周波数ｆ（ｎ）をもった信号成分を取り去った残りの信号成分からなる信号ということができる。そこで、パラメータｊを１だけ増加させることにより、この差分信号Ｓ（ｊ＋１）を新たな信号Ｓ（ｊ）として取り扱い、同様の処理を、パラメータｊをｊ＝１〜Ｊまで１ずつ増やしながらＪ回繰り返し実行すれば、Ｊ個の代表周波数を選出することができる。
【００３１】
このような相関計算の結果として出力されるＪ個の含有信号Ｇ（１）〜Ｇ（Ｊ）は、もとの区間信号Ｘの構成要素となる信号であり、もとの区間信号Ｘを符号化する場合には、これらＪ個の含有信号の周波数を示す情報および振幅（強度）を示す情報を符号データとして用いるようにすれば良い。尚、Ｊは代表周波数の個数であると説明してきたが、標準周波数ｆ（ｎ）の個数と同一すなわちＪ＝１２８であってもよく、周波数スペクトルを求める目的においてはそのように行うのが通例である。
【００３２】
（本発明に係る周波数解析方法および音響信号の符号化方法）
上述のように一般化調和解析では、Ｊ個の含有信号Ｇ（１）〜Ｇ（Ｊ）を符号データとして表現することにより、音響信号の符号化を行なっている。この際、含有信号Ｇ（ｊ）の構成要素となる要素関数、すなわち周期関数は、差分信号Ｓ（ｊ＋１）以降の差分信号との相関値を求める対象から外される。これは、その周波数を有する信号成分は１つだけであるとみなしているためである。
【００３３】
ところが、原音響信号に複数の楽器音が混在する場合、同一周波数の信号成分が複数存在することがある。このような音響信号に対して従来の一般化調和解析の手法を用いて符号化を行なうと、同一周波数については、最大の楽器音に対応する強度情報だけが得られることになり、他の楽器音についての強度情報は無視されることとなる。そこで、本発明では、含有信号Ｇ（ｊ）を構成する周期関数を、差分信号Ｓ（ｊ＋１）以降についても、相関値算出の対象として残すことにより、同一周波数について正確な強度情報を得るようにしている。
【００３４】
続いて、本発明に係る周波数解析方法を図７のフローチャートを用いて具体的に説明する。まず、標準周波数を有する標準周期関数の準備を行なう（ステップＳ１）。本実施形態では、図２に示したように１２８個の標準周期関数を用意する。続いて、準備された標準周期関数と同数の値を格納可能な配列を用意する（ステップＳ２）。本実施形態では、１２８個の標準周期関数に対応するため、配列も各標準周波数に対応した１２８個分の数値が格納可能となっている。このような配列をここでは２つ用意する。１つは相関値を格納するための相関配列、もう１つは強度値を格納する強度配列である。今までの一般化調和解析を利用した手法では、算出された相関値のみを用いて符号データの強度情報としたが、本発明では、同一標準周波数について複数回算出される相関値を強度値として加算し、最終的にこの強度値を符号データの強度情報とすることに特徴がある。
【００３５】
標準周期関数、相関配列、強度配列が準備されたら、図１（ａ）に示したように、時系列信号に対して単位区間を設定して、各単位区間における時系列信号を区間信号として抽出する（ステップＳ３）。次に、区間信号に対して標準周期関数との相関を求め、各相関値を相関配列の各標準周波数に対応する個所に格納する（ステップＳ４）。続いて、強度配列の全ての値を０に設定して初期化する（ステップＳ５）。この強度配列の初期化は各単位区間に対して１回だけ行なわれるものである。続いて、相関配列中の１２８個の相関値の中で最大となる相関値を選出する。さらに選出された相関値は、強度配列中の対応する標準周波数の強度値として加算される（ステップＳ６）。例えば、図８（ａ）に示すように相関配列の中でノートナンバー１００（標準周波数ｆ（１００））に対応する相関値が「７５」で最大であったとする。すると、この相関値「７５」は、強度配列中のノートナンバー１００に対応する強度値「０」に加算されて格納される。この時点の強度配列の状態を図８（ｂ）に示す。
【００３６】
続いて、標準周波数ｆ（１００）に対応する標準周期関数と、標準周波数ｆ（１００）に対応する相関値の積で生成される含有信号Ｇ（１）（この時点ではｊ＝１）を区間信号Ｓ（１）（＝区間信号Ｘ）から減じることにより、差分信号Ｓ（２）を算出する。この手順は前述の図６を用いて説明した手順と同様である。
この差分信号Ｓ（２）は新たな区間信号Ｓ（２）として定義されることになる（ステップＳ７）。
【００３７】
次に、区間信号Ｓ（２）と、含有信号Ｇ（１）の構成要素である標準周波数ｆ（１００）の標準周期関数との相関値を算出し、相関配列中の、標準周波数ｆ（１００）に対応する相関値を更新する（ステップＳ８）。これは、区間信号Ｓ（２）は、区間信号Ｓ（１）との相関が高い信号成分である含有信号Ｇ（１）が減算されたものであるため、含有信号Ｇ（１）の構成要素である標準周波数ｆ（１００）に関する相関値が大きく変化しているために行なわれる。本来は、区間信号が更新されているため、区間信号Ｓ（２）と１２８個全ての標準周期関数との相関値を算出すべきであるが、大きく変化しているのは、減算された含有信号Ｇ（１）の構成要素である標準周波数ｆ（１００）に対応する成分であるので、演算処理の負荷を軽減するため、１つの標準周波数に対してのみ行なう。この結果、相関配列は、図８（ｃ）に示すような状態となったものとする。図８（ｃ）においては、標準周波数ｆ（１００）に対応する相関値が「１０」となり、大幅に減少していることがわかる。
【００３８】
続いて、更新された相関配列の中で最大の相関値「４５」をとる標準周波数ｆ（５０）の標準周期関数と、区間信号Ｓ（２）の相関値を算出し、算出された相関値を相関配列中の、標準周波数ｆ（５０）の相関値として格納する。上述のように、区間信号Ｓ（２）と１２８個全ての標準周期関数との相関値を算出して、相関配列中の相関値１２８個全てを更新しておけば、ここでは新たに相関値の算出をする必要はない。ところが、区間信号Ｓ（２）は、区間信号Ｓ（１）から標準周波数ｆ（１００）のみを有する含有信号Ｇ（１）を減算したものであるため、他の標準周波数に関する成分については、大きな変化はない。そこで、演算量の削減のため、現在の相関配列の中で最大値をとる標準周期関数を、区間信号Ｓ（２）についても相関が最大であるとみなす。ただし、正確な相関値を得るために、当該標準周期関数と区間信号Ｓ（２）の相関値を演算するのである。この結果、相関配列は、図８（ｄ）に示すような状態となる。図８（ｄ）においては、標準周波数ｆ（５０）に対応する相関値が「４７」となり、若干変化していることがわかる。
【００３９】
このようにしてステップＳ７の処理により新たに得られた区間信号Ｓ（２）と、ステップＳ８の処理により更新された相関配列を用いて、ステップＳ６に戻り、上記区間信号Ｓ（１）に対して行なったのと同様な処理を行う。相関配列が図８（ｄ）に示したような状態である場合には、図８（ｅ）に示すように強度配列の標準周波数ｆ（５０）に対応する強度値に「４７」が加算されることになる。
【００４０】
ステップＳ６〜ステップＳ８の処理を設定されたパラメータｊ（ｊ＝１〜Ｊ）分繰り返し行なっていき、ｊ回分処理が実行された時点での強度配列に格納された強度値を出力するようにする。１つの単位区間についての処理が終了したら、時系列順に他の単位区間についても同様の処理を実行する。以上のような周波数解析を行なうことにより、同一周波数について位相違いの複数の成分を含むような時系列信号であっても、各周波数の強度をより正確に算出することが可能となる。
【００４１】
このようにして、時系列信号として音響信号を用い、周波数解析結果に基づいて符号化を行なう場合には、各単位区間について、強度配列に格納された強度値が「０」でないものについて、対応する標準周波数を「音の高さを示す情報」、各標準周波数の強度値を「音の強さを示す情報」、当該単位区間の始点を「音の発音開始時刻を示す情報」、当該単位区間の終点を「音の発音終了時刻を示す情報」として、４つの情報を含む符号データを作成すれば、当該単位区間内の区間信号Ｘを所定数の符号データにより符号化することができる。符号データとして、ＭＩＤＩデータを作成するのであれば、「音の高さを示す情報」としてノートナンバーを用い、「音の強さを示す情報」としてベロシティーを用い、「音の発音開始時刻を示す情報」としてノートオン時刻を用い、「音の発音終了時刻を示す情報」としてノートオフ時刻を用いるようにすれば良い。上記説明のように本発明に係る音響信号の符号化方法によれば、複数の楽器音が混在するような、同一周波数に位相の異なる複数の信号成分が存在する原音響信号に対して、その複数の成分を加算した強度値として音の強さを示す情報に反映させることができ、結果として原音響信号をより忠実に再現することが可能になる。
【００４２】
（相関値の適正評価法１、隣接成分からの影響）
上記ステップＳ１からステップＳ８までの処理を実行することにより、精度の高い周波数解析を行なうことが可能となるが、次に、さらに精度を高めるために得られた相関値の適正な評価法を３つ説明する。まず、ここでは、隣接する標準周波数成分から受ける影響を考慮した手法について説明する。この場合、上記ステップＳ４の相関演算段階で各標準周波数に対する相関値を求め、相関配列にセットした後、各相関値が正しいものか否かを確認する処理を行う。
【００４３】
具体的には、まず、ある標準周波数を対象標準周波数として着目し、この対象標準周波数に隣接する標準周波数を隣接標準周波数とする。例えば、対象標準周波数をｆ（３）とした場合、隣接標準周波数はｆ（４）となる。続いて、隣接標準周波数の隣接標準周期関数と、この隣接標準周波数に対応する相関値の積により得られる含有信号を、区間信号から減算することにより差分信号を算出する。
次に、算出された差分信号と、対象標準周波数を有する対象標準周期関数との相関値を算出する。続いて、この対象標準周波数について算出された相関値と、相関配列に既に格納されている対象標準周波数に対応する相関値との差を求め、この差があらかじめ設定された閾値より大きい場合、相関配列中の対象標準周波数に対応する相関値を「０」とする。一般に、多数の異なる周波数成分を有する時系列信号においては、近接する周波数成分が互いに影響し合っており、そのため、単純に周波数解析を行なうと、本来は存在していない周波数成分が検出されてしまうことがある。本実施形態では、隣接する標準周波数成分を削除した状態で、再度対象とする標準周波数成分の相関値を求め、それが隣接する標準周波数成分の削除前と大きく変化しているか否かを調べることにより、その対象とする標準周波数成分の相関値が本物であるかどうかを確認することができる。
【００４４】
（相関値の適正評価法２、隣接成分への影響）
上記の例では、隣接標準周波数成分から受ける影響を考慮して、対象となる標準周波数成分の相関値が本物であるかどうかを確認した。次に、このような手法に代えて、対象となる標準周波数成分から隣接する標準周波数成分へ与える影響を考慮して、対象となる標準周波数成分の相関値が本物であるかどうかを確認する手法について説明する。この場合も、上記ステップＳ４の相関演算段階で各標準周波数に対する相関値を求め、相関配列にセットした後、各相関値が正しいものか否かを確認する処理を行うことになる。
【００４５】
具体的には、上記の例と同様に、ある標準周波数を対象標準周波数として着目し、この対象標準周波数に隣接する標準周波数を隣接標準周波数とする。すなわち、対象標準周波数をｆ（３）とした場合、隣接標準周波数はｆ（４）となる。
続いて、対象標準周波数の対象標準周期関数と、この対象標準周波数に対応する相関値の積により得られる含有信号を、区間信号から減算することにより差分信号を算出する。次に、算出された差分信号と、隣接標準周波数を有する隣接標準周期関数との相関値を算出する。続いて、この隣接標準周波数について算出された相関値と、相関配列に既に格納されている隣接標準周波数に対応する相関値との差を求め、この差があらかじめ設定された閾値より小さい場合、相関配列中の対象標準周波数に対応する相関値を「０」とする。この手法によれば、対象とする標準周波数成分を削除した状態で、再度隣接する標準周波数成分の相関値を求め、それが対象とする標準周波数成分の削除前と大きく変化しているか否かを調べることにより、対象とする標準周波数成分の相関値が本物であるかどうかを確認することができる。
【００４６】
（相関値の適正評価法３、微細周波数の偏り）
次に、上記のような標準周波数よりもさらに狭い間隔で周波数を定義するとともに定義された周波数に対応する周期関数を用意し、これらの周期関数と区間信号との相関値を利用して精度の高い周波数解析を行なう手法について説明する。
このような狭い間隔で定義された周波数を本明細書では、微細周波数と呼び、微細周波数に対応する周期関数を微細周期関数と呼ぶことにする。微細周波数としては、隣接する標準周波数間に所定数設定される。また、微細周波数間の間隔は、標準周波数の場合と同様に等比級数となるように設定される。ここで、微細周波数を各標準周波数間に１２個設定した例を図９に示す。図９に示すように、標準周波数ｆ（ｎ）と標準周波数ｆ（ｎ＋１）の間に微細周波数ｆ（ｎ＋１／１３）〜微細周波数ｆ（ｎ＋１２／１３）の１２個が設定されている。図９中、ノートナンバーｎ＋６／１３とノートナンバーｎ＋７／１３の間の点線は、各標準周波数とみなされる微細周波数の範囲を示す。この「各標準周波数とみなされる」とは、各標準周波数に対応する相関値として相関配列に格納されることを示している。図９においては、ノートナンバーｎ＋６／１３まではノートナンバーｎの周波数範囲とみなされ、ノートナンバーｎ＋７／１３からはノートナンバーｎ＋１とみなされることを示している。
【００４７】
このような微細周波数に対して図２に示した標準周期関数と同じ形式の微細周期関数が用意され、各微細周期関数と区間信号との相関値が算出される。そして各標準周波数の範囲となる１３個（１個の標準周波数と前後６個ずつの微細周波数）の周波数の相関値のうち、最大のものがその標準周波数の相関値として相関配列に格納される。例えば、ノートナンバーｎ（標準周波数ｆ（ｎ））の相関値は、ノートナンバーｎ−６／１３からノートナンバーｎ＋６／１３までの１３個の相関値で最大のものが設定されることになる。ここで、最大の相関値を有する周波数がノートナンバーｎ−６／１３や、ノートナンバーｎ＋６／１３のように標準周波数範囲の端に位置する場合には、対象とする標準周波数と隣接する標準周波数の境界部に存在する成分である可能性もあるが、隣接する標準周波数成分の影響により対象とする標準周波数の範囲内には本来存在しない成分である可能性が高い。前者と後者を判別するためには、隣接する標準周波数の微細周波数に注目し、隣接する標準周波数が対象とする標準周波数に重なるように標準周波数範囲の端に位置している場合には前者であり、それ以外に位置する場合は後者とみなし、後者の場合には相関配列中の対応する標準周波数の相関値を「０」に設定する。前者の場合は、対象とする標準周波数成分と隣接する標準周波数成分が重複して算出されてしまうため、相関配列中の対象とする標準周波数の相関値または隣接する標準周波数の相関値を「０」に設定し、本実施形態では相関値が低い方を「０」に設定するようにしている。
【００４８】
【発明の効果】
以上、説明したように本発明によれば、解析すべき周波数を標準周波数として設定すると共に、各標準周波数に対応する周期関数を標準周期関数として準備し、各標準周波数に対応する相関値を格納するための相関配列、および前記各標準周波数に対応する強度値を格納するための強度配列を準備し、与えられた時系列信号の時間軸上に複数の単位区間を設定し、各単位区間ごとの時系列信号を区間信号として抽出し、抽出された区間信号に対して、複数の標準周期関数との相関を求めることにより、各標準周波数に対応する相関値を算出し、相関配列に各相関値を格納し、強度配列の全ての値を０に設定し、相関配列中で最大の相関値を選出し、強度配列中の対応する標準周波数の強度値に加算し、選出された最大相関値と対応する標準周期関数の積で生成される含有信号を前記区間信号から減じることにより差分信号を算出し、算出された差分信号を新たな区間信号とすることにより区間信号を更新し、区間信号と含有信号の構成要素となった標準周期関数との相関値を算出し、算出された相関値を対応する標準周波数の相関値として更新した後、当該更新された相関配列中で最大の相関値に対応する標準周期関数を選出し、区間信号と当該標準周期関数との相関値を対応する標準周波数の相関値として更新し、前記強度値の加算、前記区間信号の更新、前記相関値の更新を繰り返し実行することにより得られる強度配列を、その単位区間における全標準周波数各々に対応する強度値として決定し、このような強度配列の決定を、全単位区間に対して行なうことにより、時系列信号の全単位区間に対する強度配列を得るようにしたので、時系列信号における各周波数成分の強度情報をより正確に求めることが可能となる。特に、時系列信号として音響信号に適用した場合には、複数の楽器音が混在するような音響信号に対しても、原音により忠実な再現を行なうことが可能となるという効果を奏する。
【図面の簡単な説明】
【図１】本発明の音響信号の符号化方法の基本原理を示す図である。
【図２】本発明で利用される周期関数の一例を示す図である。
【図３】図２に示す各周期関数の周波数とＭＩＤＩノートナンバーｎとの関係式を示す図である。
【図４】解析対象となる信号と周期信号との相関計算の手法を示す図である。
【図５】図４に示す相関計算を行うための計算式を示す図である。
【図６】一般化調和解析の基本的な手法を示す図である。
【図７】本発明に係る周波数解析方法を示すフローチャートである。
【図８】本発明における相関配列、強度配列の変化の様子を示す図である。
【図９】各標準周波数間に１２個の微細周波数を設定した状態を示す図である。
【符号の説明】
Ａ（ｎ），Ｂ（ｎ）・・・相関値
ｄ，ｄ１〜ｄ５・・・単位区間
Ｅ（ｎ）・・・相関値
Ｇ（ｊ）・・・含有信号
ｎ，ｎ１〜ｎ６・・・ノートナンバー
Ｓ（ｊ），Ｓ（ｊ＋１）・・・差分信号
Ｘ，Ｘ（ｋ）・・・区間信号
[0001]
[Industrial application fields]
The present invention relates to an acoustic signal encoding technique suitable for the production of various audio contents provided via broadcast media (radio, television), communication media (CS video and audio distribution, Internet music distribution, online karaoke), and package media (CD, MD, cassette, video, LD, CD-ROM, game cassette, solid-state memory media for portable music players), and for the MIDI transmission of music content including vocals for dedicated portable music players, mobile phones, PHS, and pagers, of audio material of literary works such as Kabuki, Noh, sutra chanting, and poetry, and of audio teaching material for language education.
[0002]
[Prior art]
A time-series signal represented by an acoustic signal includes a plurality of periodic signals as its constituent elements. For this reason, a method for analyzing what kind of periodic signal is included in a given time-series signal has been known for a long time. For example, Fourier analysis is widely used as a method for analyzing frequency components included in a given time series signal.
[0003]
By using such a time-series signal analysis method, an acoustic signal can be encoded. With the spread of computers, it has become easy to sample an analog audio signal as the original sound at a predetermined sampling frequency, quantize the signal intensity at each sampling, and capture it as digital data. If a method such as Fourier analysis is applied to the data and the frequency components included in the original sound signal are extracted, the original sound signal can be encoded by a code indicating each frequency component.
[0004]
On the other hand, the MIDI (Musical Instrument Digital Interface) standard, which was born from the idea of encoding the sounds of electronic musical instruments, has come into active use with the spread of personal computers. The code data according to the MIDI standard (hereinafter referred to as MIDI data) is basically data that describes the operations of a musical performance, such as which key of the instrument's keyboard was played and how strongly; the MIDI data itself does not include the actual sound waveform. Therefore, reproducing the actual sound separately requires a MIDI sound source that stores the waveforms of instrument sounds. Nevertheless, its high encoding efficiency is attracting attention, and encoding and decoding techniques based on the MIDI standard are now widely used in software for performing, practicing, and composing music on a personal computer.
[0005]
Accordingly, proposals have been made to analyze a time-series signal represented by an acoustic signal by a predetermined method, extract the periodic signals that are its constituent elements, and encode the extracted periodic signals using MIDI data. For example, JP-A-10-247099, JP-A-11-73199, JP-A-11-73200, JP-A-11-95753, JP-A-2000-99009, JP-A-2000-99092, JP-A-2000-99093, JP-A-2000-261322, Japanese Patent Application No. 11-177875, Japanese Patent Application No. 11-329297, and Japanese Patent Application No. 2000-68521 propose various methods that can analyze the constituent frequencies of an arbitrary time-series signal and create MIDI data from the analysis result.
[0006]
[Problems to be solved by the invention]
The MIDI encoding methods proposed in the above publications and specifications have enabled efficient encoding of acoustic signals obtained from performance recordings and the like. In particular, the encoding method described in Japanese Patent Laid-Open No. 2000-261322 succeeded in remarkably improving the frequency resolution by using generalized harmonic analysis, compared with the earlier methods using the short-time Fourier transform. This method based on generalized harmonic analysis has the effect of suppressing the pseudo frequency components that arise computationally in the short-time Fourier transform. Furthermore, Japanese Patent Application No. 2000-68521 proposes a method of extracting the true frequency components by comparing the results of both the short-time Fourier transform and generalized harmonic analysis.
[0007]
The encoding method using generalized harmonic analysis in the above prior inventions determines a set of contained signal components that approximates the original acoustic signal by deleting, one at a time, the contained signal component of the standard frequency having the maximum correlation from the given acoustic signal, and expresses the original acoustic signal by the set of code data corresponding to the contained signals. In the above prior inventions, once a contained signal component has been deleted from the acoustic signal being analyzed, the standard frequency of that component is excluded from the candidates for extraction from then on. As a result, a contained signal component with the same frequency as one already extracted was not extracted, even if it should have been. If encoding is performed in this state, the pitch may deviate from the original performance by several semitones.
[0008]
In view of the above, it is an object of the present invention to provide a frequency analysis method and an audio signal encoding method capable of accurately reproducing an audio signal obtained from performance recording or the like.
[0009]
[Means for Solving the Problems]
In order to solve the above problems, in the present invention, the frequencies to be analyzed are set as standard frequencies, and a periodic function corresponding to each standard frequency is prepared as a standard periodic function. A correlation array for storing the correlation value corresponding to each standard frequency, and an intensity array for storing the intensity value corresponding to each standard frequency, are prepared. A plurality of unit sections are set on the time axis of a given acoustic signal, and the acoustic signal in each unit section is extracted as a section signal. For the extracted section signal, the correlation with each of the plurality of standard periodic functions is obtained to calculate the correlation value corresponding to each standard frequency, each correlation value is stored in the correlation array, and all values of the intensity array are set to 0. The maximum correlation value in the correlation array is selected and added to the intensity value of the corresponding standard frequency in the intensity array. A contained signal, generated as the product of the selected maximum correlation value and the corresponding standard periodic function, is subtracted from the section signal to calculate a difference signal, and the section signal is updated by using the calculated difference signal as the new section signal. The correlation value between the section signal and the standard periodic function that was the constituent element of the contained signal is calculated, and the correlation value of the corresponding standard frequency is updated with it. Then the standard periodic function corresponding to the maximum correlation value in the updated correlation array is selected, and the correlation value between the section signal and that standard periodic function is stored as the updated correlation value of the corresponding standard frequency. The intensity array obtained by repeatedly executing the addition of the intensity value, the update of the section signal, and the update of the correlation values is determined as the intensity values corresponding to each of the standard frequencies in that unit section. Furthermore, the present invention is characterized in that such an intensity array is determined for every unit section, so that intensity arrays for all unit sections of the acoustic signal are obtained. According to the present invention, a correlation array and an intensity array for storing the correlation value and intensity value corresponding to each standard frequency are prepared. For the standard frequency whose correlation value is the maximum, the correlation value is not used directly as the intensity information of that standard frequency but is stored in the intensity array as an intensity value, and when the correlation value of the same standard frequency becomes the maximum again, that correlation value is added to the intensity value. This makes it possible to obtain the intensity information of each frequency component in the acoustic signal more accurately, and even an acoustic signal in which a plurality of instrument sounds are mixed can be reproduced more faithfully to the original sound.
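The procedure above can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`standard_frequency`, `correlate`, `analyze_section`) are hypothetical, and the standard frequencies f(n) = 440 × 2^((n − 69)/12), the 2/w normalization, and the sine/cosine pair are taken from the later paragraphs of the description. One assumption is made loudly: the note refreshed at the end of each pass is carried forward as the next selection, which follows the worked example in paragraphs [0038] and [0039]; a literal reading would re-select the maximum of the array at the start of each pass.

```python
import math

def standard_frequency(n):
    """Standard frequency f(n) = 440 * 2^((n - 69) / 12) for note number n."""
    return 440.0 * 2.0 ** ((n - 69) / 12.0)

def correlate(signal, freq, fs):
    """Normalized correlations (A, B) with the sine and cosine at freq."""
    w = len(signal)
    a = 2.0 / w * sum(x * math.sin(2.0 * math.pi * freq * k / fs)
                      for k, x in enumerate(signal))
    b = 2.0 / w * sum(x * math.cos(2.0 * math.pi * freq * k / fs)
                      for k, x in enumerate(signal))
    return a, b

def magnitude(pair):
    """Root of the sum of squares of the sine and cosine correlations."""
    return math.hypot(pair[0], pair[1])

def analyze_section(section, fs, notes, iterations):
    """Return {note: accumulated intensity} for one section signal."""
    signal = list(section)
    notes = list(notes)
    # Fill the correlation array and zero the intensity array.
    corr = {n: correlate(signal, standard_frequency(n), fs) for n in notes}
    intensity = {n: 0.0 for n in notes}
    selected = max(notes, key=lambda n: magnitude(corr[n]))
    for _ in range(iterations):
        # Add the selected correlation value to the intensity array.
        a, b = corr[selected]
        intensity[selected] += math.hypot(a, b)
        # Subtract the contained signal to obtain the new section signal.
        f = standard_frequency(selected)
        signal = [x
                  - a * math.sin(2.0 * math.pi * f * k / fs)
                  - b * math.cos(2.0 * math.pi * f * k / fs)
                  for k, x in enumerate(signal)]
        # Update only the subtracted frequency, then refresh the new maximum.
        corr[selected] = correlate(signal, f, fs)
        selected = max(notes, key=lambda n: magnitude(corr[n]))
        corr[selected] = correlate(signal, standard_frequency(selected), fs)
    return intensity
```

Because the intensity array accumulates rather than overwrites, a note that becomes the maximum a second time contributes again, which is the point that distinguishes this method from the prior approach that excluded already extracted frequencies.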
[0010]
[Embodiments of the Invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0011]
(Fundamental principle of frequency analysis and acoustic signal coding method)
First, the basic principle of the frequency analysis method and the acoustic signal encoding method according to the present invention will be described. Since this basic principle is disclosed in the above-mentioned publications or specifications, only the outline will be briefly described here.
[0012]
As shown in FIG. 1A, it is assumed that an analog acoustic signal is given as a time-series signal. In the example of FIG. 1, the acoustic signal is shown with time t on the horizontal axis and amplitude (intensity) on the vertical axis. Here, the analog acoustic signal is first captured as digital acoustic data. This can be done with the conventional, general PCM method: the analog acoustic signal is sampled at a predetermined sampling frequency, and the amplitude is converted into digital data using a predetermined number of quantization bits. Here, for convenience of explanation, the waveform of the acoustic data digitized by the PCM method is also shown by the same waveform as the analog acoustic signal of FIG. 1A.
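As a rough illustration of the PCM capture step, the following sketch samples a function of time at a given sampling frequency and quantizes each amplitude to a signed integer. The name `pcm_sample`, the assumed input range of -1 to 1, and the 44.1 kHz / 16-bit figures in the usage are illustrative assumptions; the description only requires a predetermined sampling frequency and number of quantization bits.

```python
import math

def pcm_sample(analog, duration_s, fs, bits):
    """Sample analog(t), assumed to lie in [-1, 1], and quantize it.

    Returns signed integers in the range of a `bits`-bit PCM word.
    """
    full_scale = 2 ** (bits - 1) - 1
    count = int(round(duration_s * fs))
    samples = []
    for k in range(count):
        value = analog(k / fs)
        value = max(-1.0, min(1.0, value))  # clip out-of-range input
        samples.append(int(round(value * full_scale)))
    return samples

# Hypothetical usage: 10 ms of a 440 Hz tone at 44.1 kHz, 16 bits.
tone = pcm_sample(lambda t: math.sin(2.0 * math.pi * 440.0 * t), 0.01, 44100, 16)
```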
[0013]
Subsequently, a plurality of unit sections are set on the time axis of the acoustic signal to be analyzed. In the example shown in FIG. 1A, six times t1 to t6 are defined at equal intervals on the time axis t, and five unit intervals d1 to d5 having these times as the start point and the end point are set. In the example of FIG. 1, unit sections having the same section length are set, but the section length may be changed for each unit section. Alternatively, the section setting may be performed such that adjacent unit sections partially overlap on the time axis.
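The section setting described above, including the overlapping variant, might be sketched as follows. The function name `split_into_sections` and the `hop` parameter are hypothetical, and the sketch assumes equal section lengths and simply drops a trailing partial section, which the description does not specify.

```python
def split_into_sections(samples, section_len, hop=None):
    """Cut samples into unit sections of section_len samples each.

    hop is the distance between section starts; hop < section_len gives
    partially overlapping sections. Returns (start_index, section) pairs.
    """
    if hop is None:
        hop = section_len  # adjacent, non-overlapping sections
    if section_len <= 0 or hop <= 0:
        raise ValueError("section_len and hop must be positive")
    sections = []
    start = 0
    while start + section_len <= len(samples):
        sections.append((start, samples[start:start + section_len]))
        start += hop
    return sections
```

Returning the start index with each section keeps the information needed later for the note-on and note-off times of that section.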
[0014]
When the unit sections have been set in this way, a representative frequency is selected for the acoustic signal in each unit section (hereinafter referred to as a section signal). Each section signal usually includes various frequency components; for example, a frequency component with a large share of the intensity may be selected as the representative frequency. The representative frequency is generally the so-called fundamental frequency, but a harmonic frequency such as a formant frequency of speech, or the peak frequency of a noise source, may also be treated as a representative frequency. Only one representative frequency may be selected, but depending on the acoustic signal, selecting a plurality of representative frequencies enables more accurate encoding. FIG. 1B shows an example in which three representative frequencies are selected for each unit section and each representative frequency is encoded as one representative code (shown as a note in the drawing for convenience). Here, three tracks T1, T2, and T3 are provided to accommodate the representative codes (notes), so that the three representative codes selected for each unit section can be accommodated in different tracks.
[0015]
For example, the representative codes n(d1,1), n(d1,2), n(d1,3) selected for the unit section d1 are accommodated in tracks T1, T2, T3, respectively. Here, each of the codes n(d1,1), n(d1,2), n(d1,3) is a code indicating a note number in the MIDI code. A note number in the MIDI code takes one of 128 values from 0 to 127, each indicating one key of a piano keyboard. Specifically, for example, when 440 Hz is selected as the representative frequency, this frequency corresponds to note number n = 69 (the “la” (A3) near the center of the piano keyboard), so n = 69 is selected as the representative code. Note, however, that FIG. 1B is a conceptual diagram showing the representative codes obtained by the above-described method in the form of musical notes; in practice, data relating to intensity is also added to each note. For example, track T1 accommodates data indicating pitch, namely the note numbers n(d1,1), n(d2,1), ..., together with data indicating intensity, namely e(d1,1), e(d2,1), .... The data indicating intensity is determined by the degree to which the component of each representative frequency was included in the original section signal. Specifically, it is determined based on the correlation value, with respect to the section signal, of the periodic function having each representative frequency. Further, in the conceptual diagram of FIG. 1B, the position of each unit section on the time axis is indicated by the horizontal position of the note, but in practice, data that gives this position on the time axis precisely as a numerical value is added to each note.
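The correspondence between 440 Hz and note number 69 follows from the relation f(n) = 440 × 2^((n − 69)/12) given later in the description. A small sketch of the conversion in both directions is shown below; the function names are hypothetical, and rounding to the nearest semitone and clamping to 0 to 127 are assumptions made for illustration.

```python
import math

def note_to_frequency(n):
    """Frequency in Hz of MIDI note number n, with note 69 at 440 Hz."""
    return 440.0 * 2.0 ** ((n - 69) / 12.0)

def frequency_to_note(freq_hz):
    """Nearest MIDI note number for freq_hz, clamped to the range 0..127."""
    n = round(69 + 12 * math.log2(freq_hz / 440.0))
    return max(0, min(127, n))
```

Each step of one note number multiplies the frequency by 2^(1/12), so twelve steps double it: note 81 is 880 Hz and note 57 is 220 Hz.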
[0016]
It is not strictly necessary to adopt the MIDI format for encoding an acoustic signal. However, since MIDI is the most widespread format for this type of encoding, MIDI code data is preferable in practice. In the MIDI format, "note-on" and "note-off" data are interleaved with "delta time" data. "Note-on" data specifies a note number N and a velocity V and instructs the start of a specific sound; "note-off" data specifies a note number N and a velocity V and instructs the end of that sound. "Delta time" data indicates a time interval. The velocity V is a parameter indicating, for example, the speed at which a piano key is pressed (velocity at note-on) or the speed at which the finger is released from the key (velocity at note-off); it thus represents the strength of the operation that starts or ends a sound.
[0017]
In the above-described method, J note numbers n(di,1), n(di,2), ..., n(di,J) are obtained as representative codes for the i-th unit section di, and intensities e(di,1), e(di,2), ..., e(di,J) are obtained for each of them. MIDI code data can therefore be created as follows. The note numbers n(di,1), n(di,2), ..., n(di,J) can be used as they are for the note number N written in the "note-on" and "note-off" data. For the velocity V written in the "note-on" and "note-off" data, values obtained by normalizing the intensities e(di,1), e(di,2), ..., e(di,J) by a predetermined method may be used. The "delta time" data may be set according to the length of each unit section.
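The mapping just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `to_midi_events`, the tuple layout, and the linear scaling of intensity to velocity are hypothetical, since the specification says only that the intensities are normalized "by a predetermined method".

```python
def to_midi_events(notes, intensities, section_ticks, max_intensity):
    """Build (delta_time, kind, note, velocity) tuples for one unit section.

    Assumption: velocity is a linear rescaling of intensity into 1..127.
    """
    velocities = [
        # Clamp to 1..127; a note-on with velocity 0 would act as a note-off.
        max(1, min(127, round(127 * e / max_intensity)))
        for e in intensities
    ]
    events = []
    # All notes of the section start together at the head of the section.
    for n, v in zip(notes, velocities):
        events.append((0, "note_on", n, v))
    # The first note-off carries the section length as its delta time;
    # the remaining note-offs occur at the same instant (delta time 0).
    for i, (n, v) in enumerate(zip(notes, velocities)):
        events.append((section_ticks if i == 0 else 0, "note_off", n, v))
    return events


events = to_midi_events([69, 72], [100.0, 25.0], section_ticks=480, max_intensity=100.0)
for event in events:
    print(event)
```

Here all J notes are written into a single event list for brevity; distributing them over tracks T1 to TJ as in FIG. 1B would only change how the list is split.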
[0018]
(Specific method for obtaining correlation with periodic function)
In the method based on the basic principle described above, one or more representative frequencies are selected for the section signal, and the section signal is represented by periodic signals having these representative frequencies. A representative frequency is, literally, a frequency that represents the signal components in the unit section. Specific methods for selecting the representative frequency include a method using the short-time Fourier transform and a method using generalized harmonic analysis, as described later. Both share the same basic idea: a plurality of periodic functions with different frequencies is prepared in advance, the periodic functions that correlate strongly with the section signal in the unit section are found among them, and the frequencies of those highly correlated periodic functions are selected as representative frequencies. In other words, selecting a representative frequency involves computing the correlation between each of the periodic functions prepared in advance and the section signal in the unit section. A specific method for obtaining the correlation with a periodic function is therefore described here.
[0019]
Assume that the trigonometric functions shown in FIG. 2 are prepared as the plurality of periodic functions. They consist of pairs of a sine function and a cosine function of the same frequency, one pair being defined for each of the 128 standard frequencies f(0) to f(127). Here, the pair consisting of the sine function and the cosine function of a given frequency is defined as the periodic function for that frequency. The periodic function is defined as a sine/cosine pair because the correlation value of a single function with the signal depends on the phase. The variables F and k in the trigonometric functions of FIG. 2 are the sampling frequency F and the sample number k of the section signal X. For example, the sine wave for the frequency f(0) is written sin(2πf(0)k/F); given an arbitrary sample number k, it yields the amplitude of the periodic function at the same time position as the k-th sample of the section signal.
[0020]
Here, an example is described in which the 128 standard frequencies f(0) to f(127) are defined by the equations shown in FIG. 3. That is, the n-th (0 ≦ n ≦ 127) standard frequency f(n) is defined by the following (Formula 1).
[0021]
(Formula 1)
$f(n) = 440 \times 2^{\gamma(n)}$
$\gamma(n) = (n - 69) / 12$
[0022]
Defining the standard frequencies by such an expression is convenient when the signal is finally encoded as MIDI data. This is because the 128 standard frequencies f(0) to f(127) defined in this way form a geometric series and coincide with the frequencies of the note numbers used in MIDI data. Accordingly, the 128 standard frequencies f(0) to f(127) shown in FIG. 2 are equally spaced (in semitone steps in MIDI terms) on a logarithmic frequency axis. For this reason, the note number axis in every graph in the drawings of this application is a logarithmic frequency scale.
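As a small worked illustration of (Formula 1), the 128 standard frequencies can be tabulated in Python. The names `standard_frequency` and `STANDARD_FREQUENCIES` are chosen here for illustration and do not appear in the specification.

```python
def standard_frequency(n: int) -> float:
    """Standard frequency f(n) in Hz for MIDI note number n (Formula 1)."""
    gamma = (n - 69) / 12
    return 440.0 * 2.0 ** gamma


# f(0) .. f(127): a geometric series with ratio 2**(1/12) (one semitone).
STANDARD_FREQUENCIES = [standard_frequency(n) for n in range(128)]

print(standard_frequency(69))   # note number 69 is the 440 Hz reference
print(standard_frequency(81))   # one octave (12 semitones) higher
```

Because consecutive values differ by a constant ratio, the note numbers are equally spaced on a logarithmic frequency axis, as stated above.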
[0023]
Next, how to obtain the correlation of each periodic function with the section signal in an arbitrary section is described concretely. Suppose, as shown in FIG. 4, that a section signal X is given for a unit section d. Assume that the unit section d of length L is sampled at the sampling frequency F, giving w sample values in total, numbered 0, 1, 2, 3, ..., k, ..., w−2, w−1 (the w-th sample, drawn as a white circle, belongs to the head of the next unit section adjacent on the right). For an arbitrary sample number k, an amplitude value X(k) is then given as digital data. In the short-time Fourier transform, X(k) is usually multiplied, sample by sample, by a window function W(k) whose weight is close to 1 at the center and close to 0 at both ends; that is, X(k) × W(k) is treated as X(k) in the correlation calculation below. A cosine-shaped Hamming window is generally used as the window function. Although w is treated as a constant in the following description, it is generally desirable to vary w with n, setting it to the largest integer multiple of F/f(n) that does not exceed the section length L.
[0024]
The principle of obtaining the correlation value between such a section signal X and the sine function Rn having the n-th standard frequency f(n) is as follows. The correlation value A(n) between the two can be defined by the first arithmetic expression of FIG. 5. Here, X(k) is the amplitude value at sample number k of the section signal X, as shown in FIG. 4, and sin(2πf(n)k/F) is the amplitude value of the sine function Rn at the same position on the time axis. The first arithmetic expression can be regarded as the inner product of the amplitude vector of the section signal X and the amplitude vector of the sine function Rn, taken over all sample numbers k = 0 to w−1 in the unit section d.
[0025]
Similarly, the second arithmetic expression of FIG. 5 gives the correlation value B(n) between the section signal X and the cosine function having the n-th standard frequency f(n). Both the first arithmetic expression for A(n) and the second arithmetic expression for B(n) are finally multiplied by 2/w in order to normalize the correlation value. Since w generally varies with n as described above, this coefficient also depends on n.
[0026]
The effective correlation value between the section signal X and the standard periodic function having the standard frequency f(n) is given, as shown in the third arithmetic expression of FIG. 5, by the square root E(n) of the sum of the squares of the correlation value A(n) with the sine function and the correlation value B(n) with the cosine function. If the frequency of a standard periodic function with a large effective correlation value is selected as the representative frequency, the section signal X can be encoded using this representative frequency.
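The three arithmetic expressions of FIG. 5 are not reproduced in this text, so the following Python sketch reconstructs them from the description alone (inner products with the sine and cosine functions, normalization by 2/w, and the square root of the sum of squares). The function name `correlate` and the test signal are illustrative assumptions.

```python
import math


def correlate(x, f, F):
    """Return (A, B, E) for section signal x and a periodic function of frequency f.

    A, B: normalized inner products with sin and cos (factor 2/w).
    E:    effective correlation value sqrt(A**2 + B**2).
    """
    w = len(x)
    a = b = 0.0
    for k, xk in enumerate(x):
        phase = 2.0 * math.pi * f * k / F
        a += xk * math.sin(phase)
        b += xk * math.cos(phase)
    a *= 2.0 / w
    b *= 2.0 / w
    return a, b, math.hypot(a, b)


# A 440 Hz sinusoid of amplitude 3 with a phase offset of 60 degrees.
# F and w are chosen so that the section holds a whole number of periods.
F = 44000.0
x = [3.0 * math.sin(2.0 * math.pi * 440.0 * k / F + math.pi / 3) for k in range(1000)]
A, B, E = correlate(x, 440.0, F)
print(A, B, E)
```

The phase offset is split between A and B, while E recovers the amplitude regardless of phase, which is the reason given above for defining the periodic function as a sine/cosine pair.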
[0027]
That is, one or more standard frequencies whose correlation value E(n) meets a predetermined criterion may be selected as representative frequencies. The criterion may be an absolute selection condition, for example setting a threshold and selecting every standard frequency f(n) whose correlation value E(n) exceeds it, or a relative selection condition, for example selecting the top Q standard frequencies in descending order of correlation value E(n).
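The two kinds of selection condition can be sketched as follows, assuming the effective correlation values E(n) are held in a list indexed by note number. The function names are hypothetical.

```python
def select_by_threshold(E, threshold):
    """Absolute condition: every note number whose E(n) exceeds the threshold."""
    return [n for n, e in enumerate(E) if e > threshold]


def select_top_q(E, q):
    """Relative condition: the q note numbers with the largest E(n), largest first."""
    order = sorted(range(len(E)), key=lambda n: E[n], reverse=True)
    return order[:q]


E = [0.1, 0.9, 0.5, 0.7]
print(select_by_threshold(E, 0.6))
print(select_top_q(E, 3))
```

The absolute condition yields a variable number of representative frequencies per section, whereas the relative condition always yields Q, which matches the fixed number of tracks in FIG. 1B.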
[0028]
(Method of generalized harmonic analysis)
Here, a generalized harmonic analysis technique useful when encoding an acoustic signal according to the present invention will be described. As already described, when encoding an acoustic signal, several representative frequencies having high correlation values are selected for the section signal in each unit section. Generalized harmonic analysis is a technique that enables the selection of representative frequencies with higher accuracy, and the basic principle thereof is as follows.
[0029]
Assume that a signal S(j) as shown in FIG. 6A is given for the unit section d. Here, j is a parameter for the iterative processing (j = 1 to J), as described later. First, the correlation values of this signal S(j) with all 128 periodic functions shown in FIG. 2 are obtained. The frequency of the one periodic function with the maximum correlation value is selected as a representative frequency, and the periodic function having that representative frequency is extracted as an element function. Next, the inclusion signal G(j) shown in FIG. 6B is defined. The inclusion signal G(j) is the extracted element function multiplied by its correlation value with the signal S(j). For example, when the sine/cosine pairs of FIG. 2 are used as the periodic functions and the frequency f(n) is selected as the representative frequency, the inclusion signal G(j) is the sum of the sine function A(n) sin(2πf(n)k/F) with amplitude A(n) and the cosine function B(n) cos(2πf(n)k/F) with amplitude B(n) (FIG. 6B shows only one function for convenience of illustration). Since A(n) and B(n) are the normalized correlation values obtained by the expressions of FIG. 5, the inclusion signal G(j) can be regarded as the signal component of frequency f(n) contained in the signal S(j).
[0030]
Once the inclusion signal G(j) has been obtained in this way, the difference signal S(j+1) is obtained by subtracting the inclusion signal G(j) from the signal S(j). FIG. 6C shows the difference signal S(j+1) obtained in this way. The difference signal S(j+1) consists of the signal components that remain after the component of frequency f(n) has been removed from the original signal S(j). Therefore, if the parameter j is incremented by 1, the difference signal S(j+1) is treated as the new signal S(j), and the same processing is repeated J times for j = 1 to J, J representative frequencies can be selected.
[0031]
The J inclusion signals G(1) to G(J) output by this correlation calculation are the constituent elements of the original section signal X. To encode the original section signal X, information indicating the frequency of each of these J inclusion signals and information indicating its amplitude (intensity) may be used as code data. Although J has been described as the number of representative frequencies, it may equal the number of standard frequencies f(n), that is, J = 128; this is the usual choice when the aim is to obtain a frequency spectrum.
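The iterative procedure of FIG. 6 can be sketched in Python as below. This is a minimal illustration of the conventional procedure as described in the text, not a reference implementation: the names `correlate` and `harmonic_analysis` are assumptions, and a short list of three frequencies stands in for the 128 standard frequencies to keep the example small.

```python
import math


def correlate(s, f, F):
    """Normalized sine/cosine correlations A, B and effective value E."""
    w = len(s)
    a = sum(sk * math.sin(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    b = sum(sk * math.cos(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    return a, b, math.hypot(a, b)


def harmonic_analysis(x, freqs, F, J):
    """Select up to J distinct frequencies by repeated subtraction."""
    s = list(x)                           # S(1) is the section signal X
    candidates = set(range(len(freqs)))   # periodic functions still eligible
    selected = []                         # (index n, A(n), B(n), E(n)) per iteration
    for _ in range(min(J, len(freqs))):
        # Correlate the current signal S(j) with every remaining periodic function.
        corr = {n: correlate(s, freqs[n], F) for n in candidates}
        n = max(corr, key=lambda i: corr[i][2])
        a, b, e = corr[n]
        # Subtract the inclusion signal G(j) to obtain the difference signal S(j+1).
        for k in range(len(s)):
            phase = 2.0 * math.pi * freqs[n] * k / F
            s[k] -= a * math.sin(phase) + b * math.cos(phase)
        candidates.discard(n)             # each frequency is selected at most once
        selected.append((n, a, b, e))
    return selected, s


F = 44000.0
x = [3.0 * math.sin(2.0 * math.pi * 440.0 * k / F) + math.cos(2.0 * math.pi * 880.0 * k / F)
     for k in range(1000)]
selected, residual = harmonic_analysis(x, [220.0, 440.0, 880.0], F, J=2)
print([(n, round(e, 3)) for n, _, _, e in selected])
```

The returned list corresponds to the frequency and amplitude information of the inclusion signals G(1) to G(J); the second return value is the final difference signal.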
[0032]
(Frequency analysis method and acoustic signal encoding method according to the present invention)
As described above, in generalized harmonic analysis the acoustic signal is encoded by expressing the J inclusion signals G(1) to G(J) as code data. In that method, the element function that constitutes an inclusion signal G(j), that is, the periodic function, is excluded from the correlation calculation for the difference signal S(j+1) and all subsequent difference signals. This rests on the assumption that only one signal component exists at that frequency.
[0033]
However, when a plurality of instrument sounds are mixed in the original acoustic signal, a plurality of signal components may exist at the same frequency. When such an acoustic signal is encoded with the conventional generalized harmonic analysis technique, only the intensity information of the strongest instrument sound is obtained for that frequency, and the intensity information of the other instrument sounds is ignored. In the present invention, therefore, the periodic function constituting the inclusion signal G(j) remains a target of the correlation calculation for the difference signal S(j+1) and later, so that accurate intensity information is obtained for the same frequency.
[0034]
Next, the frequency analysis method according to the present invention is described concretely with reference to the flowchart of FIG. 7. First, standard periodic functions having the standard frequencies are prepared (step S1). In this embodiment, 128 standard periodic functions are prepared as shown in FIG. 2. Next, arrays capable of storing as many values as there are standard periodic functions are prepared (step S2). Since this embodiment uses 128 standard periodic functions, each array stores 128 numerical values, one per standard frequency. Two such arrays are prepared: a correlation array for storing correlation values and an intensity array for storing intensity values. In the generalized harmonic analysis methods used so far, the calculated correlation value itself is used as the intensity information of the code data. The present invention is characterized in that the correlation values calculated several times for the same standard frequency are added up into an intensity value, and this intensity value is finally used as the intensity information of the code data.
[0035]
Once the standard periodic functions, the correlation array, and the intensity array have been prepared, unit sections are set for the time-series signal as shown in FIG. 1A, and the time-series signal in each unit section is extracted as a section signal (step S3). Next, the correlation of the section signal with each standard periodic function is obtained, and each correlation value is stored at the location for the corresponding standard frequency in the correlation array (step S4). All values of the intensity array are then initialized to 0 (step S5); this initialization is performed only once per unit section. Next, the maximum of the 128 correlation values in the correlation array is selected, and the selected correlation value is added to the intensity value of the corresponding standard frequency in the intensity array (step S6). Suppose, for example, that as shown in FIG. 8A the correlation value for note number 100 (standard frequency f(100)) is "75" and is the maximum in the correlation array. Then the correlation value "75" is added to the intensity value "0" for note number 100 in the intensity array and stored. FIG. 8B shows the state of the intensity array at this point.
[0036]
Next, the inclusion signal G(1) (j = 1 at this point), generated as the product of the standard periodic function for the standard frequency f(100) and the correlation value for f(100), is subtracted from the section signal S(1) (= section signal X) to calculate the difference signal S(2). This procedure is the same as the one described with reference to FIG. 6.
This difference signal S(2) becomes the new section signal S(2) (step S7).
[0037]
Next, the correlation value between the section signal S(2) and the standard periodic function of the standard frequency f(100), which is the constituent element of the inclusion signal G(1), is calculated, and the correlation value for f(100) in the correlation array is updated (step S8). This is done because the section signal S(2) is obtained by subtracting the inclusion signal G(1), a component highly correlated with the section signal S(1), so the correlation value for the standard frequency f(100) changes greatly. Strictly speaking, since the section signal has been updated, the correlation values between the section signal S(2) and all 128 standard periodic functions should be recalculated. However, the component that changes most is the one for the standard frequency f(100), the constituent element of the inclusion signal G(1), so the recalculation is limited to this one standard frequency to reduce the computational load. Suppose that the correlation array is then in the state shown in FIG. 8C, where the correlation value for the standard frequency f(100) has dropped sharply to "10".
[0038]
Next, the correlation value between the section signal S(2) and the standard periodic function of the standard frequency f(50), which has the maximum correlation value "45" in the updated correlation array, is calculated, and the result is stored as the correlation value of f(50) in the correlation array. If the correlation values between the section signal S(2) and all 128 standard periodic functions had been recalculated and the whole correlation array updated, as mentioned above, no new calculation would be needed here. However, the section signal S(2) is obtained from the section signal S(1) by subtracting the inclusion signal G(1), which contains only the standard frequency f(100), so the correlation values of the other standard frequencies change little. To reduce the amount of computation, the standard periodic function with the maximum value in the current correlation array is therefore regarded as the one with the maximum correlation with the section signal S(2); only for this function is the correlation value with S(2) recalculated, so that an accurate value is available. As a result, the correlation array is in the state shown in FIG. 8D, where the correlation value for the standard frequency f(50) has changed slightly to "47".
[0039]
Using the section signal S(2) newly obtained in step S7 and the correlation array updated in step S8, the process returns to step S6, and the same processing as for the section signal S(1) is performed. When the correlation array is in the state shown in FIG. 8D, "47" is added to the intensity value for the standard frequency f(50) in the intensity array, as shown in FIG. 8E.
[0040]
The processing of steps S6 to S8 is repeated for the set parameter j (j = 1 to J), and the intensity values stored in the intensity array after J iterations are output. When the processing for one unit section is complete, the same processing is executed for the other unit sections in chronological order. With this frequency analysis, the intensity of each frequency can be calculated more accurately even for a time-series signal that contains a plurality of components of the same frequency with different phases.
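Steps S4 to S8 for one section signal can be sketched in Python as follows. The sketch follows the text of the flowchart description; the function names, the storage of A(n) and B(n) alongside E(n) in the correlation array, and the short frequency list used in the usage example are assumptions made for illustration.

```python
import math


def correlate(s, f, F):
    """Normalized sine/cosine correlations A, B and effective value E."""
    w = len(s)
    a = sum(sk * math.sin(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    b = sum(sk * math.cos(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    return a, b, math.hypot(a, b)


def analyze_section(x, freqs, F, J):
    """Return the intensity array for one section signal x."""
    s = list(x)
    corr = [correlate(s, f, F) for f in freqs]       # S4: fill the correlation array
    intensity = [0.0] * len(freqs)                   # S5: initialize the intensity array
    indices = range(len(freqs))
    for _ in range(J):
        # S6: pick the maximum correlation and add it to the intensity array.
        n = max(indices, key=lambda i: corr[i][2])
        a, b, e = corr[n]
        intensity[n] += e
        # S7: subtract the inclusion signal; the result is the new section signal.
        for k in range(len(s)):
            phase = 2.0 * math.pi * freqs[n] * k / F
            s[k] -= a * math.sin(phase) + b * math.cos(phase)
        # S8: refresh only the entry just used, then the entry that is now largest.
        corr[n] = correlate(s, freqs[n], F)
        m = max(indices, key=lambda i: corr[i][2])
        corr[m] = correlate(s, freqs[m], F)
    return intensity


F = 44000.0
x = [3.0 * math.sin(2.0 * math.pi * 440.0 * k / F) + math.cos(2.0 * math.pi * 880.0 * k / F)
     for k in range(1000)]
print([round(v, 3) for v in analyze_section(x, [220.0, 440.0, 880.0], F, J=4)])
```

Unlike the conventional procedure, a frequency stays eligible after it has been selected, so any correlation that remains at that frequency after the subtraction is added to the same intensity entry in a later iteration.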
[0041]
When an acoustic signal is used as the time-series signal and is encoded on the basis of this frequency analysis result, code data containing four pieces of information is created for each unit section: each standard frequency whose intensity value in the intensity array is not "0" serves as "information indicating the pitch of the sound", the intensity value of that standard frequency as "information indicating the intensity of the sound", the start point of the unit section as "information indicating the start time of sound generation", and the end point of the unit section as "information indicating the end time of sound generation". The section signal X in the unit section can then be encoded with a predetermined number of code data items. If MIDI data is created as the code data, the note number may be used as the information indicating the pitch, the velocity as the information indicating the intensity, the note-on time as the information indicating the start time of sound generation, and the note-off time as the information indicating the end time of sound generation. Thus, according to the acoustic signal encoding method of the present invention, for an original acoustic signal that has a plurality of signal components of the same frequency with different phases, as when several instrument sounds are mixed, the intensity value obtained by adding up those components is reflected in the information indicating the intensity of the sound, and as a result the original acoustic signal can be reproduced more faithfully.
[0042]
(Method 1 for properly evaluating correlation values: influence from adjacent components)
Executing steps S1 to S8 already allows accurate frequency analysis. Methods for properly evaluating the obtained correlation values, which improve the accuracy further, are described next. The first method takes into account the influence received from adjacent standard frequency components. In this case, after the correlation value for each standard frequency has been obtained and set in the correlation array in the correlation calculation of step S4, each correlation value is checked for correctness.
[0043]
Specifically, a certain standard frequency is first taken as the target standard frequency, and a standard frequency adjacent to it is taken as the adjacent standard frequency. For example, when the target standard frequency is f(3), the adjacent standard frequency is f(4). Next, a difference signal is calculated by subtracting from the section signal the inclusion signal obtained as the product of the adjacent standard periodic function, which has the adjacent standard frequency, and the correlation value for the adjacent standard frequency.
Next, the correlation value between this difference signal and the target standard periodic function having the target standard frequency is calculated. The difference between this newly calculated correlation value and the correlation value for the target standard frequency already stored in the correlation array is then obtained, and if the difference exceeds a preset threshold, the correlation value for the target standard frequency in the correlation array is set to "0". In general, in a time-series signal with many different frequency components, adjacent frequency components influence one another, so a naive frequency analysis sometimes detects frequency components that do not actually exist. In this embodiment, the correlation value of the target standard frequency component is recalculated with the adjacent standard frequency component removed, and checking whether it has changed greatly from its value before the removal confirms whether the correlation value of the target standard frequency component is genuine.
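The check described above can be sketched in Python as follows. The function names, the use of the absolute difference, and the threshold value in the usage example are assumptions; the specification only states that the threshold is preset.

```python
import math


def correlate(s, f, F):
    """Normalized sine/cosine correlations A, B and effective value E."""
    w = len(s)
    a = sum(sk * math.sin(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    b = sum(sk * math.cos(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    return a, b, math.hypot(a, b)


def remove_component(s, f, F, a, b):
    """Subtract the inclusion signal a*sin + b*cos of frequency f from s."""
    return [sk - (a * math.sin(2.0 * math.pi * f * k / F)
                  + b * math.cos(2.0 * math.pi * f * k / F))
            for k, sk in enumerate(s)]


def check_with_neighbor_removed(x, F, f_target, f_adjacent, threshold):
    """Return the correlation value to keep for the target frequency."""
    _, _, e_target = correlate(x, f_target, F)
    a_adj, b_adj, _ = correlate(x, f_adjacent, F)
    # Remove the adjacent component, then re-measure the target component.
    difference_signal = remove_component(x, f_adjacent, F, a_adj, b_adj)
    _, _, e_recomputed = correlate(difference_signal, f_target, F)
    if abs(e_target - e_recomputed) > threshold:
        return 0.0          # the target value was mostly leakage from the neighbor
    return e_target


F = 44100.0
f69, f70 = 440.0, 440.0 * 2.0 ** (1.0 / 12.0)
# A short section containing only note 70: note 69 shows a spurious correlation.
x = [math.sin(2.0 * math.pi * f70 * k / F) for k in range(1024)]
print(correlate(x, f69, F)[2])
print(check_with_neighbor_removed(x, F, f69, f70, threshold=0.36))
```

With a section this short, semitone neighbors are far from orthogonal, so a genuine component also changes somewhat when its neighbor is removed; the threshold has to be chosen with the section length in mind.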
[0044]
(Method 2 for properly evaluating correlation values: influence on adjacent components)
The above example confirms whether the correlation value of the target standard frequency component is genuine by considering the influence received from the adjacent standard frequency component. An alternative method is described next, which confirms this by considering the influence that the target standard frequency component exerts on the adjacent standard frequency component. In this case too, after the correlation value for each standard frequency has been obtained and set in the correlation array in the correlation calculation of step S4, each correlation value is checked for correctness.
[0045]
Specifically, as in the above example, a certain standard frequency is taken as the target standard frequency, and a standard frequency adjacent to it is taken as the adjacent standard frequency. That is, when the target standard frequency is f(3), the adjacent standard frequency is f(4).
Next, a difference signal is calculated by subtracting from the section signal the inclusion signal obtained as the product of the target standard periodic function, which has the target standard frequency, and the correlation value for the target standard frequency. The correlation value between this difference signal and the adjacent standard periodic function having the adjacent standard frequency is then calculated. The difference between this newly calculated correlation value and the correlation value for the adjacent standard frequency already stored in the correlation array is obtained, and if the difference is smaller than a preset threshold, the correlation value for the target standard frequency in the correlation array is set to "0". In this method, the correlation value of the adjacent standard frequency component is recalculated with the target standard frequency component removed, and checking whether it has changed greatly from its value before the removal confirms whether the correlation value of the target standard frequency component is genuine.
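The second check mirrors the first with the roles of target and adjacent frequency exchanged, and with the comparison reversed. A minimal Python sketch follows, with the same caveats: the function names, the absolute difference, and the threshold value are illustrative assumptions.

```python
import math


def correlate(s, f, F):
    """Normalized sine/cosine correlations A, B and effective value E."""
    w = len(s)
    a = sum(sk * math.sin(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    b = sum(sk * math.cos(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    return a, b, math.hypot(a, b)


def remove_component(s, f, F, a, b):
    """Subtract the inclusion signal a*sin + b*cos of frequency f from s."""
    return [sk - (a * math.sin(2.0 * math.pi * f * k / F)
                  + b * math.cos(2.0 * math.pi * f * k / F))
            for k, sk in enumerate(s)]


def check_by_effect_on_neighbor(x, F, f_target, f_adjacent, threshold):
    """Return the correlation value to keep for the target frequency."""
    a_t, b_t, e_target = correlate(x, f_target, F)
    _, _, e_adjacent = correlate(x, f_adjacent, F)
    # Remove the target component, then re-measure the adjacent component.
    difference_signal = remove_component(x, f_target, F, a_t, b_t)
    _, _, e_adjacent_after = correlate(difference_signal, f_adjacent, F)
    if abs(e_adjacent - e_adjacent_after) < threshold:
        return 0.0          # removing the target barely affected the neighbor
    return e_target


F = 44100.0
f69, f70 = 440.0, 440.0 * 2.0 ** (1.0 / 12.0)
x = [math.sin(2.0 * math.pi * f69 * k / F) for k in range(1024)]   # only note 69 present
print(check_by_effect_on_neighbor(x, F, f69, f70, threshold=0.36))
```

A genuine target component accounts for most of what its neighbor appears to contain, so removing it changes the neighbor's correlation value by a large amount and the target value is kept.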
[0046]
(Method 3 for properly evaluating correlation values: fine frequency bias)
Next, a method is described that achieves more accurate frequency analysis by defining frequencies at narrower intervals than the standard frequencies described above, preparing a periodic function for each of them, and using the correlation values between these periodic functions and the section signal.
In this specification, a frequency defined at such narrow intervals is called a fine frequency, and the periodic function corresponding to a fine frequency is called a fine periodic function. A predetermined number of fine frequencies is set between adjacent standard frequencies, and the fine frequencies are spaced so as to form a geometric series, like the standard frequencies. FIG. 9 shows an example in which twelve fine frequencies are set between standard frequencies: the twelve fine frequencies f(n+1/13) to f(n+12/13) are set between the standard frequency f(n) and the standard frequency f(n+1). In FIG. 9, the dotted line between note number n+6/13 and note number n+7/13 marks the boundary of the range of fine frequencies regarded as belonging to each standard frequency. "Regarded as belonging to a standard frequency" means that the correlation value is stored in the correlation array as the one for that standard frequency. FIG. 9 thus shows that note numbers up to n+6/13 are regarded as within the frequency range of note number n, and note numbers from n+7/13 as within that of note number n+1.
[0047]
Fine periodic functions of the same form as the standard periodic functions shown in FIG. 2 are prepared for these fine frequencies, and the correlation value between each fine periodic function and the section signal is calculated. Then, of the correlation values of the 13 frequencies in the range of each standard frequency (the standard frequency itself and the six fine frequencies on either side), the largest is stored in the correlation array as the correlation value of that standard frequency. For example, the largest of the 13 correlation values from note number n−6/13 to note number n+6/13 is set as the correlation value of note number n (standard frequency f(n)). When the frequency with the maximum correlation value lies at the edge of the standard frequency range, such as note number n−6/13 or note number n+6/13, there are two possibilities: the component may lie on the boundary with the standard frequency adjacent to the target standard frequency (the former case), or, more probably, it does not actually exist within the target standard frequency range and appears only through the influence of the adjacent standard frequency component (the latter case). To distinguish the two, the fine frequencies of the adjacent standard frequency are examined: if the maximum for the adjacent standard frequency lies at the edge of its range that adjoins the target standard frequency, the case is regarded as the former; otherwise it is regarded as the latter. In the latter case, the correlation value of the target standard frequency in the correlation array is set to "0". In the former case, the component is counted twice, once for the target standard frequency and once for the adjacent standard frequency, so either the correlation value of the target standard frequency or that of the adjacent standard frequency in the correlation array is set to "0". In this embodiment, the lower of the two correlation values is set to "0".
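The first part of this method, taking the maximum over the 13 frequencies in the range of one standard frequency and noting whether the maximum falls at the edge of the range, can be sketched in Python as follows. The fine frequencies are assumed to extend (Formula 1) to fractional note numbers, which is consistent with the geometric spacing described above; the function names are hypothetical, and the subsequent zeroing rules are not implemented here.

```python
import math


def correlate(s, f, F):
    """Normalized sine/cosine correlations A, B and effective value E."""
    w = len(s)
    a = sum(sk * math.sin(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    b = sum(sk * math.cos(2.0 * math.pi * f * k / F) for k, sk in enumerate(s)) * 2.0 / w
    return a, b, math.hypot(a, b)


def fine_frequency(n, m, divisions=13):
    """Fine frequency f(n + m/divisions), extending Formula 1 to fractional notes."""
    return 440.0 * 2.0 ** ((n + m / divisions - 69) / 12)


def best_in_range(x, F, n, half_width=6, divisions=13):
    """Return (m, E, at_edge) for the strongest of the 13 frequencies around note n."""
    best_m, best_e = 0, -1.0
    for m in range(-half_width, half_width + 1):
        _, _, e = correlate(x, fine_frequency(n, m, divisions), F)
        if e > best_e:
            best_m, best_e = m, e
    # A maximum at either end of the range calls for the former/latter distinction.
    return best_m, best_e, abs(best_m) == half_width


F = 44100.0
f_true = fine_frequency(69, 3)          # a tone slightly sharp of note 69
x = [math.sin(2.0 * math.pi * f_true * k / F) for k in range(4096)]
print(best_in_range(x, F, 69))
```

The value stored in the correlation array for note n would be the second element of the returned tuple; the third element indicates when the edge-of-range handling described above applies.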
[0048]
[Effect of the invention]
As described above, according to the present invention, the frequencies to be analyzed are set as standard frequencies, and a periodic function corresponding to each standard frequency is prepared as a standard periodic function. A correlation array for storing the correlation value corresponding to each standard frequency and an intensity array for storing the intensity value corresponding to each standard frequency are prepared. A plurality of unit sections are set on the time axis of a given time-series signal, and the time-series signal in each unit section is extracted as a section signal. For the extracted section signal, the correlation with the plurality of standard periodic functions is computed to obtain a correlation value for each standard frequency, and each correlation value is stored in the correlation array. All values in the intensity array are set to 0. The maximum correlation value in the correlation array is then selected and added to the intensity value of the corresponding standard frequency in the intensity array. A difference signal is calculated by subtracting from the section signal the inclusion signal generated as the product of the selected maximum correlation value and the corresponding standard periodic function, and the section signal is updated by using this difference signal as the new section signal. Next, the correlation value between the section signal and the standard periodic function that is a component of the inclusion signal is calculated, and the correlation value of the corresponding standard frequency is updated with it; the standard periodic function corresponding to the maximum correlation value in the updated correlation array is then selected, and the correlation value between the section signal and that standard periodic function is updated as the correlation value of the corresponding standard frequency. The intensity array obtained by repeatedly executing the addition of the intensity value, the update of the section signal, and the update of the correlation value is determined as the intensity values corresponding to all the standard frequencies in the unit section, and this determination is performed for all unit sections to obtain the intensity arrays for all unit sections of the time-series signal. As a result, the intensity information of each frequency component in the time-series signal can be obtained more accurately. In particular, when the method is applied to an acoustic signal as the time-series signal, reproduction faithful to the original sound becomes possible even for an acoustic signal in which the sounds of a plurality of instruments are mixed.
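The select, subtract, and re-correlate loop summarized above can be illustrated with the following minimal Python sketch. It is not the patented implementation. It assumes that each standard periodic function is a sine/cosine pair at the equal-tempered frequency of a MIDI note number n, that the correlation value is the magnitude E(n) of the sine and cosine correlations A(n) and B(n), and that the inclusion signal is A(n)·sin + B(n)·cos. For simplicity it recomputes every correlation value after each subtraction, rather than updating only the selected entries as described above. The names `note_to_hz`, `correlate`, `analyze_section`, `iterations`, and `floor` are hypothetical.

```python
import math

def note_to_hz(n):
    """Equal-tempered frequency of MIDI note number n (assumes note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((n - 69) / 12.0)

def correlate(section, freq, fs):
    """Return (A, B, E): sine correlation, cosine correlation, and their magnitude."""
    a = b = 0.0
    for k, x in enumerate(section):
        phase = 2.0 * math.pi * freq * k / fs
        a += x * math.sin(phase)
        b += x * math.cos(phase)
    scale = 2.0 / len(section)
    a *= scale
    b *= scale
    return a, b, math.hypot(a, b)

def analyze_section(section, notes, fs, iterations=6, floor=1e-6):
    """Greedy analysis of one unit section; returns {note number: intensity}."""
    section = list(section)                       # work on a copy of the section signal
    intensity = {n: 0.0 for n in notes}           # intensity array, initialized to 0
    corr = {n: correlate(section, note_to_hz(n), fs) for n in notes}  # correlation array
    for _ in range(iterations):
        n = max(corr, key=lambda m: corr[m][2])   # largest correlation value
        a, b, e = corr[n]
        if e < floor:                             # nothing significant is left
            break
        intensity[n] += e                         # add it to the intensity array
        f = note_to_hz(n)
        for k in range(len(section)):             # subtract the inclusion signal
            phase = 2.0 * math.pi * f * k / fs
            section[k] -= a * math.sin(phase) + b * math.cos(phase)
        # Simplification (assumption): refresh the whole correlation array
        # instead of only the entries named in the text.
        corr = {m: correlate(section, note_to_hz(m), fs) for m in notes}
    return intensity
```

For a section containing two sinusoids at note numbers 69 and 76, `analyze_section(signal, range(60, 84), 8000)` is expected to concentrate the intensity at those two note numbers, with only small residual values elsewhere.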
[Brief description of the drawings]
FIG. 1 is a diagram showing a basic principle of an audio signal encoding method according to the present invention.
FIG. 2 is a diagram showing an example of a periodic function used in the present invention.
FIG. 3 is a diagram showing a relational expression between the frequency of each periodic function shown in FIG. 2 and a MIDI note number n.
FIG. 4 is a diagram illustrating a method of calculating a correlation between a signal to be analyzed and a periodic signal.
FIG. 5 is a diagram showing a calculation formula for performing the correlation calculation shown in FIG. 4.
FIG. 6 is a diagram showing a basic method of generalized harmonic analysis.
FIG. 7 is a flowchart showing a frequency analysis method according to the present invention.
FIG. 8 is a diagram showing how the correlation array and the intensity array change in the present invention.
FIG. 9 is a diagram illustrating a state in which twelve fine frequencies are set between the standard frequencies.
[Explanation of symbols]
A(n), B(n) ... correlation value
d, d1 to d5 ... unit section
E(n) ... correlation value
G(j) ... inclusion signal
n, n1 to n6 ... note number
S(j), S(j+1) ... difference signal
X, X(k) ... section signal

Claims

A frequency analysis method for outputting intensity information for a plurality of frequencies from an acoustic signal,
A standard periodic function preparation stage that sets a frequency to be analyzed as a standard frequency and prepares a periodic function corresponding to each standard frequency as a standard periodic function;
An array preparation stage for preparing a correlation array for storing correlation values corresponding to the respective standard frequencies and an intensity array for storing intensity values corresponding to the respective standard frequencies;
A section signal extraction step of setting a plurality of unit sections on the time axis of a given acoustic signal and extracting the acoustic signal in each unit section as a section signal;
A correlation calculation step of calculating a correlation value corresponding to each standard frequency by obtaining a correlation with the plurality of standard periodic functions for the section signal, and storing each correlation value in the correlation array;
An intensity initialization step for setting all values of the intensity array to 0;
An intensity update step of selecting the largest correlation value in the correlation array and adding it to the intensity value of the corresponding standard frequency in the intensity array;
A section signal update step of calculating a difference signal by subtracting from the section signal the inclusion signal generated as the product of the selected maximum correlation value and the corresponding standard periodic function, and updating the section signal by using the difference signal as the new section signal;
A correlation update step of calculating the correlation value between the section signal and the standard periodic function that is a component of the inclusion signal, updating the correlation value of the corresponding standard frequency with the calculated value, then selecting the standard periodic function corresponding to the maximum correlation value in the updated correlation array and updating the correlation value between the section signal and that standard periodic function as the correlation value of the corresponding standard frequency;
An intensity array determination step of determining the intensity array obtained by repeatedly executing the intensity update step, the section signal update step, and the correlation update step as the intensity values corresponding to all the standard frequencies in the unit section;
the intensity array being determined for every unit section so that intensity arrays are obtained for all unit sections of the acoustic signal;
wherein the correlation calculation step, after calculating the correlation value corresponding to each standard frequency for the section signal, performs the following for each standard frequency whose correlation value is stored in the correlation array: it calculates a difference signal by subtracting from the section signal an adjacent inclusion signal, obtained by multiplication with the adjacent correlation value corresponding to an adjacent standard frequency next to that standard frequency; it calculates the correlation value between the difference signal and the standard frequency; and, if the difference between this correlation value and the original correlation value of the standard frequency is greater than a predetermined value, it sets the correlation value corresponding to that standard frequency in the correlation array to 0.

A frequency analysis method for outputting intensity information for a plurality of frequencies from an acoustic signal,
A standard periodic function preparation stage that sets a frequency to be analyzed as a standard frequency and prepares a periodic function corresponding to each standard frequency as a standard periodic function;
An array preparation stage for preparing a correlation array for storing correlation values corresponding to the respective standard frequencies and an intensity array for storing intensity values corresponding to the respective standard frequencies;
A section signal extraction step of setting a plurality of unit sections on the time axis of a given acoustic signal and extracting the acoustic signal in each unit section as a section signal;
A correlation calculation step of calculating a correlation value corresponding to each standard frequency by obtaining a correlation with the plurality of standard periodic functions for the section signal, and storing each correlation value in the correlation array;
An intensity initialization step for setting all values of the intensity array to 0;
An intensity update step of selecting the largest correlation value in the correlation array and adding it to the intensity value of the corresponding standard frequency in the intensity array;
A section signal update step of calculating a difference signal by subtracting from the section signal the inclusion signal generated as the product of the selected maximum correlation value and the corresponding standard periodic function, and updating the section signal by using the difference signal as the new section signal;
A correlation update step of calculating the correlation value between the section signal and the standard periodic function that is a component of the inclusion signal, updating the correlation value of the corresponding standard frequency with the calculated value, then selecting the standard periodic function corresponding to the maximum correlation value in the updated correlation array and updating the correlation value between the section signal and that standard periodic function as the correlation value of the corresponding standard frequency;
An intensity array determination step of determining the intensity array obtained by repeatedly executing the intensity update step, the section signal update step, and the correlation update step as the intensity values corresponding to all the standard frequencies in the unit section;
the intensity array being determined for every unit section so that intensity arrays are obtained for all unit sections of the acoustic signal;
wherein the correlation calculation step, after calculating the correlation value corresponding to each standard frequency for the section signal, performs the following for each standard frequency whose correlation value is stored in the correlation array: it calculates a difference signal by subtracting from the section signal the inclusion signal obtained by multiplication with the correlation value corresponding to that standard frequency; it calculates the correlation value between the difference signal and an adjacent periodic function having an adjacent standard frequency next to that standard frequency; and, if the difference between this correlation value and the original correlation value of the adjacent standard frequency is smaller than a predetermined value, it sets the correlation value corresponding to the standard frequency in the correlation array to 0.

A frequency analysis method for outputting intensity information for a plurality of frequencies from an acoustic signal,
A standard periodic function preparation stage that sets a frequency to be analyzed as a standard frequency and prepares a periodic function corresponding to each standard frequency as a standard periodic function;
An array preparation stage for preparing a correlation array for storing correlation values corresponding to the respective standard frequencies and an intensity array for storing intensity values corresponding to the respective standard frequencies;
A section signal extraction step of setting a plurality of unit sections on the time axis of a given acoustic signal and extracting the acoustic signal in each unit section as a section signal;
A correlation calculation step of calculating a correlation value corresponding to each standard frequency by obtaining a correlation with the plurality of standard periodic functions for the section signal, and storing each correlation value in the correlation array;
An intensity initialization step for setting all values of the intensity array to 0;
An intensity update step of selecting the largest correlation value in the correlation array and adding it to the intensity value of the corresponding standard frequency in the intensity array;
A section signal update step of calculating a difference signal by subtracting from the section signal the inclusion signal generated as the product of the selected maximum correlation value and the corresponding standard periodic function, and updating the section signal by using the difference signal as the new section signal;
A correlation update step of calculating the correlation value between the section signal and the standard periodic function that is a component of the inclusion signal, updating the correlation value of the corresponding standard frequency with the calculated value, then selecting the standard periodic function corresponding to the maximum correlation value in the updated correlation array and updating the correlation value between the section signal and that standard periodic function as the correlation value of the corresponding standard frequency;
An intensity array determination step of determining the intensity array obtained by repeatedly executing the intensity update step, the section signal update step, and the correlation update step as the intensity values corresponding to all the standard frequencies in the unit section;
the intensity array being determined for every unit section so that intensity arrays are obtained for all unit sections of the acoustic signal;
wherein the correlation calculation step sets a predetermined number of fine frequencies defined at narrow intervals between the standard frequencies and, for each standard frequency whose correlation value is stored in the correlation array, calculates a fine correlation value between each fine frequency and the section signal; takes as the correlation value of said standard frequency the maximum fine correlation value among the plurality of fine frequencies located in the standard frequency range between said standard frequency and the adjacent standard frequencies next to it; and, when the fine frequency corresponding to said correlation value is close to either end of the standard frequency range, sets the correlation value of said standard frequency in the correlation array to 0.

The frequency analysis method according to any one of claims 1 to 3, further comprising an encoding step of selecting a predetermined number of standard frequencies having high intensity values from the intensity array determined in the intensity array determination step, and generating code data comprising four pieces of information: pitch information corresponding to each selected standard frequency, sound intensity information corresponding to its intensity value, a sound start time corresponding to the start point of each unit section, and a sound end time corresponding to the end point of each unit section.
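The encoding step of the last claim can be sketched as follows. This is only an illustration under stated assumptions: the intensity array of one unit section is represented as a dictionary keyed by note number, and each code is a plain dictionary holding the four pieces of information. The names `encode_section` and `max_notes` and the dictionary keys are hypothetical. Any mapping of the intensity value to a specific MIDI parameter is left out, since it is not specified in this part of the text.

```python
def encode_section(intensity, start_time, end_time, max_notes=3):
    """Select the strongest standard frequencies of one unit section and emit codes.

    intensity  -- {note number: intensity value} for the unit section
    start_time -- start point of the unit section (sound start time)
    end_time   -- end point of the unit section (sound end time)
    max_notes  -- the predetermined number of frequencies to select (assumed default)
    """
    # Rank the standard frequencies from highest to lowest intensity value.
    ranked = sorted(intensity.items(), key=lambda item: item[1], reverse=True)
    codes = []
    for note, value in ranked[:max_notes]:
        if value <= 0.0:
            continue  # no component was detected at this standard frequency
        codes.append({
            "note": note,          # pitch information
            "intensity": value,    # sound intensity information
            "start": start_time,   # sound start time
            "end": end_time,       # sound end time
        })
    return codes
```

Calling `encode_section` once per unit section and concatenating the results gives a code sequence for the whole acoustic signal. Merging identical notes across consecutive unit sections is outside the scope of this sketch.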