JP4968421B2

JP4968421B2 - Time series signal analyzer

Info

Publication number: JP4968421B2
Application number: JP2001301084A
Authority: JP
Inventors: 敏雄茂出木
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2001-09-28
Filing date: 2001-09-28
Publication date: 2012-07-04
Anticipated expiration: 2021-09-28
Also published as: JP2003108185A

Abstract

PROBLEM TO BE SOLVED: To provide a time-series signal analyzing device that can perform a more accurate analysis by taking the cyclic variation of a heart sound signal into consideration. SOLUTION: A frequency analyzing means 2 performs a frequency analysis of the heart sound signal and generates a plurality of pieces of phoneme data, each composed of a pair of a frequency and an intensity, and a block phoneme generating means 3 generates block phonemes from the generated phoneme data. Meanwhile, a fundamental cycle calculating means 4 calculates the fundamental cycle of the heart sound signal by performing an autocorrelation analysis of the heart sound signal. A block phoneme classifying means 5 classifies block phonemes that have similar frequencies and whose start times differ by close to an integral multiple of the fundamental cycle into the same group. The classified block phonemes are output, group by group, to an output means 9.

Description

【０００１】
【産業上の利用分野】
本発明は、心音、心電図等の生体信号、その他音響信号を含む時系列信号の解析技術に関する。
【０００２】
【従来の技術】
音響信号に代表される時系列信号には、その構成要素として複数の周期信号が含まれている。このため、与えられた時系列信号にどのような周期信号が含まれているかを解析する手法は、古くから知られている。例えば、フーリエ解析は、与えられた時系列信号に含まれる周波数成分を解析するための方法として広く利用されている。
【０００３】
このような時系列信号の解析方法を利用すれば、音響信号を符号化することも可能である。コンピュータの普及により、原音となるアナログ音響信号を所定のサンプリング周波数でサンプリングし、各サンプリング時の信号強度を量子化してデジタルデータとして取り込むことが容易にできるようになってきており、こうして取り込んだデジタルデータに対してフーリエ解析などの手法を適用し、原音信号に含まれていた周波数成分を抽出すれば、各周波数成分を示す符号によって原音信号の符号化が可能になる。
【０００４】
一方、電子楽器による楽器音を符号化しようという発想から生まれたＭＩＤＩ（Musical Instrument Digital Interface）規格も、パーソナルコンピュータの普及とともに盛んに利用されるようになってきている。このＭＩＤＩ規格による符号データ（以下、ＭＩＤＩデータという）は、基本的には、楽器のどの鍵盤キーを、どの程度の強さで弾いたか、という楽器演奏の操作を記述したデータであり、このＭＩＤＩデータ自身には、実際の音の波形は含まれていない。そのため、実際の音を再生する場合には、楽器音の波形を記憶したＭＩＤＩ音源が別途必要になるが、その符号化効率の高さが注目を集めており、ＭＩＤＩ規格による符号化および復号化の技術は、現在、パーソナルコンピュータを用いて楽器演奏、楽器練習、作曲などを行うソフトウェアに広く採り入れられている。
【０００５】
そこで、音響信号に代表される時系列信号に対して、所定の手法で解析を行うことにより、その構成要素となる周期信号を抽出し、抽出した周期信号をＭＩＤＩデータを用いて符号化しようとする提案がなされている。例えば、特開平１０−２４７０９９号公報、特開平１１−７３１９９号公報、特開平１１−７３２００号公報、特開平１１−９５７５３号公報、特開２０００−９９００９号公報、特開２０００−９９０９２号公報、特開２０００−９９０９３号公報、特開２０００−２６１３２２号公報、特開２００１−５４５０号公報、特開２００１−１４８６３３号公報には、任意の時系列信号について、構成要素となる周波数を解析し、その解析結果からＭＩＤＩデータを作成することができる種々の方法が提案されている。
【０００６】
【発明が解決しようとする課題】
特に、特願２０００−３５１７７５号明細書においては、心音をＭＩＤＩ符号に変換し、得られたＭＩＤＩ符号をブロック化して各ブロックに心音属性を割り当てる手法について提案した。この手法では、同時に発生する複数の周波数要素を基にその時刻の統一周波数要素を算出し、時系列に連続する統一周波数要素をブロック化して、このブロックデータに心音属性を割り当てるようにしている。そのため、心音の周期的な変動により誤った心音属性を割り当ててしまう場合が生じるという問題がある。
【０００７】
上記のような点に鑑み、本発明は、心音信号の周期的な変動を考慮して、より正確な解析を行なうことが可能な時系列信号解析装置を提供することを課題とする。
【０００８】
【課題を解決するための手段】
上記課題を解決するため、本発明では、時系列信号解析装置として、与えられた時系列信号に対して設定された所定の単位区間ごとに周波数および強度の組で構成される複数の音素データを生成する周波数解析手段と、生成された前記複数の音素データのうち、所定の強度を満たすものを抽出し、抽出された複数の音素データの開始時刻・終了時刻・周波数の類似性を基に、開始時刻・終了時刻・周波数・強度のパラメータで構成されるブロック音素を作成するブロック音素作成手段と、前記時系列信号に対して自己相関解析を行なうことにより、当該時系列信号の基本周期を算出する基本周期算出手段と、前記複数のブロック音素の中で周波数が類似しており、かつ開始時刻の差が前記算出した基本周期の整数倍に近いものを、同一のグループになるように分類するブロック音素分類手段と、前記グループ化された各ブロック音素に対して、所定のデータベースで定義された属性情報を付与する属性情報付与手段と、前記属性情報が付与されたブロック音素を出力するための出力手段を有する構成とし、前記属性情報付与手段が、前記データベースに合致するブロック音素に対して属性情報を付与するとともに、当該ブロック音素と同一グループに属する全てのブロック音素に対して、前記ブロック音素に付与された属性情報と同一の属性情報を付与するものとしたことを特徴とする。
【０００９】
本発明によれば、時系列信号の解析を行なって得た複数の音素データの類似性に基づいてブロックデータを作成する一方、時系列信号に対して自己相関解析を行なって時系列信号の周期を算出し、作成したブロックのうち、周期間隔で離れているものを同一のグループとして、全ブロックデータを複数のブロックに分類し、心音波形を各ブロックの特性と周期を含めて心音属性と対応付けて登録した知識データベースを利用することにより、心音属性の付与までを自動的に行うことが可能となる。
【００１０】
【発明の実施の形態】
以下、本発明の実施形態について図面を参照して詳細に説明する。
（1.時系列信号の解析の基本原理）
はじめに、時系列信号の解析に一部利用される周波数解析の基本原理を述べておく。この基本原理は、前掲の各公報あるいは明細書に開示されているので、ここではその概要のみを簡単に述べることにする。図１の上段に示すように、時系列の強度信号としてアナログ音響信号が与えられたものとする。図示の例では、横軸に時間軸ｔ、縦軸に信号強度Ａをとってこの音響信号を示している。本発明では、まずこのアナログ音響信号を、デジタルの音響信号として取り込む処理を行う。これは、従来の一般的なＰＣＭの手法を用い、所定のサンプリング周波数でこのアナログ音響信号をサンプリングし、信号強度Ａを所定の量子化ビット数を用いてデジタルデータに変換する処理を行えばよい。ここでは、説明の便宜上、ＰＣＭの手法でデジタル化した音響信号の波形も、図１の上段のアナログ音響信号と同一の波形で示すことにする。
【００１１】
次に、このデジタル音響信号の時間軸ｔ上に複数の単位区間を設定する。図示の例では、６つの単位区間Ｕ１〜Ｕ６が設定されている。第ｉ番目の単位区間Ｕｉは、時間軸ｔ上の始端ｓｉおよび終端ｅｉの座標値によって、その時間軸ｔ上での位置と長さとが示される。たとえば、単位区間Ｕ１は、始端ｓ１〜終端ｅ１までの（ｅ１−ｓ１）なる長さをもつ区間である。
【００１２】
こうして、複数の単位区間が設定されたら、個々の単位区間内の音響信号に基づいて、個々の単位区間を代表する所定の代表周波数および代表強度を定義する。ここでは、第ｉ番目の単位区間Ｕｉについて、代表周波数Ｆｉおよび代表強度Ａｉが定義された状態が示されている。たとえば、第１番目の単位区間Ｕ１については、代表周波数Ｆ１および代表強度Ａ１が定義されている。代表周波数Ｆ１は、始端ｓ１〜終端ｅ１までの区間に含まれている音響信号の周波数成分の代表値であり、代表強度Ａｉは、同じく始端ｓ１〜終端ｅ１までの区間に含まれている音響信号の信号強度の代表値である。単位区間Ｕ１内の音響信号に含まれる周波数成分は、通常、単一ではなく、信号強度も変動するのが一般的である。従来の手法とも共通する基本原理のポイントは、１つの単位区間について、１つもしくは複数の代表周波数と代表強度を定義し、これら代表値を用いて符号化を行う点にある。
【００１３】
すなわち、個々の単位区間について、それぞれ代表周波数および代表強度が定義されたら、時間軸ｔ上での個々の単位区間の始端位置および終端位置を示す情報と、定義された代表周波数および代表強度を示す情報と、により符号データを生成し、個々の単位区間の音響信号を個々の符号データによって表現するのである。単一の周波数をもち、単一の信号強度をもった音響信号が、所定の期間だけ持続する、という事象を符号化する手法として、ＭＩＤＩ規格に基づく符号化を利用することができる。ＭＩＤＩ規格による符号データ（ＭＩＤＩデータ）は、いわば音符によって音を表現したデータということができ、図１では、下段に示す音符によって、最終的に得られる符号データの概念を示している。
【００１４】
結局、各単位区間内の音響信号は、代表周波数Ｆ１に相当する音程情報（ＭＩＤＩ規格におけるノートナンバー）と、代表強度Ａ１に相当する強度情報（ＭＩＤＩ規格におけるベロシティー）と、単位区間の長さ（ｅ１−ｓ１）に相当する長さ情報（ＭＩＤＩ規格におけるデルタタイム）と、をもった符号データに変換されることになる。このようにして得られる符号データの情報量は、もとの音響信号のもつ情報量に比べて、著しく小さくなり、飛躍的な符号化効率が得られることになる。なお、代表周波数をノートナンバーｎの関数ｆ（ｎ）としたときに、代表周波数ｆ（ｎ）とノートナンバーｎ（０≦ｎ≦１２７）との関係は、以下に示す〔数式１〕で定義されることになる。
【００１５】
〔数式１〕
f(n) = 440 \times 2^{\gamma(n)}
\gamma(n) = (n - 69) / 12
【００１６】
また、周波数と周期は互いに逆数の関係にあり、一方が定まれば他方も一義的に定まり、上記〔数式１〕により周波数が定まればノートナンバーも定まる。したがって、本発明においては、周波数・周期・ノートナンバーのいずれか１つの用語を用いている場合においても、他の２つについても同時に定まっているものとして説明していく。
【００１７】
上記のような各単位区間について、代表周波数および強度の組を算出する具体的な手法としては、短時間フーリエ解析、一般化調和解析手法、ゼロ交差点検出法等の周知の手法が利用できる。
【００１８】
（2.本発明に係る時系列信号の解析方法）
続いて、本発明に係る時系列信号解析の具体的な手法について説明する。図２は本発明による時系列信号の解析の概略を示すフローチャートである。図２に示すように、ＰＣＭの形式で取り込まれたデジタルの音響信号に対しては２通りの処理が行われる。まず、時系列信号の周波数解析を行なう（ステップＳ１）。ここで、時系列信号として取り込んだ心音波形の一例を図３に示す。ステップＳ１では、図３に示したような心音波形（信号）に対して、上記基本原理で図１を用いて説明したように単位区間を設定し、各単位区間について複数の周波数および強度の組を算出する。
【００１９】
周波数および強度の組の算出の具体的な手法としては、上述のように短時間フーリエ変換、一般化調和解析、ゼロ交差点検出法が利用できるが、本発明においては、ゼロ交差点検出法を適用するのが効果的である。特にステップＳ１における周波数解析においては、ゼロ交差点検出法を発展させた手法により周波数および強度の組の算出を行っている。このステップＳ１における周波数解析の詳細について図４のフローチャートを利用して説明する。
【００２０】
まず、入力された原音響信号に対して信号から主ゼロ交差点を検出する（ステップＳ１１）。主ゼロ交差点とは、音響信号が０レベルになる地点であるゼロ交差点のうち、補正処理が行われていない原音響信号に対して検出されるものを示す。ここで、図３に示した原音響信号の一部を時間軸方向に拡大したものを図５（ａ）に示す。ステップＳ１１においては、信号が０レベルとなるゼロ交差点を検出する。このゼロ交差点は、時刻で表現されることになる。図５（ａ）の原音響信号から検出された主ゼロ交差点を図５（ｂ）に示す。
【００２１】
次に、ステップＳ１１において検出された主ゼロ交差点にしたがって、単位区間を決定する（ステップＳ１２）。単位区間とは、周波数成分の抽出および符号データを作成するための基本となる区間であり、図１の単位区間Ｕに相当する。単位区間は、始点となる主ゼロ交差点ｔ１から終点となる主ゼロ交差点ｔ２までの区間として設定される。始点となる主ゼロ交差点ｔ１としては、単位区間として未だ設定されていない区間において、時間的に先頭の主ゼロ交差点が選出される。終点となる主ゼロ交差点ｔ２は、以下のようにして選出される。
【００２２】
図５（ａ）に示すように原音響信号の波形は、始点となるゼロ交差点から２つ移動した位置のゼロ交差点で１周期となっている。そのため、この整数倍のゼロ交差点を移動した位置を終点とするのが好ましい。また、安定した周波数値を得るため、終点をできるだけ始点から離れた位置にとり、その平均周期をこの単位区間の周期とすることが好ましい。そのために、まず始点となる主ゼロ交差点ｔ１から２つ移動した主ゼロ交差点までの距離を単位周期Ｔ_Bと設定する。終点となる主ゼロ交差点ｔ２としては、始点となる主ゼロ交差点ｔ１から２の整数倍だけ離れた主ゼロ交差点のうち以下の〔数式２〕を満たす範囲で、ｔ２−ｔ１が最大となるものを選出するようにする。
【００２３】
〔数式２〕
｜Ｎ_A−Ｎ_B｜ ≦ １
ただし、Ｎ＝４０×ｌｏｇ₁₀（１／４４０Ｔ）＋６９
Ｔ＝（ｔ２−ｔ１）×２／ｋ
【００２４】
上記〔数式２〕において、Ｎ_Aはｔ２−ｔ１間の平均周期Ｔ_Aにより定まるノートナンバー、Ｎ_Bは単位周期Ｔ_Bにより定まるノートナンバーである。また、ｋは、始点となる主ゼロ交差点から終点となる主ゼロ交差点までに移動した主ゼロ交差点の数であり、２の整数倍となっている。なお、上記〔数式２〕において、第２式は、周期ＴからノートナンバーＮを求めるためのものであり、第３式は、平均周期Ｔ_Aを求めるためのものであり、ｋ＝２とした場合に単位周期Ｔ_Bが得られる。上記〔数式２〕においては、終点となる主ゼロ交差点ｔ２を、始点となる主ゼロ交差点ｔ１から、４個分、６個分、８個分…というように２の整数倍だけ離れた主ゼロ交差点に仮設定して判断していく。結局、ステップＳ１２における処理は、単位周期Ｔ_Bで定まる周波数と、平均周期Ｔ_Aで定まる周波数をそれぞれノートナンバーに変換した際に、半音の差（ノートナンバーとしては±１の差）以内に収まるような主ゼロ交差点ｔ２を終点として選出する処理を行うことになる。単位区間の設定は音響信号の全区間に渡って行われる。後続する単位区間においては、先行する単位区間の終点となる主ゼロ交差点の次の主ゼロ交差点を、始点となる主ゼロ交差点として、上記説明と同様に処理を行っていく。
【００２５】
全単位区間の設定が行われたら、各単位区間について、主周波数成分の算出を行う（ステップＳ１３）。周波数成分とは、周期（周波数）および振幅を意味するものである。周期としては、単位区間における平均周期Ｔ_Aが設定され、振幅としては、単位区間内において、信号の絶対値が最大となる値Ａが設定される。なお、主周波数成分とは、周波数成分のうち、補正が行われていない原音響信号から抽出されたものをいい、ステップＳ１６で算出される副周波数成分と区別している。
【００２６】
上記ステップＳ１２およびステップＳ１３による処理、すなわち、ゼロ交差点を用いて単位区間を決定し、単位区間に対応する周波数成分を算出する処理については、従来から行われているものと同様である。ただし、これでは、１つの単位区間について１つの周波数成分しか抽出できない。本発明では、１つの単位区間から複数の周波数成分を抽出するために後述するようなステップＳ１４以下の処理を行っている。
【００２７】
ステップＳ１３において主周波数成分が算出されたら、次にその単位区間における音響信号から直流成分の除去を行う（ステップＳ１４）。ここでの、直流成分の除去は単位区間全体に渡って一律に行うのではなく、音響信号が０以上になっている部分については、当該部分の平均強度を各時刻の信号強度から減じるようにし、音響信号が０以下になっている部分についても、当該部分の平均強度を正負の符号をそのままにして信号強度から減じる（実際には平均強度は増加していることになる）ようにする。これにより、振幅の大きな部分が抑えられた補正音響信号が得られることになる。図５（ｃ）に示した原音響信号を補正した補正音響信号を図５（ｄ）に示す。
【００２８】
続いて、補正音響信号に対して、信号が０レベルと交差する副ゼロ交差点を検出する（ステップＳ１５）。これは、上記ステップＳ１１において、原音響信号に対して主ゼロ交差点を検出したのと全く同様な処理で行われる。ただし、図５（ｄ）に示したように、補正音響信号では、原音響信号に比べて明らかにゼロ交差点が増えているので、多くの副ゼロ交差点が検出されることになる。図５（ｄ）に示した補正音響信号から検出された副ゼロ交差点を図５（ｅ）に示す。
【００２９】
続いて、補正音響信号および検出された副ゼロ交差点に従って、副周波数成分の算出を行う（ステップＳ１６）。副周波数成分のうち、周期としては、単位区間における始点となる副ゼロ交差点ｔ３と終点となる副ゼロ交差点ｔ４で定まる平均周期が与えられる。また、副周波数成分のうち、振幅としては、単位区間内において、補正音響信号の絶対値が最大となる値が設定される。
【００３０】
上記ステップＳ１４〜ステップＳ１６の処理は、所定の回数だけ繰り返され、繰り返された回数分の副周波数成分が抽出されることになる。上記のようにして、各単位区間ごとに複数の周波数に対応する強度を検出でき、原音響信号の全区間に渡って処理することにより、設定された全単位区間について、周波数および強度の組を算出できる。
【００３１】
各単位区間の先頭を開始時刻、終点を終了時刻とすると、各単位区間ごとに開始時刻・終了時刻・周波数・強度の組が得られることになる。この開始時刻・終了時刻・周波数・強度の組を本明細書では、音素データと呼ぶことにする。ステップＳ１においては、その強度が所定値に達しない音素データを削除する処理も行う。その結果、残った音素データの様子を図６に示す。
【００３２】
図６においては、周波数を音の高さ、強度を音の強さとして、音素データを下向きの三角形で示している。音の高さは三角形の上辺の上下方向における位置で表されており、音の強さは三角形の高さで表されている。また、単位区間の長さは三角形の上辺の長さで表されている。ステップＳ１の周波数解析において選出される音素データは各単位区間について１６個程度とすることが好ましいが、図面が繁雑になるのを避けるため図６の例では各単位区間について３個ずつで示している。
【００３３】
各単位区間について音素データが選出されたら、選出された音素データに基づいてブロック音素を作成する（ステップＳ２）。ブロック音素の作成処理は、時間軸上で近接する音素データで、ノートナンバーおよびベロシティが類似しているものを連結させることにより行なわれる。そして、ブロック音素のノートナンバーとして各構成音素データの平均値が、ベロシティとして各構成音素データの最大値が与えられる。各単位区間については周波数解析により１６個程度の複数の音素データが抽出されることが通常であるが、その場合は時間軸に重複した複数のブロック音素が作成される。ただし、後段の属性情報付けが煩雑になるため、時間的に重複するブロック音素はベロシティが最も大きいものだけで代表させて１つに統合させる方法もとられる。以下はその具体的手法として、特願２０００−３５１７７５号明細書にも記載されている手法を示す。各単位区間ごとに統一周波数Ｎｒ（ｔ）および統一強度Ｖｒ（ｔ）を以下の〔数式３〕により算出することにより行われる。
【００３４】
〔数式３〕
N_r(t) = [\sum \{ n \times V_n(t) \}] / \sum V_n(t)
V_r(t) = [\sum \{ V_n(t) \}^2]^{1/2}
【００３５】
上記〔数式３〕において、ｎはノートナンバーであり、Ｖｎ（ｔ）はノートナンバーｎに対応するベロシティである。また、Σにより算出される総和は音素データ分行なわれる。すなわち、１６個の音素データが選出された場合は、１６個分の総和が算出されることになる。〔数式３〕により統一周波数Ｎｒ（ｔ）としては、各音素データを周波数軸と強度軸からなるグラフ上にプロットしたときの重心位置の周波数が算出されることになり、統一強度Ｖｒ（ｔ）としては、強度分布の二乗平均値が算出されることになる。このようにして各単位区間ごとに統一周波数および統一強度を有するブロック音素が得られる。さらに、隣接するブロック音素間において、統一周波数が類似する場合には、複数の単位区間にまたがって統合され、新たなブロック音素が得られる。統一周波数が類似するとは、あらかじめ設定された範囲内に周波数の差が含まれる場合である。統合後のブロック音素の周波数、強度としては、統合前のブロック音素の周波数、強度の平均値が与えられる。図６に示した音素データに対してステップＳ２の処理により得られたブロック音素の様子を図７に示す。図７の例では、音の高さ・音の強さは無視して表現している。
【００３６】
上記ステップＳ１・Ｓ２の処理とは別に、入力された音響信号に対しては、基本周期の算出を行う（ステップＳ３）。これは、自己相関解析を行うことにより行われる。具体的には、与えられた音響信号ｇ（ｔ）に対して、周期Ｔ_nを変化させて以下の〔数式４〕により相関値Ｒ（Ｔ_n）が最大となる周期Ｔ_nを求める。
【００３７】
〔数式４〕
Ｒ（Ｔ_n）＝Σ_tｇ（ｔ）×ｇ（ｔ＋Ｔ_n）
【００３８】
周期Ｔ_nとしては、各ノートナンバー（ｎ＝０〜１２７）に対応する１２８通りが与えられる。図８は、入力音響信号ｇ（ｔ）と周期Ｔ_n分ずらした音響信号ｇ（ｔ＋Ｔ_n）の関係を示す図である。図８（ａ）は入力音響信号ｇ（ｔ）を示し、図８（ｂ）は音響信号ｇ（ｔ＋Ｔ₁₂₇）、図８（ｃ）は音響信号ｇ（ｔ＋Ｔ₁₂₆）、図８（ｄ）は音響信号ｇ（ｔ＋Ｔ_n）を示している。このような音響信号ｇ（ｔ＋Ｔ_n）のうち、原音響信号ｇ（ｔ）との相関値Ｒ（Ｔ_n）が最大となるときのＴ_nを入力音響信号ｇ（ｔ）の周期とみなす。
【００３９】
続いて、上記ステップＳ２で求めたブロック音素を、ステップＳ３で求めた周期Ｔ_nに基づいて分類する（ステップＳ４）。具体的には、互いに周期Ｔ_nだけ離れたブロック音素のノートナンバーの比較を行う。このノートナンバーの差が設定された所定値の範囲に収まる場合に両ブロックを同一のチャンネルに分類する。分類するチャンネル数は、ブロック音素の数や、同一とみなすノートナンバーの差の範囲により異なるが、最終的な出力データのチャンネル数よりも多めのチャンネルに分けるようにする。例えば、図７に示したようなブロックデータについては、図９に示すように８つのチャンネルに分類される。
【００４０】
次に、分類された各チャンネルのブロック音素に心音属性を付与する（ステップＳ５）。具体的には、あらかじめ、心音に関するデータベースを用意しておき、このデータベースに記憶されているブロック音素パターンと、分類された各チャンネルのブロック音素パターンとを比較し、類似していると判断された場合に、データベースに登録されている心音属性を、そのチャンネルに対して付与する。ブロック音素パターンとは、各ブロック音素を作成する基になった音素データの配列パターンである。各ブロック音素に対しては、具体的には、I音、II音、その他の３種類の属性情報を与える。この判定は、次の４つのルールに基づいて行なわれる。
【００４１】
音の高さ：II音はI音に比べて音高が高い。（II音＞I音）
音の長さ：I音およびII音とも音の長さが短く固定しており、II音はI音より短い。（I音およびII音＜長さ閾値、かつI音＞II音）
音の強さ：I音およびII音とも他の成分に比べて強く、聴取するセンサの位置によりI音とII音のバランスが変化する。例外として、逆流性心雑音などはI音およびII音より強い場合がある。（I音およびII音＞強さ閾値）
音の間隔：I音とII音の間隔はII音とI音の間隔に比べて短い。心拍数が増えるとII音とI音の間隔は短くなるが、I音とII音の間隔は変化しない、すなわちI音とII音の間隔に近づく。（I-II間隔＜II-I間隔）
【００４２】
この結果、図９に示したブロック音素には、各チャンネルごとに、図１０に示すような属性情報が付与されることになる。続いて、さらに詳細な情報を記録したデータベースを参照して各チャンネルに対して対応する属性情報の付与を行なう。ブロック音素を作成する基になった音素データの配列パターンに基づき、まず、I音およびII音の属性情報が付与されたブロック音素については、その詳細分類を、その他の属性情報が付与されたブロック音素については、具体的な属性情報を、その音素データの配列パターンから、詳細属性情報が登録されたデータベースを参照してIII音、IV音（またはIIIとIVの重合音）、心雑音（収縮期か拡張期かまたは連続性か、逆流性か駆動性かといったサブ分類も定義されている）、クリック音、摩擦音などのいずれであるかが決定され、該当する詳細属性情報が付与される。特に第I音については、僧帽弁成分・三尖弁成分・大動脈弁成分のいずれであるか、第II音については、大動脈弁成分・肺動脈弁成分のいずれであるかを特定することができる。各チャンネルごとに詳細の属性情報を付与したブロック音素を図１１に示す。
【００４３】
上記ステップＳ５において、各チャンネルのデータが、どのような心音属性であるかを特定することができるが、自動で行わず、人が特定するようにすることもできる。この場合、図９に示すようなブロック音素が複数のチャンネルに分類された状態が画面等に表示される。図９の例では、ブロック化された状態を示しているが、実際には、図６に示したような音素データを重ねた状態が表示される。そのような表示を見ると、各ブロック音素を構成する音素データの状態と、周期の両方を同時に確認することができる。
【００４４】
医者等の心音波形による心音属性の判断が可能な者は、この周波数要素を見ながら、各チャンネルに対して心音属性の割り当てを行う。図９に示すように、従来の心音波形と異なり、所定の周期間隔で現れるブロック音素についても、同一チャンネルに現れるので、より正確な判断が可能となる。例えば、１つのブロック音素だけを見ると、心音成分であると判断されそうなものであっても、周期的に現れていないものはノイズであると判断して、除外することができる。
【００４５】
続いて、属性情報が付与されたブロック音素をさらに構造文書化する（ステップＳ６）。構造化文書としてＸＭＬ（eXtensible Markup Language）規格を採用した場合のソースコードの一例を図１２に示す。図１２に示す例は、図１１に示したブロック音素の先頭から心音の１周期分を構造文書化したものであり、３７行で記述されている。なお、図１２に記述されているブロック音素はＭＩＤＩに準拠して記述されている。ＸＭＬは基本的に１対のタグで実データが囲まれるような形式となっている。図１２において、先頭の１行目と最終の３７行目の一対のタグにより、文書の開始と終了が定義されている。２行目に記述された一対のタグでは、この文書の開始から終了までに記載されているＭＩＤＩ規格準拠のイベントデータ（<Event>タグで定義）に定義されている発音開始時刻（<StartTime>タグで定義）および発音終了時刻（<EndTime>タグで定義）の時間の単位（１秒あたりの分解能）が定義されている。３行目の<HeartCycle>タグと３６行目の終了タグ</HeartCycle>は心音の一周期分のデータが以下記載されていることを指示している。４行目から１４行目は心音第I音<FirstSound>を記述したものであり、４行目と１４行目は心音第I音であることを示す一対のタグである。この心音第I音は、さらに、３つの詳細属性に分類されている。５行目から７行目は僧帽弁成分Ｍ１を、８行目から１０行目は三尖弁成分Ｔ１を、１１行目から１３行目は大動脈弁成分を、それぞれ示している。各分類は、同様の構成となっており、例えば、僧帽弁成分Ｍ１の場合、６行目と７行目はそれぞれ統一符号データをＭＩＤＩ規格に準拠して記述したものである。例えば、６行目の<StartTime>、<EndTime>はＭＩＤＩ規格ではデルタタイムという相対時刻で表現されている時刻を絶対時刻で記述しており、時刻「１０」から時刻「２０」まで発音されることを示している。また、７行目の<Pitch>はＭＩＤＩ規格のノートナンバーに対応しており、ノートナンバー「３２」に対応する音高で発音されることを示している。７行目の<Level>はＭＩＤＩ規格のベロシティに対応しており、ベロシティ「６０」に対応する音の強さで発音されることを示している。
【００４６】
作成された構造化文書がＸＭＬ形式である場合は、インターネット・ブラウザで閲覧できると共に、専用プレイヤーソフトウェアとＭＩＤＩ音源を用いて心音を音響信号として再生することができる。具体的には、作成された構造化文書をＷＷＷサーバに登録しておき、ユーザは自分のパソコンでブラウザを起動してインターネットでＷＷＷサーバにアクセスし、ＸＭＬ文書を得る。次に、ブラウザにプラグインされているＭＩＤＩシーケンサソフトが、ＸＭＬ文書に記録されているＭＩＤＩデータに従ってＭＩＤＩ音源を制御しながら、音響信号の再生を行なう。このように心音を符号化してＸＭＬ形式で記録することは、インターネットを介して流通するのに便利であったり、ＭＩＤＩ音源を用いて再生するのに適しているだけでなく、近年ＸＭＬ形式を用いて電子カルテとして電子化が行なわれているカルテ情報との整合性の観点から見ても有効なものである。
【００４７】
心音の属性が付与されたＭＩＤＩデータを電子カルテとして保存すると、ＭＩＤＩ音源で心音を再生できるという前述の効果以外に、以下で述べるように文書から診断に必要な定量的病態を読み取ることができるという効果がある。例えば、I音とII音などには亢進、減弱、分裂という病態があるが、これらの病態は属性で囲まれたＭＩＤＩイベントデータの数値から判断できる。すなわち、<Level>タグで囲まれたベロシティ（強さ）または<Pitch>タグで囲まれたノートナンバー（高さ）が所定値より高ければ亢進であり、所定値より低ければ減弱である。そして、図１２のI音およびII音の例では時間軸上近接する２つのイベント（三角形で表現された音符）で構成されているが、これらが時間的に離れれば分裂と判断できる。すなわち前のイベントの<EndTime>タグの値と後のイベントの<StartTime>タグの値との差が所定値以上であれば分裂と判断できる。また心雑音の臨床診断でしばしばレビン（Levine）の分類（クラス１〜６まであり、１は聴診では聴取が困難で心音図でないと判別できない程度の弱い雑音で、６は聴診器をあてなくても側にいるだけで聞こえる強い雑音）が用いられるが、これも<Level>タグで囲まれたベロシティ（強さ）から判別できる。
【００４８】
次に上述した本発明に係る時系列信号の解析方法を実現するための装置について説明する。図１３は、時系列信号解析装置の一実施形態を示す機能ブロック図である。図１３において、１は時系列信号取得手段、２は周波数解析手段、３はブロック音素作成手段、４は基本周期算出手段、５はブロック音素分類手段、６は属性情報付与手段、７は属性データベース、８は詳細属性データベース、９は出力手段、１０は構造化文書作成手段である。図１３に示す時系列信号解析装置は現実にはコンピュータにオーディオ関係の周辺機器を接続し、専用のプログラムを搭載することにより実現される。
【００４９】
時系列信号取得手段１は、時系列信号を取得するためのものであり、心音信号を取得する場合には、聴診器にマイクを取り付け、取得した心音をＰＣＭ等のデジタル信号に変換するサンプリング機器により実現される。周波数解析手段２は取得された時系列信号の周波数解析を行なって音素データを作成する機能を有し、具体的には、図２のステップＳ１の処理をコンピュータプログラムにしたがって実行する。ブロック音素作成手段３は、周波数解析手段２により得られる音素データを基にブロック音素を作成する機能を有し、具体的には、図２のステップＳ２の処理をコンピュータプログラムにしたがって実行する。基本周期算出手段４は、時系列信号の自己相関解析を行って基本周期を算出する機能を有し、具体的には、図２のステップＳ３の処理をコンピュータプログラムにしたがって実行する。ブロック音素分類手段５は、ブロック音素作成手段３により作成されたブロック音素と、基本周期算出手段４により算出された基本周期を基にブロック音素を複数のグループに分類する機能を有し、具体的には、図２のステップＳ４の処理をコンピュータプログラムにしたがって実行する。属性情報付与手段６は、あらかじめ設定されたルールにしたがって、ブロック音素が分類された各チャンネルに対して心音の属性情報を付与すると共に、ブロック音素を作成する基になった音素データの配列パターンに基づいて、詳細属性データベース８から抽出した詳細属性情報をさらに付与する機能を有する。具体的には、図２のステップＳ５の処理をコンピュータプログラムにしたがって実行することになる。属性データベース７は、第I音、第II音、その他を分類するために、音素データの配列パターンと各心音属性の対応を記録したデータベースであり、詳細属性データベース８は、第I音、第II音をさらに詳細に特定するため、第I音については、僧帽弁成分・三尖弁成分・大動脈弁成分のそれぞれと対応する音素データの配列パターンを記録し、第II音については、大動脈弁成分・肺動脈弁成分のそれぞれと対応する音素データの配列パターンを記録しており、さらに「その他」と分類された心音を、第III音、第IV音（またはIIIとIVの重合音）、心雑音（収縮期か拡張期かまたは連続性か、逆流性か駆動性かといったサブ分類も定義されている）、クリック音、摩擦音に対応するものに特定するための音素データの各配列パターンを記録している。出力手段９は分類されたブロック音素を表示・印刷する機能を有する。ここで、出力されたブロック音素は、上記図２のステップＳ５において、心音属性を自動付与するのではなく、医療関係者が心音属性を付与するのに役立つ。チャンネルごとに分類されたブロック音素を見ることにより、基本周期を考慮した心音属性の付与が可能となる。具体的に出力手段９は、ＣＲＴ、液晶等のディスプレイ装置、モノクロまたはカラーの各種印刷方式のプリンタ装置で実現される。構造化文書作成手段１０は、心音の属性情報が付与されたブロック音素をＸＭＬ規格等のテキスト情報に変換する機能を有しており、具体的には、図２のステップＳ６の処理をコンピュータプログラムにしたがって実行する。
【００５０】
以上、本発明の好適な実施形態について説明したが、本発明は上記実施形態に限定されず種々の変形が可能である。上記実施形態では、時系列信号として心音を解析する場合について説明したが、呼吸音などその他生体音響信号、心電図など非音響の生体時系列信号の解析にも適用でき、その他の分野では、例えばエンジン音などの音響信号を時系列信号として解析し、自動車のエンジンの故障診断などに適用することも可能である。
【００５１】
【発明の効果】
以上、説明したように本発明によれば、与えられた時系列信号に対して設定された所定の単位区間ごとに時系列信号に対して周波数解析を行うことにより、各単位区間に対して周波数および強度の組で構成される複数の音素データを作成し、作成された前記複数の音素データのうち、所定の強度を満たすものを抽出し、抽出された複数の音素データの開始時刻・終了時刻・周波数の類似性を基に、開始時刻・終了時刻・周波数・強度のパラメータで構成されるブロック音素を作成し、時系列信号に対して自己相関解析を行なうことにより、この時系列信号の基本周期を算出し、複数のブロック音素の中で周波数が類似しており、かつ開始時刻の差が、算出した基本周期の整数倍に近いものを、同一のグループになるように分類し、分類されたグループ単位で、心音波形を各ブロックの特性と周期を含めて心音属性と対応付けて登録した知識データベースを利用することにより、心音属性の付与までを自動的に行うことが可能となる。
【図面の簡単な説明】
【図１】本発明の時系列信号解析装置における信号解析の基本原理を示す図である。
【図２】本発明による時系列信号の解析の概略を示すフローチャートである。
【図３】時系列信号として取り込んだ心音波形の一例を示す図である。
【図４】図２のステップＳ１の詳細を示すフローチャートである。
【図５】音響信号における単位区間の設定および周期・振幅の決定を説明するための図である。
【図６】図３に示した心音波形から周波数解析により生成された音素データを示す図である。
【図７】図６に示した音素データを基に算出されたブロック音素を示す図である。
【図８】入力音響信号ｇ（ｔ）と周期Ｔ_n分ずらした音響信号ｇ（ｔ＋Ｔ_n）の関係を示す図である。
【図９】図７に示したブロック音素を、基本周期に基づいて分類した状態を示す図である。
【図１０】図９に示した各ブロック音素に対して属性情報を付与した状態を示す図である。
【図１１】図１０に示した各ブロック音素に対してさらに詳細な属性情報を付与した状態を示す図である。
【図１２】構造化文書としてＸＭＬ規格を採用した場合のソースコードを示す図である。
【図１３】本発明による時系列信号解析装置の一実施形態を示す機能ブロック図である。
【符号の説明】
１・・・時系列信号取得手段
２・・・周波数解析手段
３・・・ブロック音素作成手段
４・・・基本周期算出手段
５・・・ブロック音素分類手段
６・・・属性情報付与手段
７・・・属性データベース
８・・・詳細属性データベース
９・・・出力手段
１０・・・構造化文書作成手段
[0001]
[Industrial application fields]
The present invention relates to a technique for analyzing time-series signals including biological signals such as heart sounds and electrocardiograms and other acoustic signals.
[0002]
[Prior art]
A time-series signal, of which an acoustic signal is a typical example, contains a plurality of periodic signals as its constituent elements. For this reason, methods for analyzing which periodic signals are contained in a given time-series signal have long been known. For example, Fourier analysis is widely used as a method for analyzing the frequency components contained in a given time-series signal.
[0003]
By using such a time-series signal analysis method, an acoustic signal can also be encoded. With the spread of computers, it has become easy to sample an analog acoustic signal serving as the original sound at a predetermined sampling frequency, quantize the signal intensity at each sampling instant, and capture it as digital data. If a method such as Fourier analysis is applied to the digital data captured in this way and the frequency components contained in the original sound signal are extracted, the original sound signal can be encoded by codes indicating the respective frequency components.
[0004]
Meanwhile, the MIDI (Musical Instrument Digital Interface) standard, which grew out of the idea of encoding the sounds of electronic musical instruments, has come into widespread use with the spread of personal computers. Code data according to the MIDI standard (hereinafter referred to as MIDI data) is basically data describing the operations of a musical performance, such as which key of the instrument was played and with what strength; the MIDI data itself does not contain the actual sound waveform. Therefore, reproducing the actual sound separately requires a MIDI sound source that stores the waveforms of instrument sounds. Nevertheless, the high encoding efficiency of MIDI is attracting attention, and encoding and decoding techniques based on the MIDI standard are now widely adopted in software that uses a personal computer for musical performance, instrument practice, composition, and the like.
[0005]
Accordingly, proposals have been made to analyze a time-series signal, typified by an acoustic signal, by a predetermined method so as to extract the periodic signals that are its constituent elements, and to encode the extracted periodic signals using MIDI data. For example, JP-A-10-247099, JP-A-11-73199, JP-A-11-73200, JP-A-11-95753, JP-A-2000-99009, JP-A-2000-99092, JP-A-2000-99093, JP-A-2000-261322, JP-A-2001-5450, and JP-A-2001-148633 propose various methods that analyze the constituent frequencies of an arbitrary time-series signal and create MIDI data from the analysis results.
[0006]
[Problems to be solved by the invention]
In particular, the specification of Japanese Patent Application No. 2000-351775 proposed a method of converting heart sounds into MIDI codes, grouping the obtained MIDI codes into blocks, and assigning a heart sound attribute to each block. In this method, a unified frequency element for each time is calculated from the plurality of frequency elements occurring simultaneously, unified frequency elements that are consecutive in time series are grouped into blocks, and heart sound attributes are assigned to the resulting block data. As a result, there is a problem that an incorrect heart sound attribute may be assigned because of periodic fluctuations of the heart sound.
[0007]
In view of the above points, it is an object of the present invention to provide a time-series signal analyzing apparatus capable of performing more accurate analysis in consideration of periodic fluctuations of a heart sound signal.
[0008]
[Means for Solving the Problems]
In order to solve the above problem, the present invention provides a time-series signal analyzing apparatus comprising: frequency analysis means for generating, for each predetermined unit section set on a given time-series signal, a plurality of phoneme data each composed of a set of frequency and intensity; block phoneme creation means for extracting, from the generated plurality of phoneme data, those satisfying a predetermined intensity and creating block phonemes composed of the parameters start time, end time, frequency, and intensity on the basis of the similarity of the start time, end time, and frequency of the extracted phoneme data; basic period calculation means for calculating the basic period of the time-series signal by performing autocorrelation analysis on the time-series signal; block phoneme classification means for classifying, among the plurality of block phonemes, those that are similar in frequency and whose difference in start time is close to an integral multiple of the calculated basic period into the same group; attribute information giving means for giving attribute information defined in a predetermined database to each grouped block phoneme; and output means for outputting the block phonemes to which the attribute information has been given, wherein the attribute information giving means gives attribute information to a block phoneme that matches the database and gives the same attribute information to all block phonemes belonging to the same group as that block phoneme.
[0009]
According to the present invention, block data are created on the basis of the similarity of a plurality of phoneme data obtained by analyzing a time-series signal, while the period of the time-series signal is calculated by performing autocorrelation analysis on the time-series signal. Among the created blocks, those separated by the period interval are treated as the same group, so that all block data are classified into a plurality of groups. By using a knowledge database in which heart sound waveforms are registered in association with heart sound attributes, including the characteristics and period of each block, the process up to the assignment of heart sound attributes can be performed automatically.
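The two steps summarized above, estimating the basic period by autocorrelation and grouping block phonemes by that period, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `BlockPhoneme` class, the function names, the caller-supplied `candidate_lags`, the tolerances `note_tol` and `time_tol`, and the greedy comparison against the first member of each group are all hypothetical choices, not details given in the patent. The correlation is the plain sum of g(t) × g(t + T) over t.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BlockPhoneme:
    """Hypothetical container for the four block phoneme parameters."""
    start: float      # start time in seconds
    end: float        # end time in seconds
    note: float       # frequency expressed as a note number
    intensity: float  # intensity (velocity)


def basic_period(signal: Sequence[float], candidate_lags: Sequence[int]) -> int:
    """Return the candidate lag (in samples) with the largest autocorrelation."""
    def correlation(lag: int) -> float:
        # R(T) = sum over t of g(t) * g(t + T), restricted to valid indices.
        return sum(signal[t] * signal[t + lag] for t in range(len(signal) - lag))
    return max(candidate_lags, key=correlation)


def group_blocks(blocks: Sequence[BlockPhoneme], period: float,
                 note_tol: float = 1.0, time_tol: float = 0.05) -> List[List[BlockPhoneme]]:
    """Group blocks with similar note numbers whose start times differ by
    approximately an integer multiple of the basic period (in seconds)."""
    groups: List[List[BlockPhoneme]] = []
    for block in sorted(blocks, key=lambda b: b.start):
        for group in groups:
            ref = group[0]  # assumption: compare against the first member only
            gap = block.start - ref.start
            multiple = round(gap / period)
            if (multiple >= 1
                    and abs(gap - multiple * period) <= time_tol
                    and abs(block.note - ref.note) <= note_tol):
                group.append(block)
                break
        else:
            # No existing group matched, so this block starts a new group.
            groups.append([block])
    return groups
```

Once one block in a group has been matched against the database, the same attribute can be copied to every other member of that group, which is the behavior the attribute information giving means is described as having.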
[0010]
[Embodiments of the Invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
(1. Basic principle of time series signal analysis)
First, the basic principle of the frequency analysis that is used as part of the time-series signal analysis will be described. Since this basic principle is disclosed in the publications and specification cited above, only an outline is given here. As shown in the upper part of FIG. 1, assume that an analog acoustic signal is given as a time-series intensity signal. In the illustrated example, the acoustic signal is plotted with the time axis t on the horizontal axis and the signal intensity A on the vertical axis. In the present invention, this analog acoustic signal is first captured as a digital acoustic signal. This can be done with the conventional, general PCM method: the analog acoustic signal is sampled at a predetermined sampling frequency, and the signal intensity A is converted into digital data using a predetermined number of quantization bits. Here, for convenience of explanation, the waveform of the acoustic signal digitized by the PCM method is also represented by the same waveform as the analog acoustic signal in the upper part of FIG. 1.
[0011]
Next, a plurality of unit sections are set on the time axis t of the digital acoustic signal. In the illustrated example, six unit sections U1 to U6 are set. The position and length of the i-th unit section Ui on the time axis t are indicated by the coordinate values of the start end si and end ei on the time axis t. For example, the unit section U1 is a section having a length of (e1-s1) from the start end s1 to the end e1.
[0012]
Once a plurality of unit sections have been set in this way, a representative frequency and a representative intensity are defined for each unit section on the basis of the acoustic signal within that section. The figure shows the state in which the representative frequency Fi and the representative intensity Ai have been defined for the i-th unit section Ui. For example, the representative frequency F1 and the representative intensity A1 are defined for the first unit section U1. The representative frequency F1 is a representative value of the frequency components of the acoustic signal contained in the section from the start end s1 to the end e1, and the representative intensity A1 is a representative value of the signal intensity of the acoustic signal contained in the same section. The acoustic signal within the unit section U1 usually contains more than one frequency component, and its signal intensity usually varies as well. The key point of the basic principle, which is shared with the conventional methods, is that one or more representative frequencies and representative intensities are defined for each unit section and encoding is performed using these representative values.
[0013]
That is, once the representative frequency and representative intensity have been defined for each unit section, code data are generated from information indicating the start and end positions of each unit section on the time axis t and information indicating the defined representative frequency and representative intensity, and the acoustic signal of each unit section is expressed by its own code data. Encoding based on the MIDI standard can be used to encode the event that an acoustic signal having a single frequency and a single signal intensity lasts for a predetermined period. Code data according to the MIDI standard (MIDI data) can be regarded as data that express sounds by musical notes, and the notes in the lower part of FIG. 1 illustrate the concept of the code data finally obtained.
[0014]
In short, the acoustic signal in each unit section is converted into code data having pitch information (the note number in the MIDI standard) corresponding to the representative frequency F1, intensity information (the velocity in the MIDI standard) corresponding to the representative intensity A1, and length information (the delta time in the MIDI standard) corresponding to the length of the unit section (e1 − s1). The amount of information in the code data obtained in this way is far smaller than that of the original acoustic signal, so a dramatic improvement in encoding efficiency is obtained. When the representative frequency is written as a function f(n) of the note number n, the relationship between the representative frequency f(n) and the note number n (0 ≦ n ≦ 127) is defined by [Formula 1] below.
[0015]
[Formula 1]
f(n) = 440 \times 2^{\gamma(n)}
\gamma(n) = (n - 69) / 12
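As a worked illustration of [Formula 1], the following Python sketch converts a note number to its frequency and back. The function names `note_to_frequency` and `frequency_to_note` are illustrative assumptions and do not appear in the patent; the inverse is obtained simply by solving [Formula 1] for n, and it returns a fractional note number rather than rounding.

```python
import math


def note_to_frequency(n: int) -> float:
    """Frequency in Hz for note number n, following [Formula 1]."""
    if not 0 <= n <= 127:
        raise ValueError("note number must satisfy 0 <= n <= 127")
    gamma = (n - 69) / 12
    return 440.0 * 2.0 ** gamma


def frequency_to_note(f: float) -> float:
    """Fractional note number for a frequency f in Hz (inverse of [Formula 1])."""
    if f <= 0:
        raise ValueError("frequency must be positive")
    return 69 + 12 * math.log2(f / 440.0)
```

For example, note number 69 maps to 440 Hz, and each step of 12 note numbers doubles or halves the frequency.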
[0016]
Further, frequency and period are reciprocals of each other, so once one is determined the other is uniquely determined, and once the frequency is determined the note number is also determined by [Formula 1] above. Accordingly, in the following description, wherever any one of the terms frequency, period, and note number is used, the other two are assumed to be determined at the same time.
[0017]
As a specific method for calculating a set of representative frequency and intensity for each unit section as described above, a well-known method such as a short-time Fourier analysis method, a generalized harmonic analysis method, or a zero crossing detection method can be used.
[0018]
(2. Time-series signal analysis method according to the present invention)
Next, the specific method of time-series signal analysis according to the present invention will be described. FIG. 2 is a flowchart outlining the analysis. As shown in FIG. 2, two kinds of processing are applied to the digital acoustic signal captured in PCM format. First, frequency analysis of the time-series signal is performed (step S1). FIG. 3 shows an example of a heart sound waveform captured as a time-series signal. In step S1, unit sections are set on a heart sound waveform (signal) such as that shown in FIG. 3, as described with reference to FIG. 1 under the basic principle, and a plurality of sets of frequency and intensity are calculated for each unit section.
[0019]
As noted above, short-time Fourier transform, generalized harmonic analysis, and the zero-crossing detection method can be used as specific methods for calculating the sets of frequency and intensity; in the present invention, applying the zero-crossing detection method is effective. In particular, the frequency analysis in step S1 calculates the sets of frequency and intensity by a method that extends the zero-crossing detection method. The details of the frequency analysis in step S1 are described below using the flowchart of FIG. 4.
[0020]
First, main zero crossings are detected from the input original acoustic signal (step S11). A main zero crossing is a zero crossing, that is, a point at which the acoustic signal reaches the 0 level, detected on the original acoustic signal to which no correction has been applied. FIG. 5(a) shows part of the original acoustic signal of FIG. 3 enlarged in the time axis direction. In step S11, the zero crossings at which the signal reaches the 0 level are detected. Each zero crossing is expressed as a time. FIG. 5(b) shows the main zero crossings detected from the original acoustic signal of FIG. 5(a).
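The zero crossing detection of step S11 can be sketched as follows for a sampled signal. The function name `find_zero_crossings`, the use of linear interpolation between neighboring samples, and the handling of samples that are exactly zero are assumptions made for this illustration; the text only states that each zero crossing is expressed as a time.

```python
from typing import List, Sequence


def find_zero_crossings(signal: Sequence[float], sample_rate: float) -> List[float]:
    """Return the times (in seconds) at which the signal changes sign."""
    times: List[float] = []
    prev_index = None  # index of the most recent non-zero sample
    for i, value in enumerate(signal):
        if value == 0.0:
            # Assumption: exact zeros are skipped and bridged by interpolation.
            continue
        if prev_index is not None and signal[prev_index] * value < 0.0:
            a = signal[prev_index]
            # Linear interpolation between the two samples of opposite sign.
            frac = a / (a - value)
            times.append((prev_index + frac * (i - prev_index)) / sample_rate)
        prev_index = i
    return times
```

A signal that touches zero without changing sign produces no crossing under this sketch, which avoids counting the same crossing twice when a sample happens to land exactly on the 0 level.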
[0021]
Next, unit sections are determined according to the main zero crossings detected in step S11 (step S12). A unit section is the basic section used for extracting frequency components and creating code data, and corresponds to the unit section U in FIG. 1. A unit section is set as the section from a starting main zero crossing t1 to an end-point main zero crossing t2. As the starting main zero crossing t1, the earliest main zero crossing in the part of the signal not yet assigned to a unit section is selected. The end-point main zero crossing t2 is selected as follows.
[0022]
As shown in FIG. 5(a), the waveform of the original acoustic signal completes one period at the zero crossing located two crossings after the starting zero crossing. It is therefore preferable to take as the end point a position reached by moving an integer multiple of this number of zero crossings. In addition, to obtain a stable frequency value, it is preferable to place the end point as far from the start point as possible and to use the resulting average period as the period of the unit section. To this end, the distance from the starting main zero crossing t1 to the main zero crossing two crossings later is first set as the unit period T_B. As the end-point main zero crossing t2, the one that maximizes t2 − t1 is selected from among the main zero crossings separated from the starting main zero crossing t1 by an integer multiple of 2, within the range satisfying [Formula 2] below.
[0023]
[Formula 2]
$$|N_A - N_B| \le 1$$
where
$$N = 40 \log_{10}\!\left(\frac{1}{440\,T}\right) + 69$$
$$T = \frac{2\,(t_2 - t_1)}{k}$$
[0024]
In [Formula 2], N_A is the note number determined by the average period T_A between t1 and t2, and N_B is the note number determined by the unit period T_B. k is the number of main zero crossings traversed from the main zero crossing serving as the start point to the main zero crossing serving as the end point, and is an integer multiple of two. The second expression of [Formula 2] obtains the note number N from a period T. The third expression gives the average period T_A, and gives the unit period T_B when k = 2. Under [Formula 2], the main zero crossing t2 serving as the end point is provisionally set at the crossing located four, six, eight, ... crossings after the starting main zero crossing t1, and the condition is evaluated for each. In short, step S12 selects as the end point the main zero crossing t2 for which the frequencies determined by the unit period T_B and by the average period T_A, when converted into note numbers, differ by no more than a semitone (a note number difference of ±1). Unit sections are set over the entire acoustic signal. Each subsequent unit section is processed in the same manner, taking as its start point the main zero crossing that follows the end point of the preceding unit section.
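The end-point selection of step S12 can be sketched as follows. This is a minimal illustration assuming strictly increasing crossing times in seconds; the function names and the choice to stop at the first candidate that violates the semitone condition are assumptions, since the text only states that the farthest end point satisfying [Formula 2] is selected.

```python
import math

def note_number(period):
    """Second expression of [Formula 2]: note number from a period in seconds."""
    return 40.0 * math.log10(1.0 / (440.0 * period)) + 69.0

def select_end_point(crossings, start):
    """Return the index of the end crossing t2 for the section starting at crossings[start].

    Returns None when fewer than two further crossings remain.
    """
    if start + 2 >= len(crossings):
        return None  # not enough crossings left for even one cycle
    t1 = crossings[start]
    n_b = note_number(crossings[start + 2] - t1)  # N_B from the unit period T_B (k = 2)
    end, k = start + 2, 4
    while start + k < len(crossings):
        t_a = (crossings[start + k] - t1) * 2.0 / k  # average period T_A
        if abs(note_number(t_a) - n_b) > 1.0:
            break  # assumption: stop extending at the first violation
        end = start + k
        k += 2
    return end
```

With evenly spaced crossings the section extends to the last available even-numbered crossing; when the spacing changes by more than a semitone, the section ends before the change.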
[0025]
When all the unit sections have been set, the main frequency component is calculated for each unit section (step S13). A frequency component here means a period (frequency) and an amplitude. The period is the average period T_A of the unit section, and the amplitude is the value A at which the absolute value of the signal within the unit section is maximum. The main frequency component is the frequency component extracted from the original, uncorrected sound signal, and is distinguished from the sub-frequency components calculated in step S16.
[0026]
The processes in steps S12 and S13 described above, that is, the process of determining the unit section using the zero crossing and calculating the frequency component corresponding to the unit section are the same as those conventionally performed. However, in this case, only one frequency component can be extracted for one unit section. In the present invention, in order to extract a plurality of frequency components from one unit section, the processing in step S14 and subsequent steps as described later is performed.
[0027]
Once the main frequency component has been calculated in step S13, the DC component is removed from the acoustic signal in the unit section (step S14). The DC component is not removed uniformly over the whole unit section. For the portion where the acoustic signal is 0 or more, the average intensity of that portion is subtracted from the signal intensity at each time. For the portion where the acoustic signal is 0 or less, the average intensity of that portion is likewise subtracted with its sign kept as it is (since this average is negative, the signal value is in effect increased by its magnitude). As a result, a corrected acoustic signal in which large-amplitude portions are suppressed is obtained. FIG. 5D shows a corrected sound signal obtained by correcting the original sound signal shown in FIG.
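The sign-wise DC removal of step S14 can be sketched as below. This is only an illustration: the function name is hypothetical, and samples equal to 0 are assigned to the non-negative portion, a choice the text leaves open.

```python
def remove_dc_by_sign(section):
    """Subtract the mean of the non-negative part and of the negative part separately.

    Assumption: samples equal to 0 belong to the non-negative portion.
    """
    pos = [x for x in section if x >= 0]
    neg = [x for x in section if x < 0]
    pos_mean = sum(pos) / len(pos) if pos else 0.0
    neg_mean = sum(neg) / len(neg) if neg else 0.0  # negative value, so subtracting it raises the signal
    return [x - (pos_mean if x >= 0 else neg_mean) for x in section]
```

For the section `[2.0, 4.0, -1.0, -3.0]` the positive mean is 3 and the negative mean is -2, so the corrected section is `[-1.0, 1.0, 1.0, -1.0]`: the large swings are flattened and additional zero crossings appear.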
[0028]
Subsequently, a sub-zero intersection where the signal crosses the 0 level is detected with respect to the corrected acoustic signal (step S15). This is performed in exactly the same manner as in step S11 in which the main zero crossing point is detected for the original sound signal. However, as shown in FIG. 5D, in the corrected acoustic signal, the number of zero-crossing points is clearly increased as compared with the original sound signal, so that many sub-zero crossing points are detected. FIG. 5E shows the sub-zero crossing detected from the corrected acoustic signal shown in FIG.
[0029]
Subsequently, sub-frequency components are calculated according to the corrected acoustic signal and the detected sub-zero intersection (step S16). Of the sub-frequency components, the period is given as an average period determined by the sub-zero crossing t3 as the start point and the sub-zero crossing t4 as the end point in the unit section. In addition, among the sub-frequency components, the amplitude is set to a value that maximizes the absolute value of the corrected acoustic signal within the unit interval.
[0030]
The processes in steps S14 to S16 are repeated a predetermined number of times, and as many sub-frequency components as there are repetitions are extracted. In this way, intensities corresponding to a plurality of frequencies can be detected for each unit section, and by processing the entire original sound signal, sets of frequency and intensity can be calculated for every unit section that has been set.
[0031]
Taking the beginning of each unit section as the start time and its end as the end time, a set of start time, end time, frequency, and intensity is obtained for each unit section. This set is referred to as phoneme data in this specification. In step S1, phoneme data whose intensity does not reach a predetermined value is also deleted. The phoneme data remaining as a result is shown in FIG.
[0032]
In FIG. 6, each piece of phoneme data is drawn as a downward-pointing triangle, with the frequency shown as the pitch of the sound and the intensity as the strength of the sound. The pitch is represented by the vertical position of the upper side of the triangle, and the strength by the height of the triangle. The length of the unit section is represented by the length of the upper side of the triangle. About 16 pieces of phoneme data are preferably selected for each unit section in the frequency analysis of step S1. To keep the drawing simple, however, the example of FIG. 6 shows three for each unit section.
[0033]
When phoneme data has been selected for each unit section, block phonemes are created from the selected phoneme data (step S2). Block phonemes are created by concatenating, along the time axis, phoneme data having similar note numbers and velocities. The note number of a block phoneme is the average over its constituent phoneme data, and its velocity is their maximum. About 16 pieces of phoneme data are usually extracted per unit section by frequency analysis, in which case a plurality of block phonemes overlapping on the time axis are created. Because this complicates attaching attribute information at a later stage, temporally overlapping block phonemes are represented by the one with the highest velocity and integrated into one. A specific method, described in Japanese Patent Application No. 2000-351775, is as follows. For each unit section, the unified frequency Nr(t) and the unified intensity Vr(t) are calculated by [Formula 3] below.
[0034]
[Formula 3]
$$Nr(t) = \frac{\sum \{\, n \times Vn(t) \,\}}{\sum Vn(t)}$$
$$Vr(t) = \left[ \sum \{ Vn(t) \}^{2} \right]^{1/2}$$
[0035]
In [Formula 3], n is a note number and Vn(t) is the velocity corresponding to note number n. The sums Σ are taken over the phoneme data; when 16 pieces of phoneme data have been selected, each sum has 16 terms. By [Formula 3], the unified frequency Nr(t) is the frequency of the center of gravity when the phoneme data are plotted on a graph with a frequency axis and an intensity axis, and the unified intensity Vr(t) is the root mean square value of the intensity distribution. In this way, a block phoneme having a unified frequency and a unified intensity is obtained for each unit section. Further, when the unified frequencies of adjacent block phonemes are similar, the block phonemes are integrated across a plurality of unit sections into a new block phoneme. Unified frequencies are regarded as similar when their difference falls within a preset range. The frequency and intensity of the integrated block phoneme are the averages of the frequencies and intensities of the block phonemes before integration. FIG. 7 shows the block phonemes obtained by the process of step S2 from the phoneme data shown in FIG. In the example of FIG. 7, the pitch and strength of the sound are not depicted.
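As a worked illustration of [Formula 3], the following sketch computes the unified frequency and unified intensity for one unit section. The representation of phoneme data as `(note_number, velocity)` pairs and the function name `unify` are assumptions made for this example. Note that the second expression, as written, takes the square root of the sum of squares without dividing by the number of terms, and the sketch follows the formula literally.

```python
import math

def unify(phonemes):
    """Apply [Formula 3] to a list of (note_number, velocity) pairs of one unit section."""
    total_v = sum(v for _, v in phonemes)
    if total_v == 0:
        raise ValueError("unified frequency is undefined when all velocities are 0")
    nr = sum(n * v for n, v in phonemes) / total_v   # velocity-weighted center of gravity
    vr = math.sqrt(sum(v * v for _, v in phonemes))  # square root of the sum of squares
    return nr, vr
```

For two phonemes with note numbers 60 and 64 and velocities 3 and 4, Nr = (180 + 256) / 7 ≈ 62.29 and Vr = 5.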
[0036]
Apart from the processing in steps S1 and S2, the fundamental period is calculated for the input acoustic signal (step S3). This is done by autocorrelation analysis. Specifically, for a given acoustic signal g(t), the period T_n is varied and the period T_n at which the correlation value R(T_n) of [Formula 4] is maximum is obtained.
[0037]
[Formula 4]
$$R(T_n) = \sum_{t} g(t) \times g(t + T_n)$$
[0038]
128 periods T_n are provided, one for each note number (n = 0 to 127). FIG. 8 shows the input acoustic signal g(t) and the acoustic signals g(t + T_n) shifted by the period T_n. FIG. 8A shows the input acoustic signal g(t), FIG. 8B shows the acoustic signal g(t + T_127), FIG. 8C shows the acoustic signal g(t + T_126), and FIG. 8D shows the acoustic signal g(t + T_n). The correlation value R(T_n) between each such acoustic signal g(t + T_n) and the original acoustic signal g(t) is calculated, and the T_n at which R(T_n) is maximum is regarded as the period of the input acoustic signal g(t).
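The search over the 128 candidate periods can be sketched as follows. This is a minimal illustration, not the specification's implementation: the mapping from note number to period is assumed to be the inverse of the second expression of [Formula 2], the shift is rounded to a whole number of samples, and the names `note_period` and `fundamental_period` are hypothetical.

```python
import math

def note_period(n):
    """Period in seconds for note number n, inverting N = 40*log10(1/(440*T)) + 69."""
    return 1.0 / (440.0 * 10.0 ** ((n - 69) / 40.0))

def fundamental_period(g, sample_rate, note_numbers=range(128)):
    """Return (n, T_n) maximizing R(T_n) = sum over t of g(t) * g(t + T_n)."""
    best_n, best_r = None, -math.inf
    for n in note_numbers:
        lag = round(note_period(n) * sample_rate)  # T_n expressed in samples
        if lag < 1 or lag >= len(g):
            continue  # this shift cannot be represented for the given signal
        r = sum(g[t] * g[t + lag] for t in range(len(g) - lag))
        if r > best_r:
            best_n, best_r = n, r
    if best_n is None:
        raise ValueError("signal too short for every candidate period")
    return best_n, note_period(best_n)
```

Because the sum is not normalized, longer shifts contribute fewer terms, which in this sketch tends to favor the shortest of several equally well matching periods.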
[0039]
Subsequently, the block phonemes obtained in step S2 are classified based on the period T_n obtained in step S3 (step S4). Specifically, the note numbers of block phonemes separated from each other by the period T_n are compared. When the difference between the note numbers is within a predetermined range, both block phonemes are classified into the same channel. The number of channels depends on the number of block phonemes and on the range of note number difference regarded as the same, but the block phonemes are divided into more channels than the final output data has. For example, the block phonemes shown in FIG. 7 are classified into eight channels as shown in FIG.
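The channel classification of step S4 can be sketched as below. The data layout (dictionaries with `start` and `note` keys), the tolerances `note_tol` and `time_tol`, and the use of the first block phoneme in each channel as the reference are all assumptions for illustration; the text specifies only that block phonemes separated by the period and similar in note number share a channel.

```python
def classify(blocks, period, note_tol=1.0, time_tol=0.05):
    """Group block phonemes whose start times differ by about a whole number of periods.

    blocks: list of dicts with 'start' (seconds) and 'note' (note number).
    time_tol is a fraction of the period; both tolerances are hypothetical.
    """
    channels = []
    for blk in sorted(blocks, key=lambda b: b["start"]):
        for channel in channels:
            ref = channel[0]
            dt = blk["start"] - ref["start"]
            cycles = round(dt / period)
            if (cycles >= 1
                    and abs(dt - cycles * period) <= time_tol * period
                    and abs(blk["note"] - ref["note"]) <= note_tol):
                channel.append(blk)
                break
        else:
            channels.append([blk])  # no matching channel: open a new one
    return channels
```

A block phoneme that never recurs at the fundamental period ends up alone in its own channel, which makes non-periodic noise easy to spot.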
[0040]
Next, a heart sound attribute is given to the block phonemes of each classified channel (step S5). Specifically, a database on heart sounds is prepared in advance, the block phoneme patterns stored in this database are compared with the block phoneme pattern of each classified channel, and when they are judged to be similar, the heart sound attribute registered in the database is assigned to the channel. A block phoneme pattern is the arrangement pattern of the phoneme data from which each block phoneme was created. Specifically, each block phoneme is given one of three types of attribute information: I sound, II sound, or other. This determination is made based on the following four rules.
[0041]
Pitch: II sound is higher than I sound. (II sound> I sound)
Sound length: The length of both the I and II sounds is fixed short, and the II sound is shorter than the I sound. (I sound and II sound <length threshold and I sound> II sound)
Sound intensity: The I and II sounds are stronger than the other components, and the balance between the I and II sounds changes depending on the position of the listening sensor. As an exception, reflux heart murmurs may be stronger than I and II sounds. (I and II sounds> strength threshold)
Sound interval: The interval between the I and II sounds is shorter than the interval between the II and I sounds. As the heart rate increases, the interval between the II and I sounds decreases while the interval between the I and II sounds does not change; that is, the II-I interval approaches the I-II interval. (I-II interval < II-I interval)
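Two of the four rules above, the interval rule and the pitch rule, can be illustrated with a small sketch. It assumes that two strong, short sounds within one heartbeat cycle have already been picked out (the length and intensity rules), and it uses a hypothetical function name and data layout; it is not the database comparison described in the text.

```python
def label_i_and_ii(first, second, period):
    """Label two sounds of one cycle as I and II using the interval rule.

    first, second: dicts with 'start' (seconds) and 'note'; first is earlier.
    Returns (labels, pitch_ok), where pitch_ok reports whether the pitch rule
    (II sound higher than I sound) agrees with the interval-based labeling.
    """
    gap = second["start"] - first["start"]
    if gap < period - gap:  # I-II interval is shorter than II-I interval
        labels = ("I", "II")
        i_sound, ii_sound = first, second
    else:
        labels = ("II", "I")
        i_sound, ii_sound = second, first
    pitch_ok = ii_sound["note"] > i_sound["note"]
    return labels, pitch_ok
```

When the two rules disagree, the sketch only reports the conflict through `pitch_ok`; how such a case should be resolved is not stated in the text.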
[0042]
As a result, the block phonemes shown in FIG. 9 are given the attribute information shown in FIG. 10 for each channel. Subsequently, more detailed attribute information is assigned to each channel by referring to a database in which more detailed information is recorded. Based on the arrangement pattern of the phoneme data from which each block phoneme was created, this database is consulted to determine a detailed classification for block phonemes given the I sound or II sound attribute, and a specific attribute for block phonemes given the "other" attribute, such as III sound, IV sound (or the superposed sound of III and IV), heart murmur (with subcategories such as systolic or diastolic, continuous, regurgitant or ejection), click sound, or friction sound; the corresponding detailed attribute information is then given. In particular, the I sound can be identified as a mitral valve component, a tricuspid valve component, or an aortic valve component, and the II sound as an aortic valve component or a pulmonary valve component. FIG. 11 shows block phonemes to which detailed attribute information is assigned for each channel.
[0043]
In step S5, the heart sound attribute of the data of each channel can be identified automatically, but it can also be specified by a person instead. In that case, the block phonemes classified into a plurality of channels as shown in FIG. 9 are displayed on a screen or the like. FIG. 9 shows only the blocks, but in practice the phoneme data as shown in FIG. 6 is displayed overlaid on them. Such a display makes it possible to check both the state of the phoneme data constituting each block phoneme and its period at the same time.
[0044]
A person who can determine heart sound attributes from a heart sound waveform, such as a doctor, assigns a heart sound attribute to each channel while looking at these frequency elements. As shown in FIG. 9, unlike a conventional heart sound waveform, block phonemes that recur at the predetermined cycle interval appear in the same channel, so a more accurate determination is possible. For example, a block phoneme that looks like a heart sound component when viewed alone can be judged to be noise and excluded if it does not appear periodically.
[0045]
Subsequently, the block phonemes to which the attribute information has been added are written out as a structured document (step S6). FIG. 12 shows an example of source code when the XML (eXtensible Markup Language) standard is adopted for the structured document. The example in FIG. 12 is a structured document covering one heart sound cycle from the beginning of the block phonemes shown in FIG. 11, and consists of 37 lines. The block phonemes in FIG. 12 are described in conformity with MIDI. XML basically takes a format in which actual data is enclosed by a pair of tags. In FIG. 12, the start and end of the document are defined by the pair of tags on the first line and the last (37th) line. The pair of tags on the second line defines the unit of time (a resolution per second) for the sound start time (<StartTime> tag) and sound end time (<EndTime> tag) of the MIDI-compliant event data (<Event> tag) described between the start and end of the document. The <HeartCycle> tag on the third line and the end tag </HeartCycle> on the 36th line indicate that the data between them describes one cycle of the heart sound. The 4th to 14th lines describe the I sound of the heart sound, and the <FirstSound> tags on the 4th and 14th lines are the pair indicating the I sound. The I sound is further classified into three detailed attributes: the 5th to 7th lines describe the mitral valve component M1, the 8th to 10th lines the tricuspid valve component T1, and the 11th to 13th lines the aortic valve component. Each classification has the same configuration. For example, for the mitral valve component M1, the sixth and seventh lines describe the unified code data according to the MIDI standard.
For example, <StartTime> and <EndTime> on the sixth line express, as absolute times, what the MIDI standard expresses as a relative time called delta time, and indicate that the sound is generated from time "10" to time "20". <Pitch> on the seventh line corresponds to the MIDI note number and indicates that the sound is generated at the pitch corresponding to note number "32". <Level> on the seventh line corresponds to the MIDI velocity and indicates that the sound is generated at the intensity corresponding to velocity "60".
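A structured document of this kind could be generated along the following lines. This is a hedged sketch using Python's standard `xml.etree.ElementTree`: the tags `<HeartCycle>`, `<FirstSound>`, `<Event>`, `<StartTime>`, `<EndTime>`, `<Pitch>`, and `<Level>` are named in the text, but the root tag `HeartSound`, the use of the component name (for example `M1`) as a tag, and the exact nesting are assumptions, because FIG. 12 itself is not reproduced here. The time unit tag of the second line is omitted for the same reason.

```python
import xml.etree.ElementTree as ET

def build_document(events):
    """Serialize I-sound events of one heart cycle as XML text.

    events: list of dicts with keys 'component', 'StartTime', 'EndTime',
    'Pitch', and 'Level'. Root tag and component tags are hypothetical.
    """
    root = ET.Element("HeartSound")             # hypothetical root tag
    cycle = ET.SubElement(root, "HeartCycle")
    first = ET.SubElement(cycle, "FirstSound")
    for ev in events:
        comp = ET.SubElement(first, ev["component"])  # e.g. "M1" (assumed tag name)
        event = ET.SubElement(comp, "Event")
        for tag in ("StartTime", "EndTime", "Pitch", "Level"):
            ET.SubElement(event, tag).text = str(ev[tag])
    return ET.tostring(root, encoding="unicode")
```

Calling it with the values given in the text for the mitral valve component (start 10, end 20, note number 32, velocity 60) yields a document in which those numbers appear as the text of the corresponding tags.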
[0046]
When the structured document is created in XML format, it can be browsed with an Internet browser, and the heart sounds can be reproduced as an acoustic signal using dedicated player software and a MIDI sound source. Specifically, the created structured document is registered on a WWW server, and the user starts a browser on his or her personal computer and accesses the WWW server via the Internet to obtain the XML document. MIDI sequencer software plugged into the browser then reproduces the sound signal by controlling the MIDI sound source according to the MIDI data recorded in the XML document. Encoding heart sounds and recording them in XML format in this way is convenient for distribution via the Internet and well suited to reproduction with a MIDI sound source. It is also effective from the viewpoint of consistency with medical chart information, which in recent years has come to be digitized as electronic medical charts in XML format.
[0047]
Storing MIDI data with heart sound attributes as an electronic medical record not only allows the heart sounds to be reproduced with a MIDI sound source, as described above, but also has the effect that quantitative pathological findings needed for diagnosis can be read from the document, as follows. For example, the I and II sounds can show pathological conditions such as enhancement, attenuation, and splitting, and these can be judged from the numerical values of the MIDI event data enclosed by the attribute tags. That is, when the velocity (intensity) enclosed by the <Level> tag or the note number (pitch) enclosed by the <Pitch> tag is higher than a predetermined value, the sound is enhanced, and when it is lower than a predetermined value, the sound is attenuated. In the example of the I sound and the II sound in FIG. 12, a sound is composed of two events (notes represented by triangles) that are close together on the time axis. If the difference between the <EndTime> value of the earlier event and the <StartTime> value of the later event is equal to or greater than a predetermined value, the sound can be judged to be split. In the clinical diagnosis of heart murmurs, the Levine classification is also used (grades 1 to 6, where grade 1 is a faint murmur that is difficult to hear by auscultation and cannot be identified without a phonocardiogram, and grade 6 is a loud murmur that can be heard without applying the stethoscope, merely by being nearby), and this too can be determined from the velocity (intensity) enclosed by the <Level> tag.
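The readings described above can be illustrated with a short sketch that parses such a document with Python's standard `xml.etree.ElementTree`. The thresholds `min_gap`, `low`, and `high` stand in for the "predetermined values" of the text and are hypothetical, as are the function names; the sketch assumes every `<Event>` contains integer `<StartTime>` and `<EndTime>` values.

```python
import xml.etree.ElementTree as ET

def is_split(sound_element, min_gap):
    """True if two consecutive events of a sound are separated by at least min_gap."""
    events = sorted(sound_element.iter("Event"),
                    key=lambda e: int(e.findtext("StartTime")))
    for prev, nxt in zip(events, events[1:]):
        gap = int(nxt.findtext("StartTime")) - int(prev.findtext("EndTime"))
        if gap >= min_gap:
            return True
    return False

def grade_level(level, low, high):
    """Classify a <Level> (velocity) value against two hypothetical thresholds."""
    if level > high:
        return "enhanced"
    if level < low:
        return "attenuated"
    return "normal"
```

For a sound whose two events span times 10 to 20 and 25 to 35, the gap is 5, so the sound is reported as split for `min_gap=3` but not for `min_gap=10`.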
[0048]
Next, an apparatus for realizing the above-described time-series signal analysis method according to the present invention will be described. FIG. 13 is a functional block diagram illustrating an embodiment of the time-series signal analyzing apparatus. In FIG. 13, 1 is a time-series signal acquisition means, 2 is a frequency analysis means, 3 is a block phoneme creation means, 4 is a basic period calculation means, 5 is a block phoneme classification means, 6 is an attribute information giving means, 7 is an attribute database, 8 is a detailed attribute database, 9 is an output means, and 10 is a structured document creation means. In practice, the time-series signal analyzing apparatus shown in FIG. 13 is realized by connecting audio-related peripheral devices to a computer and installing a dedicated program.
[0049]
The time-series signal acquisition means 1 acquires a time-series signal; for a heart sound signal, it is realized by a sampling device that attaches a microphone to a stethoscope and converts the acquired heart sound into a digital signal such as PCM. The frequency analysis means 2 has a function of performing frequency analysis of the acquired time-series signal to create phoneme data; specifically, it executes the process of step S1 in FIG. 2 according to a computer program. The block phoneme creation means 3 has a function of creating block phonemes based on the phoneme data obtained by the frequency analysis means 2; specifically, it executes the process of step S2 in FIG. 2 according to a computer program. The basic period calculation means 4 has a function of calculating the basic period by performing autocorrelation analysis of the time-series signal; specifically, it executes the process of step S3 in FIG. 2 according to a computer program. The block phoneme classification means 5 has a function of classifying block phonemes into a plurality of groups based on the block phonemes created by the block phoneme creation means 3 and the basic period calculated by the basic period calculation means 4; specifically, it executes the process of step S4 in FIG. 2 according to a computer program. The attribute information giving means 6 gives heart sound attribute information, according to preset rules, to each channel into which the block phonemes have been classified, and further gives detailed attribute information extracted from the detailed attribute database 8 based on the arrangement pattern of the phoneme data from which the block phonemes were created; specifically, it executes the process of step S5 in FIG. 2 according to a computer program.
The attribute database 7 records the correspondence between phoneme data arrangement patterns and each heart sound attribute in order to classify sounds as the I sound, the II sound, or others. The detailed attribute database 8 is used to identify the I sound and the II sound in more detail: for the I sound, it records the phoneme data arrangement patterns corresponding to the mitral valve component, the tricuspid valve component, and the aortic valve component; for the II sound, those corresponding to the aortic valve component and the pulmonary valve component; and for heart sounds classified as "others", the phoneme data arrangement patterns for identifying the III sound, the IV sound (or the superposed sound of III and IV), heart murmurs (with subcategories such as systolic or diastolic, continuous, regurgitant or ejection), click sounds, and friction sounds. The output means 9 has a function of displaying and printing the classified block phonemes. The output block phonemes are useful not when the heart sound attribute is assigned automatically in step S5 of FIG. 2 but when medical personnel assign it: by looking at the block phonemes classified for each channel, they can assign a heart sound attribute that takes the basic period into account. Specifically, the output means 9 is realized by a display device such as a CRT or liquid crystal display, or by a printer of any monochrome or color printing method. The structured document creation means 10 has a function of converting block phonemes to which heart sound attribute information has been given into text information such as the XML standard. Specifically, the processing of step S6 in FIG. is executed according to a computer program.
[0050]
The preferred embodiments of the present invention have been described above, but the present invention is not limited to these embodiments, and various modifications can be made. The above embodiment describes the case where heart sounds are analyzed as the time-series signal. However, the present invention can also be applied to the analysis of other bioacoustic signals such as respiratory sounds, and of non-acoustic biological time-series signals such as an electrocardiogram. It is also possible to analyze an acoustic signal as a time-series signal and apply the analysis to failure diagnosis of an automobile engine.
[0051]
[Effect of the invention]
As described above, according to the present invention, frequency analysis is performed on a given time-series signal for each predetermined unit section set for it, creating for each unit section a plurality of phoneme data each composed of a set of frequency and intensity; from the created phoneme data, those satisfying a predetermined intensity are extracted; based on the similarity of the start time, end time, and frequency of the extracted phoneme data, block phonemes consisting of start time, end time, frequency, and intensity parameters are created; the basic period of the time-series signal is calculated by performing autocorrelation analysis on it; and, among the block phonemes, those that have similar frequencies and whose start time difference is close to an integral multiple of the calculated basic period are classified into the same group. Consequently, for each classified group, heart sound attributes can be given automatically by using a knowledge database in which heart sound waveforms are registered in association with heart sound attributes, including the characteristics and period of each block.
[Brief description of the drawings]
FIG. 1 is a diagram showing a basic principle of signal analysis in a time-series signal analysis apparatus of the present invention.
FIG. 2 is a flowchart showing an outline of analysis of a time-series signal according to the present invention.
FIG. 3 is a diagram illustrating an example of a heart sound waveform captured as a time-series signal.
FIG. 4 is a flowchart showing details of step S1 in FIG.
FIG. 5 is a diagram for explaining setting of unit sections and determination of period / amplitude in an acoustic signal.
FIG. 6 is a diagram showing phoneme data generated by frequency analysis from the heart sound waveform shown in FIG. 3.
FIG. 7 is a diagram showing block phonemes calculated based on the phoneme data shown in FIG. 6.
FIG. 8 is a diagram showing the input acoustic signal g(t) and the acoustic signals g(t + T_n) shifted by the period T_n.
FIG. 9 is a diagram illustrating a state in which the block phonemes illustrated in FIG. 7 are classified based on a basic period.
FIG. 10 is a diagram showing a state in which attribute information is given to each block phoneme shown in FIG. 9.
FIG. 11 is a diagram showing a state in which more detailed attribute information is given to each block phoneme shown in FIG.
FIG. 12 is a diagram illustrating a source code when the XML standard is adopted as a structured document.
FIG. 13 is a functional block diagram showing an embodiment of a time-series signal analyzing apparatus according to the present invention.
[Explanation of symbols]
1 ... Time-series signal acquisition means
2 ... Frequency analysis means
3 ... Block phoneme creation means
4 ... Basic period calculation means
5 ... Block phoneme classification means
6 ... Attribute information giving means
7 ... attribute database
8 ... Detailed attribute database
9 ... Output means
10 ... Structured document creation means

Claims

A time-series signal analyzing apparatus comprising:
frequency analysis means for generating, for each predetermined unit section set for a given time-series signal, a plurality of phoneme data each composed of a set of frequency and intensity;
block phoneme creation means for extracting, from the plurality of generated phoneme data, those that satisfy a predetermined intensity, and creating block phonemes composed of start time, end time, frequency, and intensity parameters based on the similarity of the start time, end time, and frequency of the extracted phoneme data;
basic period calculating means for calculating a basic period of the time-series signal by performing autocorrelation analysis on the time-series signal;
block phoneme classifying means for classifying into the same group those of the plurality of block phonemes whose frequencies are similar and whose start time difference is an integer multiple of the calculated basic period;
attribute information giving means for giving attribute information defined in a predetermined database to each of the grouped block phonemes; and
output means for outputting the block phonemes to which the attribute information has been given,
wherein the attribute information giving means gives attribute information to a block phoneme that matches the database, and gives the same attribute information as that given to this block phoneme to all block phonemes belonging to the same group as this block phoneme.

A time-series signal analyzing apparatus comprising:
frequency analysis means for generating, for each predetermined unit section set for a given time-series signal, a plurality of phoneme data each composed of a set of frequency and intensity;
block phoneme creation means for extracting, from the plurality of generated phoneme data, those that satisfy a predetermined intensity, and creating block phonemes composed of start time, end time, frequency, and intensity parameters based on the similarity of the start time, end time, and frequency of the extracted phoneme data;
basic period calculating means for calculating a basic period of the time-series signal by performing autocorrelation analysis on the time-series signal;
block phoneme classifying means for classifying into the same group those of the plurality of block phonemes whose frequencies are similar and whose start time difference is an integer multiple of the calculated basic period;
attribute information giving means for giving attribute information defined in a predetermined database to each of the grouped block phonemes; and
structured document creating means for outputting the block phonemes to which the attribute information has been given as hierarchically structured text information,
wherein the attribute information giving means gives attribute information to a block phoneme that matches the database, and gives the same attribute information as that given to this block phoneme to all block phonemes belonging to the same group as this block phoneme.

The time-series signal analyzing apparatus according to claim 2, wherein the text information output by the structured document creating means is described in conformity with the format of the XML standard, and the block phonemes contained therein are described in a MIDI format having delta time information corresponding to the section, note number information corresponding to the frequency, and velocity information corresponding to the intensity.

The time-series signal analysis device according to any one of Items 3 to 4, further comprising a detailed attribute database that records the correspondence between phoneme data arrangement patterns and detailed attribute information,
wherein the attribute information giving means refers to the detailed attribute database and gives detailed attribute information corresponding to the arrangement pattern of the phoneme data from which the block phoneme was created.

The time-series signal analyzing apparatus according to claim 4, wherein the time-series signal is a heart sound signal, the basic period is a heartbeat period, the attribute information is the I sound component, the II sound component, and others of the heart sound, and in the detailed attribute database, M1, T1, and A1 as I sound components, A2 and P2 as II sound components, III sound components, IV sound components, and other abnormal heart sound components are associated with phoneme data arrangement patterns.

The time-series signal analyzing apparatus according to claim 1, wherein the basic period calculation means obtains, while changing a period, a correlation value between the time-series signal and the time-series signal shifted by that period, and calculates the period at which the correlation value is maximum as the basic period of the time-series signal.