JP5318042B2

JP5318042B2 - Signal analysis apparatus, signal analysis method, and signal analysis program

Info

Publication number: JP5318042B2
Application number: JP2010159604A
Authority: JP
Inventors: 康智大石; 弘和亀岡; 大地持橋; 秀尚永野; 邦夫柏野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-07-14
Filing date: 2010-07-14
Publication date: 2013-10-16
Anticipated expiration: 2030-07-14
Also published as: JP2012022128A

Abstract

PROBLEM TO BE SOLVED: To provide a signal analysis apparatus that extracts dynamic characteristics of a time-series signal.

SOLUTION: An observation signal is expressed as the sum of a residual signal and the output signal of a signal generation system, where the output signal is obtained by convolving an input signal with an impulse response signal representing a filter characteristic. The signal analysis apparatus extracts dynamic characteristics of the time-series signal by estimating an input signal parameter, a filter characteristic parameter, and a residual signal parameter from the observation signal. It comprises: a parameter initial value generation unit that generates initial values of the parameters; a signal separation unit that separates the observation signal; a model parameter updating unit that updates a model parameter so as to maximize an objective function with respect to the model parameter; a parameter convergence determination unit that causes the signal separation unit and the model parameter updating unit to repeat their processing until the model parameter satisfies a predetermined criterion; and a parameter output unit that outputs the model parameter once it is judged to satisfy the predetermined criterion.

COPYRIGHT: (C)2012, JPO&INPIT

Description

The present invention relates to a signal analysis apparatus, a signal analysis method, and a signal analysis program that extract dynamic characteristic features of a time-series signal.

The prior art is described using, as an example, the fundamental frequency sequence extracted from a singing-voice acoustic signal. In the fundamental frequency sequence of a singing voice, the sequence of target pitches that the singer intends to sing is superimposed in a complex manner with various dynamic fluctuation components (such as overshoot and vibrato) that depend on singing skill, singing style, individuality, and emotion. The singing voice is one of the important elements that characterize many genres of music, and many studies focusing on its fundamental frequency sequence are actively under way.

If the target pitch sequence contained in the fundamental frequency sequence can be extracted as a feature, applications to query-by-humming retrieval and automatic music transcription can be expected. In query-by-humming retrieval in particular, which searches for a piece of music from a sung query, the target pitch sequence intended by the singer must be estimated correctly from the fundamental frequency sequence of the sung voice and matched against the melodies in a music database.

On the other hand, dynamic fluctuation components of the fundamental frequency sequence, such as overshoot and vibrato, are known to affect the perception of a voice as singing and the perception of the singer's individuality. They can therefore serve as useful measures for describing singing style, for similar-singing-voice retrieval based on such descriptions, and for automatic evaluation of singing skill. They are also indispensable for more expressive and more varied singing voice synthesis. For these reasons, prior studies have proposed models that control the dynamic fluctuation components of the fundamental frequency of a singing voice using a linear second-order system (see, for example, Non-Patent Documents 1, 2, and 3).

These studies drew on the Fujisaki model, which represents the fundamental frequency pattern of spoken Japanese. The Fujisaki model uses the impulse response and step response of a critically damped second-order system to represent a phrase component, which declines gradually from the beginning of a phrase to its end, and an accent component, which rises and falls sharply in correspondence with words; the fundamental frequency sequence is described by superimposing the two. However, a critically damped system cannot represent the rapid rises and falls of the fundamental frequency that accompany a sung melody, nor periodic oscillations such as vibrato. For this reason, fundamental frequency control models for the singing voice adjust the damping ratio ζ in the transfer function of the second-order system

$$H(s) = \frac{\Omega^2}{s^2 + 2\zeta\Omega s + \Omega^2} \qquad (1)$$

where Ω is the natural frequency, to represent a variety of oscillation phenomena: exponential decay (ζ > 1), damped oscillation (0 < ζ < 1, corresponding to overshoot), critical damping (ζ = 1), and steady oscillation (ζ = 0, corresponding to vibrato).
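The four regimes named above follow from the poles of the second-order transfer function. The following is a minimal sketch, not part of the patent: the function names `second_order_poles` and `damping_regime` are hypothetical, and the code only assumes the standard second-order form with damping ratio ζ and natural frequency Ω.

```python
import cmath


def second_order_poles(zeta, omega):
    """Poles of H(s) = omega^2 / (s^2 + 2*zeta*omega*s + omega^2)."""
    root = cmath.sqrt(zeta * zeta - 1.0)  # imaginary when zeta < 1
    return omega * (-zeta + root), omega * (-zeta - root)


def damping_regime(zeta):
    """Name the oscillation regime selected by the damping ratio zeta."""
    if zeta < 0.0:
        raise ValueError("zeta must be non-negative")
    if zeta == 0.0:
        return "steady oscillation"   # vibrato-like, poles on the imaginary axis
    if zeta < 1.0:
        return "damped oscillation"   # overshoot, complex-conjugate poles
    if zeta == 1.0:
        return "critical damping"     # repeated real pole
    return "exponential decay"        # two distinct real poles
```

Complex-conjugate poles (0 ≤ ζ < 1) produce oscillation, while real poles (ζ ≥ 1) produce a response without oscillation, which is why a critically damped system alone cannot express vibrato.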

In Non-Patent Document 3, expressive synthesized singing was achieved by using a fundamental frequency sequence obtained by convolving the impulse response of Equation (1) with a step-like signal representing the musical scale. In these prior techniques, however, the control parameters (the damping ratio ζ and the natural frequency Ω) were determined manually or by rule. In contrast, Non-Patent Document 4 proposed a probabilistic framework in which both the step-like input signal and the control parameters of the linear second-order system are unknown and are estimated simultaneously from the observed fundamental frequency sequence alone. It is a learning algorithm that iteratively estimates the maximum-likelihood model parameters using a hidden Markov model (HMM) that represents the target pitch sequence and a parametric representation of Equation (1) based on a difference approximation.

Non-Patent Document 1: N. Minematsu, B. Matsuoka, and K. Hirose, "Prosodic Modeling of Nagauta Singing and Its Evaluation," in Proc. SpeechProsody 2004, pp. 487-490, Mar. 2004.
Non-Patent Document 2: T. Saitou, M. Unoki, and M. Akagi, "Development of an F0 Control Model Based on F0 Dynamic Characteristics for Singing-Voice Synthesis," Speech Communication, vol. 46, pp. 405-417, 2005.
Non-Patent Document 3: T. Saitou, M. Goto, M. Unoki, and M. Akagi, "Speech-To-Singing Synthesis: Converting Speaking Voices to Singing Voices by Controlling Acoustic Features Unique to Singing Voices," in Proc. WASPAA 2007, pp. 215-218, Oct. 2007.
Non-Patent Document 4: Y. Ohishi, H. Kameoka, K. Kashino, and K. Takeda, "Parameter Estimation Method of F0 Control Model for Singing Voices," in Proc. INTERSPEECH 2008, pp. 139-142, Sep. 2008.

However, the prior art of Non-Patent Document 4 described above estimates the model parameters poorly: the error between the observed fundamental frequency sequence and the fundamental frequency sequence resynthesized from the estimated parameters becomes large. The reason is that a fundamental frequency sequence contains a mixture of different dynamic characteristics, such as pitch onsets and vibrato, so estimating the parameters of the second-order system directly by maximum likelihood causes overfitting, in which the estimate is pulled toward one particular dynamic characteristic. A method that divides the fundamental frequency sequence into frames and estimates the model parameters frame by frame has also been proposed, but because the extent over which the second-order system producing each dynamic characteristic acts on the sequence is unclear, the model parameters still cannot be estimated appropriately.

The present invention has been made in view of these circumstances. Its object is to provide a signal analysis apparatus, a signal analysis method, and a signal analysis program that, when extracting dynamic characteristic features of a time-series signal, can estimate the model parameters accurately from the observed fundamental frequency sequence alone, with both the step-like input signal and the control parameters of the linear second-order system unknown.

In the present invention, an observation signal [o] is expressed as the sum of a residual signal [ε] and an output signal [y] of a signal generation system, the output signal being obtained by convolving an input signal [f] with an impulse response signal [h] representing a filter characteristic. The invention is a signal analysis apparatus that extracts dynamic characteristic features of a time-series signal by estimating, from the observation signal, an input signal parameter [u] constituting a model of the input signal, a filter characteristic parameter [{a, b}, {a_0, a_1, a_2}, or {w_1, w_2, ..., w_I}] constituting a model of the filter characteristic, and a residual signal parameter [β] constituting a model of the residual signal. The apparatus comprises: a parameter initial value generation unit that generates initial values of the input signal parameter, the filter characteristic parameter, and the residual signal parameter from the observation signal; a signal separation unit that, taking the set of the filter characteristic parameter, the input signal parameter, and the residual signal parameter as a model parameter [Θ], uses the model parameter to separate the observation signal into the output signal of the signal generation system, which is constituted by the input signal parameter and the filter characteristic parameter, and the residual signal, which is constituted by the residual signal parameter; a model parameter updating unit that takes as an objective function the Q function [Equation 27], obtained by adding the prior probability of the model parameter to the conditional expectation of the log-likelihood function given the observation signal, the model parameter, and the pair of the output signal and the residual signal, and updates the model parameter so as to maximize the objective function with respect to the model parameter; a parameter convergence determination unit that determines whether the model parameter satisfies a predetermined criterion and, when it does not, causes the processing of the signal separation unit and the model parameter updating unit to be repeated until the criterion is satisfied; and a parameter output unit that outputs the model parameter when the parameter convergence determination unit determines that the model parameter satisfies the predetermined criterion.
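The interplay of the five units can be sketched as an iterative loop. The following is a toy illustration under stated assumptions, not the patent's implementation: all function names are hypothetical, the "predetermined criterion" is assumed to be a small change in the parameter, the variances are assumed known, and no prior term is included. The toy model is o_n = y_n + ε_n with y_n Gaussian around an unknown mean u.

```python
VAR_Y = 1.0  # variance of the output signal around u (toy assumption, known)
VAR_E = 1.0  # variance of the residual signal (toy assumption, known)


def initialize(observation):
    """Parameter initial value generation (arbitrary starting point)."""
    return {"u": 0.0}


def separate(observation, theta):
    """Signal separation: expected output signal given the observation."""
    gain = VAR_Y / (VAR_Y + VAR_E)
    return [theta["u"] + gain * (o - theta["u"]) for o in observation]


def update(observation, expected_output, theta):
    """Model parameter update: maximizer of the expected log-likelihood."""
    return {"u": sum(expected_output) / len(expected_output)}


def has_converged(old, new, tol=1e-10):
    """Convergence determination (criterion assumed for this sketch)."""
    return abs(new["u"] - old["u"]) < tol


def run_estimation(observation, max_iter=1000):
    """Repeat separation and update until the criterion is met, then output."""
    theta = initialize(observation)
    for _ in range(max_iter):
        expected_output = separate(observation, theta)
        new_theta = update(observation, expected_output, theta)
        done = has_converged(theta, new_theta)
        theta = new_theta
        if done:
            break
    return theta
```

In this toy model each iteration halves the distance between u and the sample mean of the observations, so the loop settles on the sample mean.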

In the present invention, the input signal is a step signal [Equation 17], the output signal is modeled probabilistically as following a multivariate Gaussian distribution [Equation 21], and the residual signal is modeled probabilistically as Gaussian white noise [Equation 22].

In the present invention, the filter characteristic of the signal generation system is represented by a filter derived by a finite-difference method [Equation 5], and the filter characteristic parameters are a parameter inversely proportional to the square of the natural frequency [a = 1/Ω^2, Equation 5] and a parameter proportional to the damping ratio and inversely proportional to the natural frequency [b = 2ζ/Ω, Equation 5].
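The relations a = 1/Ω^2 and b = 2ζ/Ω can be inverted to recover the damping ratio and natural frequency from estimated filter parameters. A minimal sketch follows; the function names are hypothetical and chosen only for illustration.

```python
import math


def to_filter_params(zeta, omega):
    """Map (damping ratio, natural frequency) to the filter parameters (a, b)."""
    a = 1.0 / omega ** 2
    b = 2.0 * zeta / omega
    return a, b


def to_control_params(a, b):
    """Invert the mapping: omega = 1 / sqrt(a), zeta = b * omega / 2."""
    omega = 1.0 / math.sqrt(a)
    zeta = 0.5 * b * omega
    return zeta, omega
```

For example, ζ = 0.6 and Ω = 20 give a = 0.0025 and b = 0.06, and applying `to_control_params` to those values returns the original pair up to rounding.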

In the present invention, the filter characteristic of the signal generation system is represented by a filter constructed on the basis of an autoregressive process [Equation 10], and the filter characteristic parameters are autoregressive parameters [Equation 9].

In the present invention, the filter characteristic of the signal generation system is represented by a filter constructed as a weighted linear sum of a plurality of second-order filters [Equation 15], and the filter characteristic parameters are the weights of the respective second-order filters [{w_1, w_2, ..., w_I}].

In the present invention, the model parameter updating unit comprises: a filter characteristic parameter updating unit that updates the values of the filter characteristic parameters [{a, b}] by solving, for the filter characteristic parameters, the simultaneous equations [Equations 36 and 37] obtained by differentiating an auxiliary function of the objective function, constructed with auxiliary variables, with respect to the parameter [a] inversely proportional to the square of the natural frequency and with respect to the parameter [b] proportional to the damping ratio and inversely proportional to the natural frequency; an input signal parameter updating unit that updates the input signal parameter by solving the equation [Equation 38] obtained by differentiating the auxiliary function with respect to the input signal parameter; and a residual signal parameter updating unit that updates the residual signal parameter by solving the equation [Equation 38] obtained by differentiating the auxiliary function with respect to the residual signal parameter.

In the present invention, the model parameter updating unit comprises: a filter characteristic parameter updating unit that updates the values of the filter characteristic parameters by solving, for the filter characteristic parameters, the simultaneous equations consisting of the equations [Equations 40, 41, and 42] obtained by differentiating the objective function with respect to each of the autoregressive parameters [{a_0, a_1, a_2}] included in the filter characteristic parameters; an input signal parameter updating unit that updates the input signal parameter by solving the equation [Equation 43] obtained by differentiating the objective function with respect to the input signal parameter; and a residual signal parameter updating unit that updates the residual signal parameter by solving the equation [Equation 43] obtained by differentiating the objective function with respect to the residual signal parameter.

In the present invention, the model parameter updating unit comprises: a filter characteristic parameter updating unit that updates the values of the filter characteristic parameters by solving, for the filter characteristic parameters, the nonlinear simultaneous equations [Equation 49 or Equation 57] obtained by differentiating an auxiliary function of the objective function, constructed with auxiliary variables, with respect to each of the weights of the second-order filters [{w_1, w_2, ..., w_I}] that are the filter characteristic parameters; an input signal parameter updating unit that updates the input signal parameter by solving the equation [Equation 50 or Equation 58] obtained by differentiating the auxiliary function with respect to the input signal parameter; and a residual signal parameter updating unit that updates the residual signal parameter by solving the equation [Equation 50 or Equation 58] obtained by differentiating the auxiliary function with respect to the residual signal parameter.

In the present invention, the signal separation unit separates the observation signal into the output signal and the residual signal using the expected value [Equation (28)] of the complete data, which consists of the output signal and the residual signal, given the observation signal and the model parameter, together with the autocorrelation [Equation (29)] of the complete data.

The present invention is also a signal analysis method in a signal analysis apparatus that expresses an observation signal as the sum of a residual signal and an output signal of a signal generation system, the output signal being obtained by convolving an input signal with an impulse response signal representing a filter characteristic, and that extracts dynamic characteristic features of a time-series signal by estimating, from the observation signal, an input signal parameter constituting a model of the input signal, a filter characteristic parameter constituting a model of the filter characteristic, and a residual signal parameter constituting a model of the residual signal. The method comprises: a parameter initial value generation step of generating initial values of the input signal parameter, the filter characteristic parameter, and the residual signal parameter from the observation signal; a signal separation step of, taking the set of the filter characteristic parameter, the input signal parameter, and the residual signal parameter as a model parameter, using the model parameter to separate the observation signal into the output signal of the signal generation system, which is constituted by the input signal parameter and the filter characteristic parameter, and the residual signal, which is constituted by the residual signal parameter; a model parameter updating step of taking as an objective function the function obtained by adding the prior probability of the model parameter to the conditional expectation of the log-likelihood function given the observation signal, the model parameter, and the pair of the output signal and the residual signal, and updating the model parameter so as to maximize the objective function with respect to the model parameter; a parameter convergence determination step of determining whether the model parameter satisfies a predetermined criterion and, when it does not, causing the processing of the signal separation step and the model parameter updating step to be repeated until the criterion is satisfied; and a parameter output step of outputting the model parameter when the parameter convergence determination step determines that the model parameter satisfies the predetermined criterion.

The present invention is also a signal analysis program that causes a computer in a signal analysis apparatus to perform signal analysis, the apparatus expressing an observation signal as the sum of a residual signal and an output signal of a signal generation system, the output signal being obtained by convolving an input signal with an impulse response signal representing a filter characteristic, and extracting dynamic characteristic features of a time-series signal by estimating, from the observation signal, an input signal parameter constituting a model of the input signal, a filter characteristic parameter constituting a model of the filter characteristic, and a residual signal parameter constituting a model of the residual signal. The program causes the computer to execute: a parameter initial value generation step of generating initial values of the input signal parameter, the filter characteristic parameter, and the residual signal parameter from the observation signal; a signal separation step of, taking the set of the filter characteristic parameter, the input signal parameter, and the residual signal parameter as a model parameter, using the model parameter to separate the observation signal into the output signal of the signal generation system, which is constituted by the input signal parameter and the filter characteristic parameter, and the residual signal, which is constituted by the residual signal parameter; a model parameter updating step of taking as an objective function the function obtained by adding the prior probability of the model parameter to the conditional expectation of the log-likelihood function given the observation signal, the model parameter, and the pair of the output signal and the residual signal, and updating the model parameter so as to maximize the objective function with respect to the model parameter; a parameter convergence determination step of determining whether the model parameter satisfies a predetermined criterion and, when it does not, causing the processing of the signal separation step and the model parameter updating step to be repeated until the criterion is satisfied; and a parameter output step of outputting the model parameter when the parameter convergence determination step determines that the model parameter satisfies the predetermined criterion.

According to the present invention, the model parameters can be estimated accurately from the observed fundamental frequency sequence alone, even though both the step-like input signal and the control parameters of the linear second-order system are unknown.

An explanatory diagram showing the concept of the linear second-order system model.
An explanatory diagram showing the concept of the signal generation system.
A block diagram showing the configuration of the signal analysis apparatus for the fundamental frequency sequence of a singing voice in the first embodiment of the present invention.
An explanatory diagram showing the signal decomposition model for the fundamental frequency sequence of a singing voice.
A block diagram showing the configuration of the signal analysis apparatus for the mel-frequency cepstral coefficient (MFCC) sequence of a speech signal in the fifth embodiment of the present invention.
A diagram comparing the signal μ generated from the model parameters in the fourth embodiment with the observed fundamental frequency sequence o.
A diagram showing the per-singer average values of ζ and Ω estimated by the fourth embodiment.
A diagram showing the per-singer frequency distribution of u estimated by the fourth embodiment.

A signal analysis apparatus according to an embodiment of the present invention is described below with reference to the drawings. First, the principle by which the signal analysis apparatus of the present invention accurately estimates the model parameters from the observed fundamental frequency sequence alone, with both the step-like input signal and the control parameters of the linear second-order system unknown, is explained.

FIG. 1 is an explanatory diagram showing the concept of the linear second-order system model. As shown in FIG. 1, the observation signal o(t) is divided into a plurality of segments, and the observation signal in each segment is decomposed into three elements: a step signal (input signal) f(t), a filter characteristic (the impulse response of a linear second-order system) h(t), and a residual signal ε(t).

Here, the relationship between the input signal f(t) and the observation signal o(t) is described with reference to FIG. 2, an explanatory diagram showing the concept of the signal generation system. The input signal f(t) is a signal consisting of target values. The linear second-order filter h(t) represents dynamic characteristics of the observation signal that follow a second-order system, such as its temporal rise and overshoot. The output signal y(t) of the linear second-order filter is the output of the system, computed as the convolution of f(t) and h(t):

$$y(t) = h(t) * f(t) = \int_{-\infty}^{\infty} h(t - \tau)\, f(\tau)\, d\tau \qquad (2)$$

The residual signal ε(t) is the residual between the observation signal and the output signal. Because the observation signal is a discrete signal sequence, one of four methods for expressing the second-order transfer function of Equation (1) in discrete time is used to decompose it into these elements.
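The generative relationship described above (step input, second-order filtering, additive residual) can be sketched in discrete time. This is an illustrative sketch only: the function names are hypothetical, the critically damped impulse response Ω²t·exp(−Ωt) is chosen purely as an example, the convolution integral is approximated by a Riemann sum scaled by the sampling period, and the residual is drawn as Gaussian noise with an arbitrary standard deviation.

```python
import math
import random


def step_signal(levels, lengths):
    """Piecewise-constant input f: each level is held for the given number of samples."""
    return [level for level, n in zip(levels, lengths) for _ in range(n)]


def critically_damped_response(n, omega, delta):
    """Samples of h(t) = omega^2 * t * exp(-omega * t) at t = 0, delta, 2*delta, ..."""
    return [omega ** 2 * (k * delta) * math.exp(-omega * k * delta) for k in range(n)]


def convolve_causal(f, h, delta):
    """Riemann-sum approximation of y(t) = integral of h(t - tau) f(tau) d tau."""
    return [delta * sum(h[k] * f[t - k] for k in range(t + 1)) for t in range(len(f))]


def observe(y, sigma, seed=0):
    """Observation o = y + residual, with a Gaussian residual (assumed for illustration)."""
    rng = random.Random(seed)
    return [value + rng.gauss(0.0, sigma) for value in y]
```

With a sampling period of 0.01 and Ω = 20, the output settles close to each target level shortly after a step; the small remaining offset comes from approximating the integral by a sum.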

Four methods for expressing the transfer function of the linear second-order system in discrete time are described here. First, the discrete-time representation based on the finite-difference method (referred to as Method 1) is described. Applying the inverse Laplace transform to Equation (1) gives

$$\frac{1}{\Omega^2}\frac{d^2 y(t)}{dt^2} + \frac{2\zeta}{\Omega}\frac{dy(t)}{dt} + y(t) = f(t) \qquad (3)$$

a second-order differential equation relating f(t) and y(t). Here f(t) and y(t) are discretized with sampling period Δ and represented as vectors f and y of length N (N is the length of the signal); using the finite-difference method, the derivative at each time step can then be approximated as in Equation (4).

Using this approximation, Equation (3) can be written as

$$(aA + bB + C)\, y = f, \qquad a = \frac{1}{\Omega^2}, \quad b = \frac{2\zeta}{\Omega} \qquad (5)$$

Here (aA + bB + C) corresponds to the impulse response of the inverse filter of the second-order system, and A, B, and C are N × N matrices whose entries are determined by the difference approximation.
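To make the matrix form concrete, the sketch below builds A, B, and C under one possible choice and solves (aA + bB + C) y = f for y. It is an assumption-laden illustration, not the patent's definition of the matrices: backward differences with zero initial conditions are assumed, C is taken to be the identity, and all function names are hypothetical.

```python
def backward_difference_matrices(n, delta):
    """A approximates d^2/dt^2, B approximates d/dt, C is the identity (all assumed)."""
    A = [[0.0] * n for _ in range(n)]
    B = [[0.0] * n for _ in range(n)]
    C = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        A[i][i] = 1.0 / delta ** 2
        B[i][i] = 1.0 / delta
        if i >= 1:
            A[i][i - 1] = -2.0 / delta ** 2
            B[i][i - 1] = -1.0 / delta
        if i >= 2:
            A[i][i - 2] = 1.0 / delta ** 2
    return A, B, C


def solve_lower_triangular(M, f):
    """Forward substitution for M y = f with M lower triangular."""
    y = []
    for i, row in enumerate(M):
        acc = f[i] - sum(row[j] * y[j] for j in range(i))
        y.append(acc / row[i])
    return y


def synthesize_output(f, zeta, omega, delta):
    """Solve (a*A + b*B + C) y = f with a = 1/omega^2 and b = 2*zeta/omega."""
    n = len(f)
    a, b = 1.0 / omega ** 2, 2.0 * zeta / omega
    A, B, C = backward_difference_matrices(n, delta)
    M = [[a * A[i][j] + b * B[i][j] + C[i][j] for j in range(n)] for i in range(n)]
    return solve_lower_triangular(M, f)
```

With backward differences the combined matrix is lower triangular and banded, so the solve reduces to a three-term recursion. For a unit step input with ζ = 0.6, the result rises, overshoots the target, and settles back onto it, which is the behavior attributed to damped oscillation.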

Of course, besides Equation (4), the difference method may use other approximations (central difference, forward difference, backward difference, sinc-function-based, and so on); doing so amounts to changing the structure of the matrices A, B, and C. In Method 1, a and b in Equation (5) are the filter characteristic parameters.

Next, a discrete-time representation based on an autoregressive model (hereinafter, Method 2) is described. Consider the mapping of Equation (1) from the s-domain to the z-domain. With the approximation

the relationship between the s-domain and the z-domain becomes

and the transfer function of the inverse filter of Equation (1) can be written in the z-domain as Equation (8).

This has the same form as a second-order autoregressive model. With the coefficients defined as in Equation (9),

Equation (8) can be written as

$$ (a_0 U_0 + a_1 U_1 + a_2 U_2)\, y = f \tag{10} $$

where U₀ = I_N (I_N is the N × N identity matrix), and U₁ and U₂ are N × N matrices consisting of

In Method 2, a₀, a₁, and a₂ in Equation (10) are the filter characteristic parameters.
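The equivalence between the matrix form of Method 2 and an ordinary second-order recursion can be sketched as follows in Python with NumPy. This assumes that U₁ and U₂ are the one-sample and two-sample delay matrices, which is a natural reading of the text but is not confirmed by it. The helper names `psi_method2` and `ar_filter` are hypothetical.

```python
import numpy as np

def psi_method2(a0, a1, a2, n):
    """Psi = a0*U0 + a1*U1 + a2*U2 with assumed delay matrices U1, U2."""
    u0 = np.eye(n)
    u1 = np.eye(n, k=-1)    # assumed: one-sample delay
    u2 = np.eye(n, k=-2)    # assumed: two-sample delay
    return a0 * u0 + a1 * u1 + a2 * u2

def ar_filter(f, a0, a1, a2):
    """Run the recursion a0*y[n] + a1*y[n-1] + a2*y[n-2] = f[n]."""
    y = np.zeros(len(f))
    for n in range(len(f)):
        prev1 = y[n - 1] if n >= 1 else 0.0
        prev2 = y[n - 2] if n >= 2 else 0.0
        y[n] = (f[n] - a1 * prev1 - a2 * prev2) / a0
    return y

f = np.ones(50)                                  # unit step input
y = ar_filter(f, 121.0, -220.0, 100.0)           # coefficients quoted in the second embodiment
psi = psi_method2(121.0, -220.0, 100.0, len(f))
residual = psi @ y - f                           # should be numerically zero
```

Under these assumptions the recursion output satisfies Ψy = f, which is the form shared by all four methods.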

Next, discrete-time representations based on a linear sum of multiple vibration bases (hereinafter, Method 3 and Method 4) are described. The impulse response obtained by the inverse Laplace transform of the transfer function of Equation (1) falls into the following cases depending on the value of ζ.

When these impulse responses are discretized with sampling period Δ, the input/output relationship of the system can be written in the form y = Φf. For example, when ζ = 1, Φ can be written as the lower triangular matrix

However, because h(t) consists of several cases as shown in Equation (12), the matrix Φ is constructed as follows.

Here, ζ and Ω are chosen manually in advance, and the impulse responses {Φ^(1), Φ^(2), ..., Φ^(I)} representing I vibration phenomena are computed. Φ⁻¹ is then approximated by a linearly weighted sum of their inverses Υ^(i) := (Φ^(i))⁻¹, which represent the impulse responses of the inverse filters and are hereinafter called vibration bases. In Method 3, w := {w₁, w₂, ..., w_I} in Equation (14) are the filter characteristic parameters, and they are estimated in the framework of a regression problem.

In Method 4, by contrast, the filter characteristic parameters w := {w₁, w₂, ..., w_I} are assumed to be sparse, meaning that Φ⁻¹ is represented by only a few vibration bases. As described later, this is readily achieved by assuming a prior distribution on w.

The impulse response Φ of the system could instead be expressed as a linearly weighted sum of Φ^(i). The impulse response Φ⁻¹ of the inverse filter is expressed as a linearly weighted sum of Υ^(i) in order to avoid complexity in deriving the parameter learning algorithm described later. The input/output relationship of the system is therefore expressed as

$$ \Bigl( \sum_{i=1}^{I} w_i \Upsilon^{(i)} \Bigr)\, y = f \tag{15} $$

With the four methods described above (Equations (5), (10), and (15)), the transfer function of Equation (1) is in every case converted into the form Ψy = f, where Ψ is given respectively by

$$ \Psi = \begin{cases} aA + bB + C & \text{(Method 1)} \\ a_0 U_0 + a_1 U_1 + a_2 U_2 & \text{(Method 2)} \\ \sum_{i=1}^{I} w_i \Upsilon^{(i)} & \text{(Methods 3 and 4)} \end{cases} \tag{16} $$

Next, statistical modeling of the linear second-order system based on a Gaussian process is described. The input/output relationship of the system expressed by Equations (5), (10), and (15) is modeled statistically on the basis of a Gaussian process (reference: C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. MIT Press, Cambridge, Mass, USA, 2006).

First, modeling of the input step signal is described. The input signal f is assumed to be a step signal. To this end, a vector whose elements all take the same value,

$$ \mathbf{u} = u\, \mathbf{1}_N $$

is prepared, where the scalar u is a parameter representing relative pitch and 1_N is a vector of N ones. The input signal f is modeled probabilistically as a random variable drawn from a multivariate Gaussian distribution whose mean is this vector:

$$ f \sim \mathcal{N}(\mathbf{u},\, \alpha I_N) $$

Here, α is a hyperparameter representing the variance, and its value is set manually in advance. The only input signal parameter referred to in the claims is therefore u. Because the output y is a linear combination (y = Ψ⁻¹f) of the Gaussian-distributed variables f, y itself also follows a Gaussian distribution. Its mean and covariance are

$$ \mathrm{E}[y] = \Psi^{-1}\mathbf{u}, \qquad \mathrm{cov}[y] = \alpha\, \Psi^{-1} (\Psi^{-1})^{\mathsf T} $$

so the probability distribution of y can be written as

$$ y \sim \mathcal{N}\bigl( \Psi^{-1}\mathbf{u},\; \alpha\, \Psi^{-1} (\Psi^{-1})^{\mathsf T} \bigr) \tag{21} $$

It is worth noting that this model is an instance of a Gaussian process. The key property of a Gaussian process is that the joint distribution of the N elements of y is completely described by statistics up to second order, namely the mean and the covariance. In ordinary Gaussian process models, the mean is usually taken to be zero, and the covariance is commonly constructed as a linear sum of kernel matrices (Multiple Kernel Learning; reference: F. R. Bach, G. R. G. Lanckriet, and M. I. Jordan, "Multiple kernel learning, conic duality, and the SMO algorithm," in Proc. ICML 2004, pp. 6-13, July 2004). This allows correlations between observations (here, the elements of y) to be taken into account rather than assuming that they are independent and identically distributed. Signal modeling with Gaussian processes using Multiple Kernel Learning has attracted attention in the field of machine learning in recent years.

In the model described above, by contrast, the mean and covariance terms of Equation (21) contain the matrix Ψ⁻¹, and Ψ is expressed as a linear sum of several bases as in Equation (16). The model is therefore referred to here as a signal model based on a special Gaussian process that differs from Multiple Kernel Learning.

Next, the likelihood function and the prior probability are described. A residual signal that follows Gaussian white noise,

$$ \varepsilon \sim \mathcal{N}(\mathbf{0},\, \beta I_N) $$

is introduced, and the observation signal

$$ o = [o_1, o_2, \ldots, o_N]^{\mathsf T} $$

is assumed to be the output signal y of the system with the residual signal ε added:

$$ o = y + \varepsilon $$

Here, β is a hyperparameter representing the variance of the residual signal and corresponds to the residual signal parameter in the claims. Assuming that y and ε are mutually independent, it follows from the definition of a Gaussian process that the log-likelihood function of the model parameters Θ = {Ψ, u, β} given the observation signal o is

$$ \log P(o \mid \Theta) = -\frac{N}{2}\log 2\pi - \frac{1}{2}\log\lvert\Sigma\rvert - \frac{1}{2}(o - \mu)^{\mathsf T} \Sigma^{-1} (o - \mu) $$

where μ := Ψ⁻¹u and Σ := αΨ⁻¹(Ψ⁻¹)ᵀ + βI_N.
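The likelihood evaluation can be sketched as follows in Python with NumPy. The sketch assumes that the log-likelihood is the standard multivariate Gaussian log-density with the μ and Σ defined in the text. The function name `log_likelihood` and the toy inputs are hypothetical.

```python
import numpy as np

def log_likelihood(o, psi, u, alpha, beta):
    """Gaussian log-density of o with mean Psi^{-1} u and
    covariance alpha * Psi^{-1} Psi^{-T} + beta * I."""
    n = len(o)
    psi_inv = np.linalg.inv(psi)
    mu = psi_inv @ (u * np.ones(n))                   # mean of the output signal
    sigma = alpha * psi_inv @ psi_inv.T + beta * np.eye(n)
    resid = o - mu
    _, logdet = np.linalg.slogdet(sigma)              # log |Sigma|, numerically stable
    quad = resid @ np.linalg.solve(sigma, resid)      # (o - mu)^T Sigma^{-1} (o - mu)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

# Toy check with Psi = I: Sigma reduces to (alpha + beta) * I.
value = log_likelihood(np.ones(3), np.eye(3), u=1.0, alpha=2.0, beta=100.0)
```

Using `slogdet` and `solve` rather than an explicit determinant and inverse of Σ is a numerical convenience and does not change the quantity being computed.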

The prior probability P(Θ) of Θ is assumed to factorize as P(Θ) = P(Ψ)P(u)P(β), with u and β each following a uniform distribution. Assuming independence of the filter characteristic parameters of Methods 1 to 4, P(Ψ) is defined as

where P(a), P(b), P(a₀), P(a₁), and P(a₂) are each assumed to be uniform. In Method 3, P(w₁), P(w₂), ..., P(w_I) are likewise assumed to be uniform. In Method 4, on the other hand, to impose a sparsity constraint on the parameters w, they are assumed to follow the generalized normal distribution

Here, p and λ are constants that determine the shape of the generalized normal distribution. When 0 < p < 2, P(w) is super-Gaussian and serves as a measure of sparseness.

Next, the parameter learning algorithm based on the EM method is described. Given a fundamental frequency sequence o, the goal is to find the estimate of the parameters Θ that maximizes the posterior probability P(Θ|o) ∝ P(o|Θ)P(Θ). However, it is difficult to obtain the maximum a posteriori (MAP) estimate of Θ analytically. The reasons include that the observation signal o is the sum of the output signal y and the residual signal ε, and that the likelihood function is nonlinear in the filter characteristic parameters that make up Ψ. The following two measures are applied to address these respective problems.

The first is to allocate o to y and ε in the E-step of the EM method. The second is to apply the auxiliary function method (reference: H. Kameoka, N. Ono, K. Kashino, and S. Sagayama, "Complex NMF: A New Sparse Representation for Acoustic Signals," in Proc. ICASSP 2009, pp. 3437-3440, April 2009) in the M-step to design an auxiliary function for the Q function.

Next, the definition of the complete data is described. The first step in applying the EM method to this MAP estimation problem is to define the complete data. Here, y and ε are regarded as the complete data x, and the EM method is applied. The relationship between the incomplete data and the complete data is expressed as

Computing the conditional expectation of the complete-data log-likelihood function under the current parameters Θ′ and adding log P(Θ) yields the following Q function.

The symbol consisting of = with a c above it indicates that the left-hand and right-hand sides are equal up to constant terms, and

Here, tr(·) denotes the trace of a matrix, and from the properties of the conditional Gaussian distribution,

Equation (28) is the expected value of the complete data x (consisting of the output signal and the residual signal) given the incomplete data o (the observation signal) and the current parameters. Equation (29) is the autocorrelation of the complete data x given the incomplete data o and the current parameters.

In the E-step of the EM method, the most recently updated model parameters are substituted for Θ′, and E[x|o; Θ′] and E[xxᵀ|o; Θ′] are computed. For later calculations, E[x|o; Θ′] and E[xxᵀ|o; Θ′] are partitioned into blocks corresponding to y and ε as

where x̄_y and x̄_ε are both vectors of length N (the bar is placed over x, here and in the following description), and R_y and R_ε are both N × N square matrices.
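The E-step quantities can be sketched with standard Gaussian conditioning, as below in Python with NumPy. This is a sketch under stated assumptions, not the patent's Equations (28) to (30): it assumes o = y + ε with independent y and ε, and it takes R_y and R_ε to be the diagonal blocks of E[xxᵀ|o; Θ′]. The function name `e_step` is hypothetical.

```python
import numpy as np

def e_step(o, psi, u, alpha, beta):
    """Posterior mean and second moments of y and eps given o = y + eps."""
    n = len(o)
    psi_inv = np.linalg.inv(psi)
    mean_y = psi_inv @ (u * np.ones(n))        # prior mean of y
    cov_y = alpha * psi_inv @ psi_inv.T        # prior covariance of y
    cov_eps = beta * np.eye(n)                 # prior covariance of eps
    sigma_o_inv = np.linalg.inv(cov_y + cov_eps)

    innovation = o - mean_y
    x_bar_y = mean_y + cov_y @ sigma_o_inv @ innovation
    x_bar_eps = cov_eps @ sigma_o_inv @ innovation

    # Given o, eps = o - y, so y and eps share the same posterior covariance.
    post_cov = cov_y - cov_y @ sigma_o_inv @ cov_y
    r_y = post_cov + np.outer(x_bar_y, x_bar_y)
    r_eps = post_cov + np.outer(x_bar_eps, x_bar_eps)
    return x_bar_y, x_bar_eps, r_y, r_eps
```

A useful sanity check on any implementation of this step is that the two posterior means add up to the observation, x̄_y + x̄_ε = o, which is what "allocating o to y and ε" means.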

Next, the M-step update equations that maximize the Q function with respect to the model parameters of each of Methods 1 to 4 are described, beginning with Method 1. Since Ψ = aA + bB + C, extracting the relevant terms from Equation (27) gives the objective function to be maximized, which is redefined as

where A_{n,n} denotes the element in the n-th row and n-th column of the matrix A. The update equations that solve this maximization problem with respect to the parameter set Θ are derived by the auxiliary function method mentioned above. An auxiliary function for the objective function of Equation (31) is derived using the following inequality.

Equality holds when

Introducing the auxiliary variables γ₁ = {γ_{a,1}, γ_{b,1}, γ_{a,2}, γ_{b,2}, ..., γ_{a,N}, γ_{b,N}} and defining the auxiliary function as

it follows that f₁(a, b, u, β) ≥ f₁⁺(a, b, u, β, γ₁), and when Equation (33) holds,

so f₁⁺(a, b, u, β, γ₁) satisfies the definition of an auxiliary function.

Setting the derivatives of f₁⁺(a, b, u, β, γ₁) with respect to a and b to zero gives

As can be seen from Equation (5), a and b are both constrained to be positive. Under this constraint, the coordinate descent method is applied to solve the simultaneous equations (36) and (37) for a and b.

Similarly, setting the derivatives of f₁⁺(a, b, u, β, γ₁) with respect to u and β to zero gives

These equations are used to update u and β.

The parameter learning algorithm based on the EM method for Method 1 is summarized as follows.
Initialization: Assign initial values to the parameters Θ = {a, b, u, β}.
E-step: Update E[x|o; Θ′], E[xxᵀ|o; Θ′], and γ₁ = {γ_{a,1}, γ_{b,1}, γ_{a,2}, γ_{b,2}, ..., γ_{a,N}, γ_{b,N}}.
M-step: Update the model parameters Θ = {a, b, u, β} using Equations (36), (37), and (38).
Convergence check: If the value of the auxiliary function of Equation (34) has not converged, set Θ′ = Θ and return to the E-step.

Next, the M-step update equations for Method 2 are described. Since Ψ = a₀U₀ + a₁U₁ + a₂U₂, extracting the relevant terms from Equation (27) gives the objective function to be maximized, which is redefined as

Setting the derivatives of f₂(a₀, a₁, a₂, u, β) with respect to a₀, a₁, and a₂ to zero gives

As shown in Equation (9), a₀ and a₂ are constrained to be positive and a₁ to be negative. Under these constraints, the coordinate descent method is applied to solve the simultaneous equations (40), (41), and (42) for a₀, a₁, and a₂.

Similarly, setting the derivatives of f₂(a₀, a₁, a₂, u, β) with respect to u and β to zero gives

These equations are used to update u and β.

The parameter learning algorithm based on the EM method for Method 2 is summarized as follows.
Initialization: Assign initial values to the parameters Θ = {a₀, a₁, a₂, u, β}.
E-step: Update E[x|o; Θ′] and E[xxᵀ|o; Θ′].
M-step: Update the model parameters Θ = {a₀, a₁, a₂, u, β} using Equations (40), (41), (42), and (43).
Convergence check: If the value of the objective function of Equation (39) has not converged, set Θ′ = Θ and return to the E-step.

Next, the M-step update equations for Method 3 are described. Since Ψ = w₁Υ^(1) + w₂Υ^(2) + ... + w_IΥ^(I), extracting the relevant terms from Equation (27) gives the objective function to be maximized, which is redefined as

where Υ_{n,n}^(i) denotes the diagonal element in the n-th row and n-th column of Υ^(i). The update equations that solve this maximization problem are derived by the auxiliary function method. An auxiliary function for the objective function of Equation (44) is derived using the following inequality.

Here, the auxiliary variables γ₃ = {γ_{1,1}, ..., γ_{I,N}} are defined, and the auxiliary function is defined as

Then

holds, with equality when

so Equation (46) satisfies the definition of an auxiliary function.

Setting the derivative of Equation (46) with respect to w_{i′} to zero gives

Solving the nonlinear simultaneous equations for i′ = 1, 2, ..., I with the coordinate descent method yields w. Similarly, setting the derivatives of f₃⁺(w, u, β, γ₃) with respect to u and β to zero gives

These equations are used to update u and β.

The parameter learning algorithm based on the EM method for Method 3 is summarized as follows.
Initialization: Assign initial values to the parameters Θ = {w, u, β}.
E-step: Update E[x|o; Θ′], E[xxᵀ|o; Θ′], and γ₃.
M-step: Update the model parameters Θ = {w, u, β} using Equations (49) and (50).
Convergence check: If the value of the auxiliary function of Equation (46) has not converged, set Θ′ = Θ and return to the E-step.

Next, the M-step update equations for Method 4 are described. Since Ψ = w₁Υ^(1) + w₂Υ^(2) + ... + w_IΥ^(I), extracting the relevant terms from Equation (27) gives the objective function to be maximized, which is redefined as

where Υ_{n,n}^(i) denotes the diagonal element in the n-th row and n-th column of Υ^(i). The update equations that solve this maximization problem are derived by the auxiliary function method. An auxiliary function for the objective function of Equation (51) is derived using the following two inequalities.

Here, the auxiliary variables w̄ := {w̄₁, w̄₂, ..., w̄_I} and γ₄ = {γ_{1,1}, ..., γ_{I,N}} are defined, and the auxiliary function is defined as

Then

holds, with equality when

so Equation (54) satisfies the definition of an auxiliary function.
Setting the derivative of Equation (54) with respect to w_{i′} to zero gives

Solving the nonlinear simultaneous equations for i′ = 1, 2, ..., I with the coordinate descent method yields w.

Similarly, setting the derivatives of f₄⁺(w, u, β, γ₄) with respect to u and β to zero gives

These equations are used to update u and β.

The parameter learning algorithm based on the EM method for Method 4 is summarized as follows.
Initialization: Assign initial values to the parameters Θ = {w, u, β}.
E-step: Update E[x|o; Θ′], E[xxᵀ|o; Θ′], w̄, and γ₄.
M-step: Update the model parameters Θ = {w, u, β} using Equations (57) and (58).
Convergence check: If the value of the auxiliary function of Equation (54) has not converged, set Θ′ = Θ and return to the E-step.

<First Embodiment>
Next, the configuration of the signal analysis apparatus according to the first embodiment of the present invention is described with reference to FIG. 3, a block diagram showing the configuration of this embodiment. As shown in FIG. 3, the signal analysis apparatus is implemented by a computer and includes a fundamental frequency extraction unit 1, a segment division unit 2, a parameter initial value generation unit 3, a signal separation unit 4, a filter characteristic parameter update unit 5, an input signal parameter update unit 6, a residual signal parameter update unit 7, a parameter convergence determination unit 8, and a parameter output unit 9. The filter characteristic parameter update unit 5, the input signal parameter update unit 6, and the residual signal parameter update unit 7 together constitute a model parameter update unit 10. The first embodiment performs signal analysis using Method 1 described above.

The fundamental frequency extraction unit 1 extracts an observed fundamental frequency time series from the input singing-voice acoustic signal. This processing can be implemented with well-known techniques. For example, the fundamental frequency is estimated from the singing-voice acoustic signal every 5 ms using the fundamental frequency estimator YIN proposed in A. de Cheveigné and H. Kawahara, "YIN, a fundamental frequency estimator for speech and music," Journal of the Acoustical Society of America, vol. 111, no. 4, pp. 1917-1930, 2002.

The segment division unit 2 divides the estimated fundamental frequency sequence into several segments. As shown in FIG. 4, each segment starts and ends at the instants at which the pitch rises from one note to another. The division into segments is performed manually or by using the k-means method, the Viterbi algorithm, or the like. Model parameters are estimated for each segment o = [o₁, o₂, ..., o_N]ᵀ, where N is the length of the fundamental frequency sequence in the segment and varies from segment to segment. As preprocessing, each segment is normalized by subtracting its first fundamental frequency value o₁ from all of its fundamental frequency values.

The parameter initial value generation unit 3 determines the initial values of the model parameters Θ₁ = {a, b, u, β}. The initial values of a and b are a = 100 and b = 20, which are the values computed from Equation (5) for ζ = 1.0 and Ω = 0.1. The initial value of u is the median of the elements of the observation signal o, and the initial value of β is β = 100. All of these are determined experimentally. α is fixed at α = 2.
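The preprocessing and initialization for the first embodiment can be sketched in plain Python as follows. The expressions a = 1/Ω² and b = 2ζ/Ω are an assumption inferred from the quoted values (ζ = 1.0, Ω = 0.1 giving a = 100, b = 20); the text itself only says that they follow from Equation (5). The function names and the sample F0 values are hypothetical.

```python
import statistics

def normalize_segment(f0_segment):
    """Subtract the first fundamental frequency value from every sample."""
    first = f0_segment[0]
    return [value - first for value in f0_segment]

def initial_parameters_method1(o, zeta=1.0, omega=0.1):
    """Initial values for Theta_1 = {a, b, u, beta}, plus the fixed alpha."""
    return {
        "a": 1.0 / omega ** 2,        # assumed form; about 100 for omega = 0.1
        "b": 2.0 * zeta / omega,      # assumed form; 20 for zeta = 1, omega = 0.1
        "u": statistics.median(o),    # median of the (normalized) observation
        "beta": 100.0,                # initial residual variance
        "alpha": 2.0,                 # fixed hyperparameter, never updated
    }

segment = [220.0, 230.0, 245.0, 247.0, 246.0]   # made-up F0 values for illustration
o = normalize_segment(segment)
theta = initial_parameters_method1(o)
```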

The signal separation unit 4 separates the observation signal into the output signal and the residual signal, using as the separation criterion the expected values of the output signal and the residual signal derived from the definition of the Gaussian process and the EM algorithm. Specifically, using the current model parameters Θ₁′ = {a, b, u, β}, it computes Equations (28) and (29) and, based on Equation (30), partitions E[x|o; Θ₁′] and E[xxᵀ|o; Θ₁′] into x̄_y, x̄_ε, R_y, and R_ε. It also computes the auxiliary variables γ₁ based on Equation (33).

The filter characteristic parameter update unit 5 updates the values of the filter characteristic parameters a and b. As can be seen from Equation (5), a and b are both constrained to be positive. Under this constraint, the coordinate descent method is applied to solve the simultaneous equations (36) and (37) for a and b. First, a = 0 and b = 0 are set as initial values. Then, regarding Equation (36) as an equation in a,

is computed.

Next, regarding Equation (37) as an equation in b,

is computed. The updates of Equations (59) and (60) are repeated alternately until the values of a and b no longer change.
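The alternating scheme can be illustrated with a generic coordinate descent loop in plain Python. This is only a sketch of the control flow: the callables `update_a` and `update_b` stand in for the closed-form updates of Equations (59) and (60), which are not reproduced here, and the toy updates below come from an unrelated concave quadratic chosen purely for illustration. Clipping at zero is one simple way to keep the values non-negative and is an assumption, not the patent's stated rule.

```python
def coordinate_descent(update_a, update_b, a=0.0, b=0.0,
                       tol=1e-10, max_sweeps=1000):
    """Alternate single-variable updates until a and b stop changing."""
    for _ in range(max_sweeps):
        new_a = max(update_a(b), 0.0)        # solve for a with b held fixed
        new_b = max(update_b(new_a), 0.0)    # solve for b with the new a held fixed
        if abs(new_a - a) <= tol and abs(new_b - b) <= tol:
            return new_a, new_b
        a, b = new_a, new_b
    return a, b

# Toy example: maximize -(a - 3)^2 - (b - 2)^2 - a*b.
# Stationarity in a gives a = 3 - b/2; stationarity in b gives b = 2 - a/2.
a_hat, b_hat = coordinate_descent(lambda b: 3.0 - b / 2.0,
                                  lambda a: 2.0 - a / 2.0)
```

For this toy problem the loop converges to a = 8/3 and b = 2/3, the solution of the two stationarity conditions taken together.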

The input signal parameter update unit 6 computes Ψ of Equation (16) using a and b updated by the filter characteristic parameter update unit 5, and updates the value of the input signal parameter u based on Equation (38).

The residual signal parameter update unit 7 updates the value of the residual signal parameter β based on Equation (38).

The parameter convergence determination unit 8 computes the value of the objective function of Equation (31) using x̄_y, R_y, and R_ε computed by the signal separation unit 4 and the model parameters Θ₁ = {a, b, u, β} updated by the model parameter update unit 10. If the difference between the value of the objective function of Equation (31) computed with the model parameters before the update and the value computed with the model parameters after the update is at or below a predetermined threshold, the parameters are judged to have converged. If they have converged, the parameter output unit 9 outputs the model parameters Θ₁ = {a, b, u, β}, and processing returns to the segment division unit 2 to estimate the model parameters for the next segment. If they have not converged, Θ₁′ = Θ₁ is set and processing returns to the signal separation unit 4.

Besides the method based on the objective function, convergence may also be judged by comparing the value of each model parameter before and after the update, or by checking whether a predetermined number of iterations has been reached.
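The three convergence criteria mentioned above can be sketched in plain Python as follows. Combining them with a logical OR is a design choice made for this illustration; the text presents them as alternatives. The function name, the tolerance, and the iteration limit are hypothetical.

```python
def has_converged(prev_objective, objective, prev_params, params,
                  iteration, tol=1e-6, max_iterations=100):
    """Return True if any of the three stopping criteria is met."""
    # Criterion 1: the objective function changed by no more than the threshold.
    if abs(objective - prev_objective) <= tol:
        return True
    # Criterion 2: every model parameter changed by no more than the threshold.
    if all(abs(params[name] - prev_params[name]) <= tol for name in params):
        return True
    # Criterion 3: the predetermined number of iterations has been reached.
    return iteration >= max_iterations
```

In an EM loop this check would be evaluated after each M-step, with the previous parameters playing the role of Θ′.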

<Second Embodiment>
Next, the configuration of the signal analysis apparatus according to the second embodiment of the present invention is described. The second embodiment performs signal analysis using Method 2 described above. The configuration of the signal analysis apparatus in the second embodiment is the same as that shown in FIG. 3, and the processing of the fundamental frequency extraction unit 1 and the segment division unit 2 is the same as in the first embodiment. The second embodiment differs in the processing of the other units.

The parameter initial value generation unit 3 determines the initial values of the model parameters Θ₂ = {a₀, a₁, a₂, u, β}. The initial values of a₀, a₁, and a₂ are a₀ = 121, a₁ = −220, and a₂ = 100, which are the values computed from Equation (9) for ζ = 1.0 and Ω = 0.1. The initial value of u is the median of the elements of the observation signal o, and the initial value of β is β = 100. All of these are determined experimentally. α is fixed at α = 2.
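The quoted initial values can be reproduced with a short calculation, sketched below in plain Python. It assumes that the inverse filter has the continuous-time form a·s² + b·s + 1 with a = 1/Ω² and b = 2ζ/Ω, and that the s-to-z mapping is the backward difference s ≈ (1 − z⁻¹)/Δ with Δ = 1. These assumptions are inferred from the numbers in the text rather than taken from Equations (7) to (9), and the function name `ar_coefficients` is hypothetical.

```python
def ar_coefficients(zeta, omega, delta=1.0):
    """Coefficients of a0 + a1*z^-1 + a2*z^-2 obtained by substituting
    s = (1 - z^-1) / delta into a*s^2 + b*s + 1."""
    a = 1.0 / omega ** 2          # assumed continuous-time coefficient of s^2
    b = 2.0 * zeta / omega        # assumed continuous-time coefficient of s
    a0 = a / delta ** 2 + b / delta + 1.0
    a1 = -(2.0 * a / delta ** 2 + b / delta)
    a2 = a / delta ** 2
    return a0, a1, a2

a0, a1, a2 = ar_coefficients(zeta=1.0, omega=0.1)
# Up to floating-point rounding: a0 = 121, a1 = -220, a2 = 100
```

Under these assumptions the signs match the constraints stated for Method 2: a₀ and a₂ are positive and a₁ is negative.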

The signal separation unit 4 separates the observation signal into the output signal and the residual signal, using as the separation criterion the expected values of the output signal and the residual signal derived from the definition of the Gaussian process and the EM algorithm. Using the current model parameters Θ₂′ = {a₀, a₁, a₂, u, β}, it computes Equations (28) and (29) and, based on Equation (30), partitions E[x|o; Θ₂′] and E[xxᵀ|o; Θ₂′] into x̄_y, x̄_ε, R_y, and R_ε.

The filter characteristic parameter update unit 5 updates the values of the filter characteristic parameters a₀, a₁, and a₂. As can be seen from Equation (9), a₀ and a₂ are positive and a₁ is negative. Under these constraints, the coordinate descent method is applied to solve the simultaneous equations consisting of Equations (40), (41), and (42) for a₀, a₁, and a₂. First, the initial values are set to a₀ = 0, a₁ = 0, and a₂ = 0. Equation (40) is then regarded as an equation in a₀ and solved for a₀, which gives the update of Equation (61). Next, Equation (41) is regarded as an equation in a₁ and solved for a₁, which gives the update of Equation (62). Next, Equation (42) is regarded as an equation in a₂ and solved for a₂, which gives the update of Equation (63). The updates of Equations (61), (62), and (63) are repeated in this order until the values of a₀, a₁, and a₂ no longer change.
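The cyclic update scheme described above can be sketched on a stand-in problem. The code below does not implement Equations (61) to (63), which are not reproduced in this text. Instead it assumes, purely for illustration, that the three stationarity conditions form a linear system A a = b with a symmetric positive definite matrix, solves each row for its own coordinate in turn, and clamps the result to the required sign (zero is allowed here as the boundary case). The matrix, right-hand side, and function name are hypothetical.

```python
def coordinate_descent(A, b, signs, tol=1e-10, max_sweeps=10000):
    """Cyclic coordinate updates for A a = b with sign constraints.

    signs[i] > 0 keeps a[i] >= 0, otherwise a[i] <= 0.
    """
    n = len(b)
    a = [0.0] * n  # all coordinates start at zero, as in the text
    for _ in range(max_sweeps):
        largest_change = 0.0
        for i in range(n):
            rest = sum(A[i][j] * a[j] for j in range(n) if j != i)
            new = (b[i] - rest) / A[i][i]  # solve row i for a[i]
            new = max(new, 0.0) if signs[i] > 0 else min(new, 0.0)
            largest_change = max(largest_change, abs(new - a[i]))
            a[i] = new
        if largest_change < tol:  # values no longer change
            break
    return a


# Hypothetical system whose solution is (1, -2, 3).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [2.0, -2.0, 4.0]
print(coordinate_descent(A, b, signs=[+1, -1, +1]))
```

Updating one coordinate at a time keeps each step a one-dimensional problem, which is why the sign constraints can be enforced by a simple clamp.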

The input signal parameter update unit 6 computes Ψ of Equation (16) using a₀, a₁, and a₂ as updated by the filter characteristic parameter update unit 5, and updates the value of the input signal parameter u based on Equation (43).

The residual signal parameter update unit 7 updates the value of the residual signal parameter β based on Equation (43).

The parameter convergence determination unit 8 computes the value of the objective function of Equation (39) using x̄_y, R_y, and R_ε computed by the signal separation unit 4 and the model parameters Θ₂ = {a₀, a₁, a₂, u, β} updated by the model parameter update unit 10. If the value has converged, the parameter output unit 9 outputs the model parameters Θ₂ = {a₀, a₁, a₂, u, β}, and processing returns to the segment division unit 2 in order to estimate the model parameters of the next segment. If the value has not converged, Θ₂′ is set to Θ₂ and processing returns to the signal separation unit 4.
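The control flow of one segment's estimation (separate, update, test the objective for convergence, repeat) can be sketched as a generic loop. The three callables are placeholders for the processing of units 4, 10, and 8. Their signatures, the tolerance, and the iteration cap are assumptions, and the toy functions at the end exist only so that the loop runs; they are unrelated to the patent's equations.

```python
def estimate_segment(theta, separate, update, objective, tol=1e-6, max_iter=200):
    """Alternate separation and parameter updates until the objective settles."""
    previous = None
    for _ in range(max_iter):
        stats = separate(theta)          # role of the signal separation unit 4
        theta = update(stats, theta)     # role of the model parameter update unit 10
        value = objective(stats, theta)  # role of the convergence determination unit 8
        if previous is not None and abs(value - previous) < tol:
            break                        # converged: theta is output
        previous = value                 # otherwise iterate with the new theta
    return theta


# Toy placeholders: the update pulls theta halfway toward 4.0 on every pass.
theta = estimate_segment(
    0.0,
    separate=lambda t: t,
    update=lambda s, t: 0.5 * (s + 4.0),
    objective=lambda s, t: -(t - 4.0) ** 2,
)
print(theta)
```

Testing the change in the objective value, not the parameters themselves, matches the description of unit 8, which evaluates the objective function after each pass.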

<Third Embodiment>
Next, the configuration of the signal analysis apparatus according to the third embodiment of the present invention will be described. The third embodiment performs signal analysis using Method 3 described above. The configuration of the apparatus is the same as that shown in FIG. 3, and the fundamental frequency extraction unit 1 and the segment division unit 2 operate as in the first embodiment. The processing of the other units differs.

The parameter initial value generation unit 3 determines the initial values of the model parameters Θ₃ = {w, u, β}. First, to create {Υ^(1), Υ^(2), ..., Υ^(I)}, ζ is varied from 0 to 2 in steps of 0.02 and Ω is varied from 0.05 to 0.3 in steps of 0.005. As a result, I = 3100. The initial values of w = {w₁, w₂, ..., w_I} are all set to 1/I. The initial value of u is the median of the elements of the observed signal o. The initial value of β is 100. These values are all determined experimentally. In addition, α is fixed at α = 2.

The signal separation unit 4 separates the observed signal into the output signal and the residual signal, using as the separation criterion the expected values of the output signal and the residual signal, which are derived from the definition of the Gaussian process and the EM algorithm. Using the current model parameters Θ₃′ = {w, u, β}, it evaluates Equations (28) and (29) and, based on Equation (30), partitions E[x | o; Θ₃′] and E[xxᵀ | o; Θ₃′] into x̄_y, x̄_ε, R_y, and R_ε. It also computes the auxiliary variable γ₃ based on Equation (48).

The filter characteristic parameter update unit 5 updates the value of the filter characteristic parameter w. The value of w is obtained by using the coordinate descent method to solve the nonlinear simultaneous equations given by Equation (49) for i′ = 1, 2, ..., I. First, {w₁, w₂, ..., w_I} are all set to 0 as initial values. Equation (49) is then regarded as an equation in w_{i′} and rearranged into the update of Equation (64). The update of Equation (64) is repeated in order for i′ = 1, 2, ..., I until the values of w = {w₁, w₂, ..., w_I} no longer change.

The input signal parameter update unit 6 computes Ψ of Equation (16) using w as updated by the filter characteristic parameter update unit 5, and updates the value of the input signal parameter u based on Equation (50).

The residual signal parameter update unit 7 updates the value of the residual signal parameter β based on Equation (50).

The parameter convergence determination unit 8 computes the value of the objective function of Equation (44) using x̄_y, R_y, and R_ε computed by the signal separation unit 4 and the model parameters Θ₃ = {w, u, β} updated by the model parameter update unit 10. If the value has converged, the parameter output unit 9 outputs the model parameters Θ₃ = {w, u, β}, and processing returns to the segment division unit 2 in order to estimate the model parameters of the next segment. If the value has not converged, Θ₃′ is set to Θ₃ and processing returns to the signal separation unit 4.

<Fourth Embodiment>
Next, the configuration of the signal analysis apparatus according to the fourth embodiment of the present invention will be described. The fourth embodiment performs signal analysis using Method 4 described above. The configuration of the apparatus is the same as that shown in FIG. 3, and the fundamental frequency extraction unit 1 and the segment division unit 2 operate as in the first embodiment. The processing of the other units differs.

The parameter initial value generation unit 3 operates as in the third embodiment, except that the model parameters are Θ₄ = {w, u, β}.

The signal separation unit 4 separates the observed signal into the output signal and the residual signal, using as the separation criterion the expected values of the output signal and the residual signal, which are derived from the definition of the Gaussian process and the EM algorithm. Using the current model parameters Θ₄′ = {w, u, β}, it evaluates Equations (28) and (29) and, based on Equation (30), partitions E[x | o; Θ₄′] and E[xxᵀ | o; Θ₄′] into x̄_y, x̄_ε, R_y, and R_ε. It also computes the auxiliary variable γ₄ based on Equation (56).

The filter characteristic parameter update unit 5 updates the value of the filter characteristic parameter w. The value of w is obtained by using the coordinate descent method to solve the nonlinear simultaneous equations given by Equation (57) for i′ = 1, 2, ..., I. First, {w₁, w₂, ..., w_I} are all set to 0 as initial values. Equation (57) is then regarded as an equation in w_{i′} and rearranged into the update of Equation (65). The update of Equation (65) is repeated in order for i′ = 1, 2, ..., I until the values of w = {w₁, w₂, ..., w_I} no longer change.

The input signal parameter update unit 6 computes Ψ of Equation (16) using w as updated by the filter characteristic parameter update unit 5, and updates the value of the input signal parameter u based on Equation (58).

The residual signal parameter update unit 7 updates the value of the residual signal parameter β based on Equation (58).

The parameter convergence determination unit 8 computes the value of the objective function of Equation (51) using x̄_y, R_y, and R_ε computed by the signal separation unit 4 and the model parameters Θ₄ = {w, u, β} updated by the model parameter update unit 10. If the value has converged, the parameter output unit 9 outputs the model parameters Θ₄ = {w, u, β}, and processing returns to the segment division unit 2 in order to estimate the model parameters of the next segment. If the value has not converged, Θ₄′ is set to Θ₄ and processing returns to the signal separation unit 4.

<Fifth Embodiment>
Next, the configuration of the signal analysis apparatus according to the fifth embodiment of the present invention will be described with reference to FIG. 5. The fifth embodiment uses Methods 1 to 4 described above to analyze ordinary voice signals (including speech and singing). The apparatus differs from those of the first to fourth embodiments in that it includes a mel-cepstrum coefficient extraction unit 11 in place of the fundamental frequency extraction unit 1 shown in FIG. 3. The apparatus regards the time series of mel-frequency cepstral coefficients (MFCCs) extracted from a voice signal, like the fundamental frequency sequence of a singing voice, as a step-like input signal convolved with a filter. From the observed MFCC signal it separately extracts the features of the input signal, which represents the phoneme sequence, and of the filter characteristic.

The mel-cepstrum coefficient extraction unit 11 performs frequency analysis on the voice signal and extracts a time series of mel-cepstrum coefficients (usually vectors of about 12 dimensions). The signal analysis apparatus according to the fifth embodiment analyzes the time-series signal of each element of the MFCC vector separately.

The segment division unit 2, parameter initial value generation unit 3, signal separation unit 4, filter characteristic parameter update unit 5, input signal parameter update unit 6, residual signal parameter update unit 7, parameter convergence determination unit 8, and parameter output unit 9 operate according to any one of the first to fourth embodiments described above.

<Experimental Results>
Next, to demonstrate the effects and operation of the present invention, the results of experiments using the signal analysis apparatus according to the embodiments are described. The first, second, and fourth embodiments and the conventional method of Non-Patent Document 4 were implemented, and their performance in decomposing a fundamental frequency sequence into an input step signal (the sequence of pitch target values) and an impulse response signal representing the filter characteristic (the dynamic variation component of singing) was evaluated.

The first evaluation experiment checks whether the present invention resolves the local-minimum problem. First, 100 values of the parameter u are chosen at random, and 100 input step signals are created artificially based on Equation (18), with N = 300 and α = 2. Similarly, 100 values each of ζ and Ω are chosen at random, and 100 impulse response signals are created artificially based on Equation (12). From these artificial signals, 100 observed signals are created artificially based on Equation (22), with β = 100. In this experiment, the model parameters are estimated for each of these observed signals.
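A rough sketch of how one such artificial observation could be generated is shown below. It is not the procedure of Equations (12), (18), and (22), which are not reproduced in this text. It assumes a constant (step) input of height u, a second-order filter in the backward-difference form that reproduces the initial values quoted for the second embodiment, zero initial conditions, and additive Gaussian noise whose standard deviation `noise_std` is a hypothetical parameter (how β maps to a noise level is not stated here).

```python
import random


def simulate_observation(u, zeta, omega, n_samples=300, noise_std=0.0, seed=0):
    """Step response of an assumed second-order difference filter plus noise."""
    inv_sq = 1.0 / omega ** 2
    damping = 2.0 * zeta / omega
    a0 = 1.0 + damping + inv_sq
    a1 = -(damping + 2.0 * inv_sq)
    a2 = inv_sq
    rng = random.Random(seed)
    y_prev1 = y_prev2 = 0.0  # assumed zero initial state
    observed = []
    for _ in range(n_samples):
        # a0*y[n] + a1*y[n-1] + a2*y[n-2] = u, solved for y[n]
        y = (u - a1 * y_prev1 - a2 * y_prev2) / a0
        observed.append(y + rng.gauss(0.0, noise_std))
        y_prev2, y_prev1 = y_prev1, y
    return observed


signal = simulate_observation(u=6000.0, zeta=1.0, omega=0.1)
print(len(signal), signal[-1])
```

With no noise the output settles at the step height u, because the assumed coefficients sum to one.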

As evaluation measures, the following are computed for each observed signal:
(1) the squared error between the estimated model parameter u and the randomly chosen u used to create the observed signal;
(2) the root mean square error (RMSE) between the impulse response signal of the system defined by the estimated model parameters and the impulse response signal based on the randomly chosen ζ and Ω used to create the observed signal.
The smaller both errors are, the better the observed signal has been decomposed into an input step signal and an impulse response signal (that is, the local-minimum problem has been avoided). Table 1 shows the mean of each error. The signal analysis apparatus according to the fourth embodiment gave the smallest errors.
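The two evaluation measures are simple to state in code. The following is a minimal sketch using plain Python lists. The function names are illustrative, and the input values are made up for the example.

```python
import math


def squared_error(estimate, truth):
    """Measure (1): squared error between an estimated and a true scalar u."""
    return (estimate - truth) ** 2


def rmse(estimated, reference):
    """Measure (2): root mean square error between two equal-length sequences."""
    if len(estimated) != len(reference):
        raise ValueError("sequences must have the same length")
    total = sum((e - r) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(total / len(estimated))


print(squared_error(3.0, 1.0))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```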

The second evaluation experiment evaluates the convergence performance of the parameter learning algorithm using actual fundamental frequency sequences extracted from acoustic signals of singing voices. The singing voice database consists of recordings of classical vocalists, pop singers, and amateurs without musical training (one man and one woman in each category, six singers in total). The singers sang without accompaniment. The songs were "Kirakira Hoshi" and "Song of Joy" (the vocal part of the fourth movement of Beethoven's Symphony No. 9, with lyrics by Toichiro Iwasa). Singing with Japanese lyrics (two lyric patterns) and humming were recorded. The fundamental frequency is estimated every 5 ms using YIN, described above. The frequency o_Hz, expressed in Hz, is converted to the log-scale frequency o_cent, expressed in cents, as follows.

With this conversion, a semitone corresponds to 100 cents. Next, as shown in FIG. 4, the estimated fundamental frequency sequences are divided into segments manually. The total number of segments was 1323. The model parameters are estimated for each segment. As the evaluation measure, the root mean square error (RMSE) between the observed signal o and the signal μ resynthesized from the estimated model parameters (see Equation (23)) was computed for each segment. The mean RMSE is shown in the right part of Table 1. In this experiment too, the signal analysis apparatus according to the fourth embodiment gave the smallest error.
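The conversion equation itself is not reproduced in this text. Any conversion of the form 1200·log2(f / f_ref) maps one semitone to 100 cents, so the sketch below uses that form. The reference frequency used as the default is an assumption (a convention seen in some pitch-analysis work), not a value taken from the patent, and the function name is illustrative.

```python
import math

# Assumed reference frequency in Hz; not taken from the source document.
REFERENCE_HZ = 440.0 * 2.0 ** (3.0 / 12.0 - 5.0)


def hz_to_cent(f_hz, reference_hz=REFERENCE_HZ):
    """Convert a frequency in Hz to a log-scale value in cents."""
    return 1200.0 * math.log2(f_hz / reference_hz)


a4 = 440.0
a_sharp4 = 440.0 * 2.0 ** (1.0 / 12.0)  # one semitone above A4
print(hz_to_cent(a_sharp4) - hz_to_cent(a4))
```

The difference between two pitches in cents does not depend on the choice of reference frequency, so the semitone spacing holds for any positive `reference_hz`.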

FIG. 6 shows an observed signal and the signal μ resynthesized from the model parameters of the fourth embodiment. The observed signal shows that, in each segment, dynamics related to the pitch onset and dynamics such as vibrato, which oscillates once the pitch has stabilized, are superimposed in a complex way. Because the present invention uses the EM method to separate these components, its parameter estimation performance is better than that of the conventional method. In particular, the signal analysis apparatus of the fourth embodiment expresses the filter characteristic of the second-order system as a linear sum of various oscillation bases and imposes a sparsity constraint on the weight parameters. Compared with the other embodiments, it therefore avoids overfitting to the observed signal and gives the smallest error.

FIG. 7 shows, for each singer, the mean values of ζ and Ω estimated by the signal analysis apparatus according to the fourth embodiment. For each segment, the largest element w_i of the estimated w was selected, and the corresponding ζ and Ω were averaged over all segments for each singer. A small value of ζ means that the oscillation is a damped oscillation such as an overshoot. A small value of Ω means that the pitch rise time is long. Accordingly, because the amateur singers have limited singing technique, their values of ζ and Ω were both smaller than those of the other singers. FIG. 8 shows, for each singer, the frequency distribution of u estimated by the signal analysis apparatus according to the fourth embodiment. For the vocalists and pop singers, the distribution has peaks at integer multiples of a semitone (100 cents). For the amateur singers, the peaks are indistinct. This again indicates that the amateur singers, having limited singing technique, find it difficult to sing on the correct scale (in exact semitone steps).

In this way, the observed signal is taken to be the sum of a residual signal and the output signal of a system obtained by convolving an input step signal with an impulse response signal representing the filter characteristic. For a fundamental frequency sequence, the points at which a note begins (the pitch changes) are taken as start and end points, and the signal generation process is defined for each segment delimited by them. Because various dynamics are mixed in the observed signal, the signal generation process is modeled from the viewpoint of a Gaussian process (a Bayesian approach), and a process that separates the observed signal into an output signal and a residual signal is applied. The input signal parameter and the filter characteristic parameter are then estimated from the separated output signal, and the residual signal parameter is estimated from the separated residual signal. By repeating this estimation procedure many times, the model parameters can finally be estimated accurately.

As described above, the input signal is assumed to be a step signal and the filter characteristic of the system is assumed to follow a linear second-order system. Under these assumptions, the input signal of the signal generation system, the filter characteristic (impulse response signal), and the residual signal are estimated from the observed time-series signal. From the fundamental frequency (F0) sequence of a singing voice, it is therefore possible to extract the sequence of pitch target values that the singer intends to sing and the features of dynamic variation components of singing such as vibrato and overshoot.

A program for realizing the functions of the processing units in FIG. 3 and FIG. 5 may be recorded on a computer-readable recording medium, and the signal analysis processing may be performed by causing a computer system to read and execute the program recorded on the medium. The term "computer system" here includes an OS and hardware such as peripheral devices. The "computer system" also includes a WWW system provided with a homepage providing environment (or display environment). The term "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, and to a storage device such as a hard disk built into a computer system. The "computer-readable recording medium" further includes a medium that holds the program for a certain period of time, such as the volatile memory (RAM) inside a computer system that serves as a server or client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.

The program may also be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium, or by transmission waves in a transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having the function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line. The program may realize only part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

The present invention can be applied to uses in which it is essential to perform signal analysis by extracting the dynamic characteristic features of time-series signals.

DESCRIPTION OF SYMBOLS 1 ... fundamental frequency extraction unit, 2 ... segment division unit, 3 ... parameter initial value generation unit, 4 ... signal separation unit, 5 ... filter characteristic parameter update unit, 6 ... input signal parameter update unit, 7 ... residual signal parameter update unit, 8 ... parameter convergence determination unit, 9 ... parameter output unit, 10 ... model parameter update unit, 11 ... mel-cepstrum coefficient extraction unit

Claims

A signal analysis apparatus that represents an observed signal as the sum of a residual signal and an output signal of a signal generation system obtained by convolving an input signal with an impulse response signal representing a filter characteristic, and that extracts dynamic characteristic features of a time-series signal by estimating, from the observed signal, an input signal parameter constituting a model representing the input signal, a filter characteristic parameter constituting a model representing the filter characteristic, and a residual signal parameter constituting a model representing the residual signal, the signal analysis apparatus comprising:
a parameter initial value generation unit that generates initial values of the input signal parameter, the filter characteristic parameter, and the residual signal parameter from the observed signal;
a signal separation unit that, taking the set of the filter characteristic parameter, the input signal parameter, and the residual signal parameter as model parameters, uses the model parameters to separate the observed signal into the output signal of the signal generation system, which is defined by the input signal parameter and the filter characteristic parameter, and the residual signal, which is defined by the residual signal parameter;
a model parameter update unit that takes as an objective function a function obtained by adding the prior probability of the model parameters to the conditional expected value of the log-likelihood function given the observed signal, the model parameters, and the set of the output signal and the residual signal, and that updates the model parameters so as to maximize the objective function with respect to the model parameters;
a parameter convergence determination unit that determines whether the model parameters satisfy a predetermined criterion and, when it determines that the criterion is not satisfied, causes the processing by the signal separation unit and the model parameter update unit to be performed again until the criterion is satisfied; and
a parameter output unit that outputs the model parameters when the parameter convergence determination unit determines that the model parameters satisfy the predetermined criterion.

The signal analysis apparatus according to claim 1, wherein
the input signal is a step signal,
the output signal is probabilistically modeled as following a multidimensional Gaussian distribution, and
the residual signal is probabilistically modeled as Gaussian white noise.

The signal analysis apparatus according to claim 2, wherein
the filter characteristic of the signal generation system is represented by a filter derived by a difference method, and
the filter characteristic parameters are a parameter that is inversely proportional to the square of the natural frequency and a parameter that is proportional to the attenuation factor and inversely proportional to the natural frequency.

The signal analysis apparatus according to claim 2, wherein the filter characteristic of the signal generation system is represented by a filter configured based on an autoregressive process, and the filter characteristic parameter is an autoregressive parameter.

The signal analysis apparatus according to claim 2, wherein the filter characteristic of the signal generation system is represented by a filter constituted by a weighted linear sum of a plurality of second-order system filters, and the filter characteristic parameter is the weight of each second-order system filter.

The signal analysis apparatus according to claim 3, wherein the model parameter update unit comprises:
a filter characteristic parameter update unit that updates the value of the filter characteristic parameter by solving simultaneous equations consisting of the equations obtained by differentiating an auxiliary function of the objective function, which is constructed using auxiliary variables, with respect to the parameter inversely proportional to the square of the natural frequency and the parameter proportional to the attenuation factor and inversely proportional to the natural frequency;
an input signal parameter update unit that updates the input signal parameter by solving an equation obtained by differentiating the auxiliary function with respect to the input signal parameter; and
a residual signal parameter update unit that updates the residual signal parameter by solving an equation obtained by differentiating the auxiliary function with respect to the residual signal parameter.

The signal analysis apparatus according to claim 4, wherein the model parameter update unit comprises:
a filter characteristic parameter update unit that updates the value of the filter characteristic parameter by solving, for the filter characteristic parameter, simultaneous equations consisting of the equations obtained by differentiating the objective function with respect to each autoregressive parameter included in the filter characteristic parameter;
an input signal parameter update unit that updates the input signal parameter by solving an equation obtained by differentiating the objective function with respect to the input signal parameter; and
a residual signal parameter update unit that updates the residual signal parameter by solving an equation obtained by differentiating the objective function with respect to the residual signal parameter.

The signal analysis apparatus according to claim 5, wherein the model parameter update unit comprises:
a filter characteristic parameter update unit that updates the value of the filter characteristic parameter by solving, for the filter characteristic parameter, nonlinear simultaneous equations consisting of the equations obtained by differentiating an auxiliary function of the objective function, which is constructed using auxiliary variables, with respect to the weight of each second-order system filter, the weights being the filter characteristic parameter;
an input signal parameter update unit that updates the input signal parameter by solving an equation obtained by differentiating the auxiliary function with respect to the input signal parameter; and
a residual signal parameter update unit that updates the residual signal parameter by solving an equation obtained by differentiating the auxiliary function with respect to the residual signal parameter.

9. The signal analysis apparatus according to claim 2, wherein the signal separation unit separates the observation signal into the output signal and the residual signal by using the expected value of complete data, composed of the output signal and the residual signal, given the observation signal and the model parameter, and by using the autocorrelation of the complete data.
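
The role of the expected value and the autocorrelation of the complete data can be illustrated with a deliberately simplified case. The sketch below assumes that a single sample of the output signal and of the residual signal are independent zero-mean Gaussian variables with variances `var_s` and `var_e`. These names, the functions `separate_sample` and `second_moments`, and the scalar model are assumptions made for illustration; the claimed apparatus works with the filter model of the signal generation system rather than with independent samples.

```python
from typing import Tuple


def separate_sample(y: float, var_s: float, var_e: float) -> Tuple[float, float, float]:
    """Conditional mean of (s, e) given y = s + e under the assumed model.

    Returns (E[s|y], E[e|y], posterior variance shared by s and e).
    """
    gain = var_s / (var_s + var_e)          # Wiener-type gain
    s_mean = gain * y                       # expected output signal
    e_mean = y - s_mean                     # expected residual signal
    post_var = var_s * var_e / (var_s + var_e)
    return s_mean, e_mean, post_var


def second_moments(s_mean: float, e_mean: float, post_var: float) -> Tuple[float, float, float]:
    """Lag-zero correlations of the complete data given y.

    Returns (E[s^2|y], E[e^2|y], E[s*e|y]). Since e = y - s once y is known,
    the conditional covariance of s and e is -post_var.
    """
    return (s_mean ** 2 + post_var,
            e_mean ** 2 + post_var,
            s_mean * e_mean - post_var)


# Usage: the two expected components always add up to the observation.
s_hat, e_hat, v = separate_sample(3.0, 2.0, 1.0)
ss, ee, se = second_moments(s_hat, e_hat, v)
```

The second moments, not only the means, are what a subsequent parameter update needs, which is why the claim refers to the autocorrelation of the complete data in addition to its expected value.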

A signal analysis method in a signal analysis apparatus that extracts dynamic characteristic features of a time-series signal by representing an observation signal as the sum of a residual signal and an output signal of a signal generation system, the output signal being obtained by convolution of an input signal and an impulse response signal representing filter characteristics, and by estimating, from the observation signal, an input signal parameter constituting a model representing the input signal, a filter characteristic parameter constituting a model representing the filter characteristics, and a residual signal parameter constituting a model representing the residual signal, the method comprising:
A parameter initial value generating step of generating initial values of the input signal parameter, the filter characteristic parameter, and the residual signal parameter from the observation signal;
A signal separation step of taking the set of the filter characteristic parameter, the input signal parameter, and the residual signal parameter as a model parameter and, using the model parameter, separating the observation signal into the output signal of the signal generation system, which is defined by the input signal parameter and the filter characteristic parameter, and the residual signal, which is defined by the residual signal parameter;
A model parameter update step of updating the model parameter so as to maximize, with respect to the model parameter, an objective function obtained by adding the prior probability of the model parameter to the conditional expected value of the log likelihood function given the observation signal, the model parameter, and the set of the output signal and the residual signal;
A parameter convergence determination step of determining whether the model parameter satisfies a predetermined criterion and, when the model parameter is determined not to satisfy the predetermined criterion, causing the signal separation step and the model parameter update step to be performed again until the predetermined criterion is satisfied; and
A parameter output step of outputting the model parameter when the parameter convergence determination step determines that the model parameter satisfies the predetermined criterion.
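
The control flow of the steps recited above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function `analyze` and all `toy_*` functions are hypothetical names, and the toy model (an output sample with unknown mean `m` and unit variance, a zero-mean unit-variance residual, and a flat prior) is chosen only so that the loop can be run end to end.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
Signal = List[float]


def analyze(y: Signal,
            init: Callable[[Signal], Params],
            separate: Callable[[Signal, Params], Tuple[Signal, Signal]],
            update: Callable[[Signal, Signal, Params], Params],
            converged: Callable[[Params, Params], bool],
            max_iter: int = 100) -> Params:
    theta = init(y)                          # parameter initial value generating step
    for _ in range(max_iter):
        s, e = separate(y, theta)            # signal separation step
        new_theta = update(s, e, theta)      # model parameter update step
        done = converged(theta, new_theta)   # parameter convergence determination step
        theta = new_theta
        if done:
            break
    return theta                             # parameter output step


# Toy stand-ins (assumptions): s_t ~ N(m, 1), e_t ~ N(0, 1), y_t = s_t + e_t.
def toy_init(y: Signal) -> Params:
    return {"m": 0.0}


def toy_separate(y: Signal, theta: Params) -> Tuple[Signal, Signal]:
    m = theta["m"]
    s = [m + 0.5 * (v - m) for v in y]       # E[s_t | y_t] under the toy model
    e = [v - sv for v, sv in zip(y, s)]      # expected residual
    return s, e


def toy_update(s: Signal, e: Signal, theta: Params) -> Params:
    # Maximizing the expected log likelihood in m gives the mean of E[s_t | y_t].
    return {"m": sum(s) / len(s)}


def toy_converged(old: Params, new: Params) -> bool:
    return abs(new["m"] - old["m"]) < 1e-9


estimate = analyze([1.0, 2.0, 3.0], toy_init, toy_separate, toy_update, toy_converged)
```

In this toy case each pass halves the distance between `m` and the sample mean of the observation, so the loop stops once successive estimates differ by less than the chosen tolerance. The `max_iter` guard is an added safeguard that is not part of the recited steps.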

A signal analysis program for causing a computer of a signal analysis apparatus to perform signal analysis, the signal analysis apparatus extracting dynamic characteristic features of a time-series signal by representing an observation signal as the sum of a residual signal and an output signal of a signal generation system, the output signal being obtained by convolution of an input signal and an impulse response signal representing filter characteristics, and by estimating, from the observation signal, an input signal parameter constituting a model representing the input signal, a filter characteristic parameter constituting a model representing the filter characteristics, and a residual signal parameter constituting a model representing the residual signal, the program causing the computer to execute:
A parameter initial value generating step of generating initial values of the input signal parameter, the filter characteristic parameter, and the residual signal parameter from the observation signal;
A signal separation step of taking the set of the filter characteristic parameter, the input signal parameter, and the residual signal parameter as a model parameter and, using the model parameter, separating the observation signal into the output signal of the signal generation system, which is defined by the input signal parameter and the filter characteristic parameter, and the residual signal, which is defined by the residual signal parameter;
A model parameter update step of updating the model parameter so as to maximize, with respect to the model parameter, an objective function obtained by adding the prior probability of the model parameter to the conditional expected value of the log likelihood function given the observation signal, the model parameter, and the set of the output signal and the residual signal;
A parameter convergence determination step of determining whether the model parameter satisfies a predetermined criterion and, when the model parameter is determined not to satisfy the predetermined criterion, causing the signal separation step and the model parameter update step to be performed again until the predetermined criterion is satisfied; and
A parameter output step of outputting the model parameter when the parameter convergence determination step determines that the model parameter satisfies the predetermined criterion.