WO1998008167A1 - Signal processing method using a finite-dimensional filter - Google Patents

Signal processing method using a finite-dimensional filter

Info

Publication number
WO1998008167A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
finite
signal
state
filter
Prior art date
Application number
PCT/AU1997/000519
Other languages
French (fr)
Inventor
Robert James Elliott
Vikram Krishnamurthy
Original Assignee
University Of Alberta
The University Of Melbourne
Priority date
Filing date
Publication date
Application filed by University Of Alberta, The University Of Melbourne filed Critical University Of Alberta
Priority to AU37627/97A priority Critical patent/AU3762797A/en
Publication of WO1998008167A1 publication Critical patent/WO1998008167A1/en

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03HIMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H21/00Adaptive networks
    • H03H21/0012Digital adaptive filters
    • H03H21/0025Particular filtering methods
    • H03H21/0029Particular filtering methods based on statistics

Landscapes

  • Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Complex Calculations (AREA)

Abstract

A signal processing method for determining parameters of a linear Gaussian system to remove a Gaussian noise component, comprising processing received data by a finite-dimensional filter to obtain expectation data, and processing the expectation data to derive parameter data representative of the parameters. Also, a finite-dimensional filter for processing received data to obtain state data for deriving parameter data of a linear Gaussian system associated with the received data.

Description

SIGNAL PROCESSING METHOD USING A FINITE-DIMENSIONAL FILTER
The present invention relates to a signal processing method and a finite- dimensional filter for use in deriving parameters of a linear system.
A number of signal processing procedures involve attempts to determine a system model which can be used to define and predict a received signal of interest. A received signal or observed data may include additional components, such as Gaussian noise, which need to be taken into account when forming the system model. Signal processing filters are used to process the received data and provide estimates which can be used to derive the parameters of the system. Processing of the received data is executed on the basis of an iterative process to determine the parameters, and a filter may be used to execute part of the process. Once the parameters have been determined or converged to an acceptable value after executing the iterative process, the system model can be used to determine the signal of interest with the additional components removed. This is applicable to a number of signal processing applications, including multi-sensor speech enhancement, high resolution source localisation in multi-sensor signal processing and linear predictive coding of speech. The process can also be applied to dynamic shock error models used in economic forecasting. One iterative process, known as the expectation maximisation (EM) algorithm, is used to obtain maximum likelihood (ML) estimates of system parameters. ML parameter estimation for linear Gaussian models and other related time series models using the EM algorithm was studied in the 1980s, as discussed in R.H. Shumway and D.S. Stoffer, "An Approach to Time Series Smoothing and Forecasting using the EM Algorithm", J. Time Series Analysis, Vol. 3, No. 4, pp. 253-264, 1982 and D. Ghosh, "Maximum Likelihood Estimation of the Dynamic Shock-Error Model", Journal of Econometrics, Vol. 41, No. 1, pp. 121-143, May 1989, and developed more recently for signal processing applications, as discussed in E. Weinstein, A.V. Oppenheim, M. Feder and J.R. Buck, "Iterative and Sequential Algorithms for Multisensor Signal Enhancement", IEEE Signal Proc., Vol. 42, No. 4, pp. 846-859, April 1994 and V. Krishnamurthy, "On-line Estimation of Dynamic Shock-Error Models", IEEE Trans. Auto. Control, Vol. 35, No. 5, pp. 1129-1134, 1994. Each iteration of the process comprises two steps, the expectation (E-step) and the maximisation (M-step). The E-step for linear Gaussian models involves determining the following two conditional expectations based on all of the observed or received data: (i) the sum over time of the state of the signal of interest; (ii) the sum over time of the covariance of the state of the signal of interest.
From these two expectations or estimations, the maximisation step can be used to determine the ML estimates of the parameters of a system model for the signal of interest.
In the past, determination of the expectations in the E-step has involved, for linear Gaussian system models, approximating the sums over the signal distribution by performing fixed interval smoothing of an integral. Implemented in this manner, the E-step is non-causal. To perform the fixed interval smoothing, a Kalman smoother has been used, which requires processing the time series data obtained in one time direction and then also processing the data in the reverse direction. This is referred to as performing both forward and backward passes over the time series data, and during each iteration pass the state data obtained has to be stored for use in the next iteration until the E-step is completed. This significantly affects the memory requirements for executing the E-step, and the necessity to perform both a forward and a backward pass using all of the time series data obtained reduces the speed of the process.
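To make this conventional smoother-based E-step concrete, the sketch below (our own scalar toy example in Matlab, matching the style of Appendix B; all names and numerical values are assumptions, not taken from the patent) runs the forward Kalman pass, stores every forward quantity, and then runs the backward fixed-interval (Rauch-Tung-Striebel) pass to form the two time sums. The stored arrays are exactly the memory cost described above:

% Our toy illustration of the conventional (smoother-based) E-step:
% scalar model x(k) = a*x(k-1) + sw*w(k),  y(k) = c*x(k) + sv*v(k).
a = 0.8; sw = 0.5; c = 1; sv = 0.7; T = 200;
x = zeros(1,T); y = zeros(1,T); x(1) = sw*randn;
for k = 2:T, x(k) = a*x(k-1) + sw*randn; end
y = c*x + sv*randn(1,T);

mf = zeros(1,T); Pf = mf; mp = mf; Pp = mf;   % all stored for the backward pass
for k = 1:T                                    % forward Kalman pass
  if k == 1, mp(k) = 0; Pp(k) = sw^2;
  else       mp(k) = a*mf(k-1); Pp(k) = a^2*Pf(k-1) + sw^2; end
  K = Pp(k)*c / (c^2*Pp(k) + sv^2);            % Kalman gain
  mf(k) = mp(k) + K*(y(k) - c*mp(k));
  Pf(k) = (1 - K*c)*Pp(k);
end

ms = mf; Ps = Pf;                              % backward (RTS smoothing) pass
for k = T-1:-1:1
  G = Pf(k)*a / Pp(k+1);                       % smoother gain
  ms(k) = mf(k) + G*(ms(k+1) - mp(k+1));
  Ps(k) = Pf(k) + G^2*(Ps(k+1) - Pp(k+1));
end

sum_state = sum(ms);            % (i)  sum over time of the (smoothed) state
sum_cov   = sum(ms.^2 + Ps);    % (ii) sum over time of the state second moment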
In accordance with the present invention there is provided a signal processing method for determining parameters of a linear Gaussian system to remove a Gaussian noise component, comprising: processing received data by a finite-dimensional filter to obtain expectation data; processing said expectation data to derive parameter data representative of said parameters.
The present invention also provides a finite-dimensional filter for processing received data to obtain state data for deriving parameter data of a linear Gaussian system associated with said received data.
Preferred embodiments of the present invention are hereinafter described, by way of example only, with reference to the accompanying drawings, wherein:
Figure 1 is a block diagram of a preferred embodiment of a system for executing an expectation and maximisation process;
Figure 2 is a block diagram of a finite dimensional filter for the expectation and maximisation process;
Figure 3 is a block diagram of another preferred embodiment of a system for executing an expectation and maximisation process;
Figure 4 is a block diagram of a two sensor speech enhancement system; Figure 5 is a graph of parameter estimates against number of data passes; Figure 6 is a block diagram of a system for localisation of narrowband sources;
Figure 7 is a block diagram of a chemical process plant; Figure 8 is a graph of output data obtained from the plant against time; and Figure 9 is a graph of parameter estimates for a model of a subsystem of the plant versus iterations of an expectation and maximisation process. A linear Gaussian system model can be applied to time signals and time series data, such as uncoded speech, which includes a random component that can be considered to vary in accordance with a Gaussian distribution. The system model for such a signal $x_{k+1}$, for $k = 0, 1, \dots$, is given by the dynamics:
$$x_{k+1} = A_{k+1} x_k + B_{k+1} w_{k+1} \qquad (1)$$
Here $x_k \in \mathbb{R}^m$ and $x_0 \in \mathbb{R}^m$ is a Gaussian random variable with zero mean and covariance matrix $B_0^2 \in \mathbb{R}^{m \times m}$.
At time $k+1$, $k = 0, 1, \dots$, the noise in equation 1 is modelled by an independent Gaussian random variable with zero mean and covariance matrix $B_{k+1}^2$. It is known that such a Gaussian random variable can be represented as $B_{k+1} w_{k+1}$, where $w_{k+1}$ is an $m$-vector of $N(0, 1)$ random variables.
For $k = 0, 1, \dots$, the state $x_k$ may be observed indirectly via the vector observations $y_k$, where
$$y_k = C_k x_k + D_k v_k \qquad (2)$$
Here for each $k$, $y_k \in \mathbb{R}^d$ and $v_k$ is a vector of independent $N(0, 1)$ random variables. We assume that $D_k$ is a symmetric non-singular $d \times d$ matrix.
The vector observations $y_k$ represent the received data or signal and constitute the information which is available on the state of the signal of interest $x_k$. For a system which includes a Gaussian noise component, the variable $v_k$ and the parameter $D_k$ represent that noise component which needs to be determined so that the state $x_k$ can be determined.
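As a concrete reading of equations 1 and 2, the following minimal Matlab sketch simulates the model with constant parameters; the particular matrices are illustrative assumptions only, not values from the patent:

% Simulating dynamics (1) and (2) with constant illustrative parameters (ours).
m = 2; d = 1; T = 500;
A = [0.9 0.1; 0 0.8];               % A_k, held constant
B = 0.2*eye(m);                     % B_k, symmetric and non-singular
C = [1 0];                          % C_k
D = 0.1;                            % D_k, symmetric non-singular (d x d)
x = zeros(m,T); y = zeros(d,T);
x(:,1) = B*randn(m,1);              % x_0 Gaussian, zero mean, covariance B^2
y(:,1) = C*x(:,1) + D*randn(d,1);
for k = 1:T-1
  x(:,k+1) = A*x(:,k) + B*randn(m,1);     % equation (1)
  y(:,k+1) = C*x(:,k+1) + D*randn(d,1);   % equation (2)
end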
To execute the E-step, instead of using a Kalman smoother, a filter is used to derive the sum of the state data and the sum of the covariance of the state data. If we let $e_i$, $e_j$ denote the unit vectors in $\mathbb{R}^m$ with 1 in the $i$th and $j$th position respectively, and let $e_n$ be the unit vector in $\mathbb{R}^d$ with 1 in the $n$th position, for $i \in \{1, \dots, m\}$ and $n \in \{1, \dots, d\}$, we can define the scalar processes
[Equation (3), defining the scalar processes, was rendered as images in the original: the $H_k$ are time sums of products of pairs of state components, and the $J_k$ are time sums of products of state and observation components, formed with the unit vectors $e_i$, $e_j$ and $e_n$ above.]
where $\langle \cdot, \cdot \rangle$ denotes the inner product.
The $H_k$ represent covariance sums, and $J_k$ represents the sum of the state multiplied by the observed data $y_k$ and is hereinafter referred to as the state sum. The filter derives the conditional expectations
$$E[H_k \mid \mathcal{Y}_k] \qquad \text{and} \qquad E[J_k \mid \mathcal{Y}_k]$$
recursively, and the data produced by the filter is then used in the maximisation step of the EM process to derive or estimate the parameters $A_k$, $B_k$, $C_k$ and $D_k$ of the system, from which the state of the signal of interest $x_k$ can be determined. The expectations are related to densities of the distribution of the state and can be determined from those densities. The accompanying Appendix A describes the relationship between the densities and the required expectations, and explains the derivation of filter equations for the expectations. The filter for determining the expectations is configured to process the observed data and other data as described by the equations.
The densities $\alpha_k$, $\beta^{(0)}_k$, $\beta^{(1)}_k$, $\beta^{(2)}_k$ and $\gamma_k$ are defined by equation 11 in Section 4.1 of Appendix A. They can be determined by the integrals specified in equations 13 to 17, yet the integrals are not closed and normally require a fixed interval smoothing process, such as that executed by the Kalman smoother with all its inherent disadvantages, to provide estimates of the densities. Theorems 5.1, 5.2 and 5.3, however, specify filter equations for the densities, and Theorem 5.4 uses the relationship between the densities and the expectations to derive finite-dimensional filter equations 47 and 48 for the expectations. The coefficients of the filter equations, $a^{ij(M)}_k$, $b^{ij(M)}_k$ and $d^{ij(M)}_k$, can be determined from equations 71 to 79 using the observed data and the data obtained from a Kalman filter, being $\mu_k$, $R_k$ and $R_{k|k-1}$, as recited in equation 70, where $A_k$, $C_k$ and $D_k$ are assigned initial parameter estimates. The filter equations 47, 48 and 71 to 79 each stipulate the configuration of a respective finite-dimensional filter, which together form a finite-dimensional filter 14 to generate the expectation data.
The expectation data is then used in the EM process to obtain the parameters A, C, B and D of the system, as recited in equations 84 to 87 of Appendix A. The expectation data is obtained by retaining the summations recited in equations 47, 48 and 70 to 79 as a forward pass is made across the observed data, without requiring data to be retained at each discrete time interval for each iteration of the forward pass. Also, no backward iteration is required to obtain the expectations.
The EM process executed using the new filters, as shown in Figure 1, involves the observed data $y_k$ 10 being applied to a Kalman filter 12 and the finite-dimensional recursive filter 14. The Kalman filter 12 processes the observed data $y_k$ 10 to generate conditional mean data and conditional covariance data of the state, $\mu_k$ and $R_k$. This is applied with the observed data $y_k$ to the finite-dimensional recursive filter 14, which generates the conditional expectations in accordance with equations 47 and 48 of Appendix A. The expectations are then passed to an M-step process module 16 to generate estimates of the parameters in accordance with equations 84 to 87. The parameter estimates generated by the M-step process module 16 are also used by the Kalman filter 12 and the finite-dimensional filter 14 for further iterations. Although the finite-dimensional filter 14 normally executes a number of forward iterations, parameter estimates can be obtained using one forward or real time pass of the observed data.
The finite-dimensional filter 14 includes first, second and third coefficient modules 18, 20 and 22, and first, second and third delay units 24, 26 and 28 for the coefficient modules, respectively. The coefficient modules 18 to 22 receive the conditional mean and conditional covariance data of the state, $\mu_k$ and $R_k$, from the Kalman filter 12. The first coefficient module 18 generates and outputs updates of the coefficients $a^{(M)}_{k+1}$ according to equation 71, based on the data received from the Kalman filter 12 and the coefficients $a^{(M)}_k$, $b^{(M)}_k$, $d^{(M)}_k$ outputted from the delay units 24, 26 and 28, respectively. The second coefficient module 20 generates and outputs the coefficients $b^{(M)}_{k+1}$ according to equations 72, 74 and 76, based on the inputs from the Kalman filter, the previous $b^{(M)}_k$ provided by the delay unit 26 and the coefficient $d^{(M)}_k$ outputted by the delay unit 28. The third coefficient module 22 generates and outputs the coefficients $d^{(M)}_{k+1}$ according to equations 73, 75 and 77, on the basis of the data from the Kalman filter 12 and the previous coefficient $d^{(M)}_k$ provided by the delay unit 28. The coefficients $a^{(M)}_{k+1}$, $b^{(M)}_{k+1}$ and $d^{(M)}_{k+1}$ generated by the modules 18 to 22 are provided to a filter covariance and sum module 30, which generates the covariance and state sum expectations according to equations 47 and 48.
The filters of the filter 14 are finite-dimensional because of the following two properties that hold at each time instant:
1. The filtered density of the current time sum of the state $\gamma_k$ is given by an affine function in x times the filtered state density $\alpha_k$, as per equation 44. The filtered state density is a Gaussian in x with mean and covariance given by the Kalman filter equations 24 and 25.
2. The filtered density of the current time sum of the state covariance $\beta^{(M)}_k$ is a quadratic in x times the filtered state estimate $\alpha_k$, as per equation 28.
The filtered density of the state sum is given in terms of four sufficient statistics, namely the two coefficients of the affine function in x and the Kalman mean $\mu_k$ and covariance $R_k$. Similarly, the filtered density of the covariance sum is given by five sufficient statistics.
The above "closure" property also holds for higher order statistics. The filtered density of the current time sum of the p-th order statistic of the state is a p-th order polynomial in x times the filtered state estimate. Thus, in general, finite-dimensional filters can be derived for the time sum of p-th order statistics of the state. For the filtered E-step, only the filters for the first and second orders are used.
The EM process and the filters used can be implemented by an application specific signal processing circuit, but normally would be implemented using one or more microprocessors and a number of iterative software routines to process the observed data or signal. The EM process based on the use of the filters 12 and 14 has significant advantages over the EM process which uses an integral smoother, such as the Kalman smoother. The filter-based EM process provides significant savings in memory requirements, as it eliminates the need for a backward iteration and only summations need to be obtained, therefore removing the requirement for estimates to be maintained for each time interval for successive iterations.
The filter-based process is also twice as fast as the previous smoother-based EM process, as no forward and backward iteration is required. Parameter estimates for the process also converge faster, which is particularly advantageous when processing observed data as it is received. The finite-dimensional filters of the filter 14 are also decoupled, in that one does not depend on the other, and therefore can be implemented in parallel on a multiprocessor system to further enhance the speed of the process, as discussed in Section 9 of Appendix A.
A recursive version of the EM process can be implemented as shown in Figure 3.
Here, at each time instant with each new data point, the Kalman filter 12 and the finite-dimensional filter 14 are used to compute the required statistical data, and the model parameter estimates are updated by an M-step process module 32. The resulting process is recursive in that the parameters are updated by the M-step module 32 at each time instant, rather than at the end of the pass of the available data points as in an off-line case. This real-time joint parameter and state estimation can be used to track time varying parameters in real time multi-sensor signal enhancement, and high resolution localisation of narrowband sources, as discussed hereinafter. Other applications of the filter-based EM process include linear predictive coding of speech and estimation of dynamic shock error models, as discussed hereinafter.
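A minimal caricature of this recursive structure is sketched below for a scalar model; naive filtered sums stand in for the patent's exact finite-dimensional expectation filters of equations 47, 48 and 71 to 79 (which are not reproduced here), so only the per-sample "filter update then M-step update" data flow of Figure 3 is illustrated:

% Caricature (ours) of the recursive EM data flow of Figure 3: one Kalman
% step, one statistics update, one M-step per sample. Naive filtered sums
% replace the patent's exact expectation filters, so this is illustrative only.
a_true = 0.8; sw = 0.5; c = 1; sv = 0.7; T = 2000;
y = zeros(1,T); xk = sw*randn;
for k = 1:T, xk = a_true*xk + sw*randn; y(k) = c*xk + sv*randn; end

a = 0.3; m = 0; P = 1;              % initial parameter and filter state
S10 = 0; S00 = 1e-6;                % running sums (small regulariser)
for k = 1:T
  mp = a*m; Pp = a^2*P + sw^2;      % Kalman filter 12: time update
  K = Pp*c/(c^2*Pp + sv^2);         % measurement update
  mn = mp + K*(y(k) - c*mp); Pn = (1 - K*c)*Pp;
  S10 = S10 + mn*m;                 % role of filter 14: accumulate statistics
  S00 = S00 + m^2 + P;
  a = S10/S00;                      % M-step module 32: update at each instant
  m = mn; P = Pn;
end
disp(a)                             % parameter estimate after one forward pass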
When enhancing a desired signal in the presence of noise, multiple sensor measurements will typically have components from both the signal and the noise sources. Since the systems that couple the signal and noise to the sensors are unknown, one must deal with the more difficult problem of joint signal estimation and system identification.
A frequency domain approach to the two-sensor signal enhancement problem is discussed in M. Feder, A.V. Oppenheim and E. Weinstein, "Maximum Likelihood Noise Cancellation using the EM Algorithm", IEEE Trans. Acoustics Speech and Signal Processing, Vol. 37, No. 2, pp. 204-216, Feb. 1989. In that approach, the desired (speech) signal is modelled as an auto-regressive (AR) Gaussian process, the noise is modelled as a white Gaussian process, and the coupling systems are modelled as linear time-invariant finite impulse response filters. In order to deal with the non-stationarity of the signal, the noise, and the coupling systems, it is suggested in the paper that an algorithm be applied on consecutive time frames using a sliding window. This approach involves two contradicting requirements. The window should be short enough so that the algorithm will respond to non-stationary changes in the signal and noise statistics. However, the window should be long in order to improve the statistical stability of the resulting signal and parameter estimates and in order to obtain a computationally tractable algorithm in which non-causal frequency domain Wiener filtering can be applied.
Using the filter-based EM process described herein, a time domain approach can be adopted, whereby the process jointly estimates the signal, the noise, the coupling systems and the unknown signal and noise spectral parameters. The advantage of this time domain approach is that many of the computational and conceptual difficulties associated with the frequency domain approach are avoided. An example of the time domain approach for an observed signal sk where vk represents white Gaussian noise is described on pages 31 and 32 of Appendix A.
A two sensor speech enhancement system 32, as shown in Figure 4, uses two noisy sensors (microphones) 34 and 36 to receive a speech signal in ambient noise. The noise received at the two sensors 34 and 36 affects the received speech signals with different correlations, as the sensors 34 and 36 are in different locations. The aim is to optimally combine the two sensor outputs to extract optimal estimates of the speech signal (state) and speech and noise parameters. This is achieved by using the filter-based EM process in a module 38 which receives the digitised outputs of the sensors 34 and 36. Source code in Matlab for the module 38 is given in Appendix B. For a simulation using the source code of Appendix B, the signal source was assumed to be an AR(1) process with AR coefficient $a_1 = 0.75$ and $\delta^2_v = 0.01$ for the observed signal $s_k$ of equation 88 of Appendix A. The sensor (observation noise) variances were assumed to be $\delta_\epsilon = 0.0025$. The filter-based EM module 38 was run on 1000 data points and the initial AR parameter estimate was randomly initialised to 0.941. Figure 5 shows the AR parameter estimates versus successive passes; the parameter estimates are very close to the true parameter value. A recursive version of the process can also be used to achieve real time (on-line) speech signal (state) and parameter estimates. This can be used to track time varying signal and noise statistics. The smoother-based EM process requires of the order of $N^2 T$ memory (where N is the order of the AR process and T is the number of data points), whereas the filter-based process requires no storage memory apart from local variables. Also, the various filters are decoupled and hence can be implemented in parallel. A systolic implementation is described in Section 9 of Appendix A.
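The simulation setup can be sketched as follows. Since equation 88 of Appendix A is not reproduced here, the coupling of the speech state to the two sensors is assumed to be direct; only the parameter values stated above are taken from the text:

% Assumed two-sensor setup (ours): AR(1) speech state observed by two
% microphones with independent noise; parameter values as stated above.
a1 = 0.75; var_v = 0.01; var_e = 0.0025; T = 1000;
s = zeros(1,T);
for k = 2:T, s(k) = a1*s(k-1) + sqrt(var_v)*randn; end  % AR(1) speech signal
y1 = s + sqrt(var_e)*randn(1,T);   % sensor 34 output
y2 = s + sqrt(var_e)*randn(1,T);   % sensor 36 output
Y = [y1; y2];                      % stacked observations for module 38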
For high resolution localisation of narrowband sources, the use of the EM process is described in I. Ziskind and D. Hertz, "Maximum Likelihood Localization of Narrow-Band Autoregressive Sources via the EM Algorithm", IEEE Trans. Signal Processing, Vol. 41, No. 8, pp. 2719-2723, August 1993. The filter-based EM process described above can be used on a Gaussian state space model to estimate the parameters of model signal sources and their direction of arrival. In our notation, the multi-sensor multi-source signal model in Ziskind and Hertz can be expressed as:
$$y_k = C(\theta^{(2)}) x_k + D v_k$$
The state equation relates to the narrowband sources. In particular, $\theta^{(1)}$ are parameters that determine the bandwidth of the sources. In the observation equation, $\theta^{(2)}$ denotes the angles of arrival to the multi-sensor array. $C(\theta^{(2)})$ is called the steering matrix of the array towards direction $\theta^{(2)}$. A Kalman smoother is used to obtain maximum likelihood estimates of the parameters $\theta^{(1)}$, $\theta^{(2)}$ and noise variances B and D. The filter-based EM process can be used instead of the Kalman smoother by using a filter-based EM process module 40 to receive the signal data observed by a sensor array 42 in response to the signals generated by a plurality of narrowband sources 44, as shown in Figure 6. The module 40 is able to provide estimates of direction of arrival $\theta^{(2)}$, bandwidth $\theta^{(1)}$ and source signal $x_{k+1}$ for each narrowband source 44.
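The form of $C(\theta^{(2)})$ depends on the array geometry, which the passage above does not fix. For a uniform linear array with half-wavelength spacing, a standard narrowband steering matrix (our assumption, using the complex baseband convention) is:

% Standard ULA steering matrix (assumed geometry, not specified in the text).
p = 8;                                   % number of sensors in array 42
spacing = 0.5;                           % element spacing in wavelengths
theta2 = [-20 10 45]*pi/180;             % assumed angles of arrival (radians)
n = (0:p-1)';                            % element index
C = exp(-2i*pi*spacing*n*sin(theta2));   % p x (number of sources)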
The filter-based EM process can be used, for example, in a chemical process plant application. A plant 50 may include a three-tank level and temperature system, as shown in Figure 7, which is versatile and representative of typical chemical process plants. The plant 50 has two cylindrical glass tanks 52 and 54 and one conical glass tank 56, each having an internal diameter of 16 cm and a height of 50 cm. The system is equipped with several transducers: three differential pressure cell transducers 58, 60 and 62 for measuring the flow rates of steam and water; and five thermocouples 64, 66, 68, 70 and 72 for measuring the temperature of the water at various points in the system. In addition there are three control valves 74, 76 and 78 for manipulating the flow rates of water and steam into the tanks 52, 54 and 56, and three tank level sensors 80, 82 and 84 for the tanks 52, 54 and 56. Further details of the plant are given in O.O. Badmus, D. Grant Fisher and Sirish L. Shah, "Real-time, Sensor-based Computing in the Laboratory", Chemical Engineering Education, Fall 1996, pages 280 to 289. The process equipment of the plant 50 can readily be configured to demonstrate the performance of different process systems and control schemes. Only a subsystem of the plant 50, being the bold portion of Figure 7 for the first tank 52, is utilised for the description below. To generate data suitable for identification of a dynamic process model, a uniformly distributed random input signal of maximum amplitude 1 (0 < M < 1) was generated and applied every 5 seconds as a perturbation of the nominal value of the position of the cold water valve 74 (set at 21% open) with an amplification gain of 10. The input signal is thus bounded by the interval 16% to 26% open. As described in the O.O. Badmus et al. article, a noisy Gaussian AR series adequately describes the process dynamics when the input signal is assumed to be unknown and modelled as Gaussian white noise. Figure 8 shows a 2000 point output data segment obtained from the level sensor 80 for the first tank 52 when the input signal is white Gaussian noise with variance 8.2687. The output series was detrended by subtracting the mean, resulting in a zero-mean time series. Assuming an AR(2) model for the subsystem, the filter-based EM process was executed on the zero-mean data. Initial AR parameter estimates were arbitrarily chosen as 0.4 and 0.0. The parameter estimates obtained from the filter-based EM process are plotted in Figure 9. The estimates converge after about 10 iterations; after 20 iterations the estimates are 0.5970 and 0.4027. The parameter estimates, after only 6 iterations, are very close to the final values which stabilise after 10 iterations. This rapid convergence, combined with the fact that no memory storage of previous values is required, demonstrates the advantages of the filter-based EM process.
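For reference, an AR(2) series fits the state-space form of equations 1 and 2 via the usual companion-form construction. The sketch below is our own; it reuses the converged estimates quoted above, the noise level is an assumption, and the singular B it produces is the case covered by the controllability condition of Appendix A (Lemma 7.2):

% Companion-form state space for the AR(2) subsystem model (our sketch).
a1 = 0.5970; a2 = 0.4027;     % converged AR estimates from Figure 9
sw = sqrt(8.2687);            % assumed input noise standard deviation
A = [a1 a2; 1 0];             % state x_k = [z_k; z_{k-1}]
B = [sw 0; 0 0];              % noise drives the first component only (singular)
C = [1 0];                    % the level sensor observes z_k
D = 0.05;                     % small assumed measurement noise, non-singular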
In the linear predictive coding of speech, unvoiced speech is modelled as an auto-regressive process driven by white Gaussian noise. The auto-regressive parameters correspond to those of the vocal tract resonant cavities and change with the flow of speech according to movement of the mouth, tongue and chest. These changes are relatively slow compared to the speech signal variations. The speech signal itself is embedded in white Gaussian noise. The aim is to obtain ML estimates of the AR parameters, which can be done using the filter-based EM process. The filter-based EM process can also be used as a substitute for the Kalman smoother based EM algorithm described in D. Ghosh, "Maximum Likelihood Estimation of the Dynamic Shock-Error Model", Journal of Econometrics, Vol. 41, No. 1, pp. 121-143, May 1989. Here the EM process is used to attain ML estimates of dynamic shock error model parameters for economies, such as the shadow economy in the United States.
The filter-based EM process can also be used to determine the parameters for a system to predict the spot price of oil, as described in detail in accompanying Appendix C.
The filter-based EM process has been described above in relation to discrete time systems but it can also be applied to continuous time linear Gaussian systems.
Various discretisations of the filters (e.g. first order, second order or robust discretisation; refer to A. Bensoussan, "Stochastic Control of Partially Observable Systems", Cambridge University Press, Cambridge, 1992) result in different discrete time filters that can be applied to the above examples.
Many modifications will be apparent to those skilled in the art without departing from the scope of the present invention as hereinbefore described with reference to the accompanying drawings.
Appendix A
There are very few estimation problems for which finite-dimensional optimal filters exist, i.e., filters given in terms of finite-dimensional sufficient statistics. Indeed, the only two cases that are widely used are the Kalman filter for linear Gaussian models and the Wonham filter (Hidden Markov Model filter) for finite state Markov chains in white noise.
In this paper we derive new finite-dimensional filters for linear Gaussian state space models in discrete time. Indeed, the Kalman filter is merely a special case of our general filter. Our filters compute all the statistics required to obtain maximum likelihood (ML) estimates of the model parameters via the Expectation Maximization (EM) algorithm.
Maximum likelihood parameter estimation of linear Gaussian models and other related time-series models using the EM algorithm was studied in the 1980s in [1] and [2], and more recently in the electrical engineering literature in [3], [4]. The EM algorithm is a general iterative numerical algorithm for obtaining ML estimates. Each iteration consists of two steps: the Expectation (E-step) and the Maximization (M-step). The E-step for linear Gaussian models involves computing the following two conditional expectations based on all the observations: (i) the sum over time of the state; (ii) the sum over time of the state covariance.
In all the existing literature on parameter estimation of linear Gaussian models via the EM algorithm, the E-step is non-causal, involving fixed-interval smoothing via a Kalman smoother (i.e. a forward pass and a backward pass).
In this paper we derive a filter-based EM algorithm for linear Gaussian models. That is, the E-step is implemented using filters (i.e. only a forward pass) rather than smoothers. The main contribution of this paper is to show that these filters are finite-dimensional. Few finite-dimensional filters are known, so the result is remarkable. It was certainly a surprise to the authors.
The filtered E-step has the following advantages.
1. The memory costs are significantly reduced compared to the standard (smoother-based) EM algorithm.
2. Our filters are decoupled and hence easy to implement in parallel on a multiprocessor system.
3. Our algorithm is at least twice as fast as the standard smoother-based EM algorithm because no forward-backward scheme is required.

Filter-based EM algorithms have recently been proposed for Hidden Markov Models. These HMM filters are finite dimensional because of the idempotent property of the state indicator function of a finite state Markov chain. In linear Gaussian models, unlike the HMM case, the state indicator vector is no longer idempotent. Instead, our filters are finite dimensional because of the following two remarkable algebraic properties that hold at each time instant:
1. The filtered density of the current time sum of the state is given by an affine function in x times the filtered state density. The filtered state density is a Gaussian in x with mean and variance given by the Kalman filter equations.
2. The filtered density of the current time sum of the state covariance is a quadratic in x times the filtered state estimate.
So the filtered density of the state sum is given in terms of four sufficient statistics, namely the two coefficients of the affine function in x and the Kalman mean and covariance. Similarly, the filtered density of the covariance sum is given by five sufficient statistics.
Actually, this remarkable algebraic "closure" property holds for higher order statistics as well. We prove that the filtered density of the current time sum of the p-th order statistic of the state is a p-th order polynomial in x times the filtered state estimate. So in general we can derive finite-dimensional filters for the time sum of p-th order statistics of the state. Of course, for the filtered E-step we only use the filters for the first and second order statistics. Also, for p = 0 our filters reduce to the Kalman filter.
Applications. Our filter-based EM algorithm for linear Gaussian models can be applied in all the applications where the standard EM algorithm has been applied. In particular these include:
• Multi-sensor signal enhancement algorithms for estimation of speech signals in room acoustic environments [3].
  • High Resolution Localization of Narrowband Sources using multiple sensors and Direction of Arrival Estimation [6].
  • Linear Predictive coding of speech (see Chapter 6 of [5]).
  • Forecasting and prediction of the "shadow economy" in market cycles using linear errors-in-variables models [2].
In all these applications the advantages of our filter-based EM algorithm can be exploited.
This paper is organized as follows. In Sec. 2 we present our signal model. In Sec. 3 we propose a measure change which facilitates easy derivation of the filters. In Sec. 4, recursions are derived for the filtered densities of the variables of interest. In Sec. 5 we derive our finite dimensional filters. In Sec. 6 a general finite dimensional filter is proposed. Sec. 7 re-expresses the filters to allow for singular state noise as long as a certain controllability condition is satisfied. In Sec. 8, the filters derived in Sec. 4 and 6 are used in a filter-based EM algorithm to obtain maximum likelihood parameter estimates. In Sec. 9 we evaluate the computational complexity of the filters and propose a parallel implementation. Finally, conclusions are presented in Sec. 10.
2 Signal Model
To derive our filters with maximum generality, we consider a multi-input multi-output state space model with time-varying parameters and noise variances, as follows.
All processes are defined initially on the probability space (Ω, F, P̄). We shall consider the classical linear Gaussian model for the signal and observation processes. That is, for k = 0, 1, ..., the signal is given by the dynamics

x_{k+1} = A_{k+1} x_k + B_{k+1} w_{k+1}  (1)

Here x_k ∈ R^m, and x_0 ∈ R^m is a Gaussian random variable with zero mean and covariance matrix B_0² ∈ R^{m×m}. At time k+1, k = 0, 1, ..., the noise in (1) is modelled by an independent Gaussian random variable with zero mean and covariance matrix B²_{k+1}. It is known [7] that such a Gaussian random variable can be represented as B_{k+1} w_{k+1}, where w_{k+1} is an m-vector of N(0,1) random variables.
Assumption 2.1 For the time being, we assume that the matrices B_k ∈ R^{m×m}, k = 0, 1, ..., are nonsingular and symmetric. The symmetry follows from the construction in [7]. The case when B_k is singular is discussed in Section 7.
For k = 0, 1, ..., we assume that the state process x_k is observed indirectly via the vector observations y_k, where

y_k = C_k x_k + D_k v_k  (2)

Here for each k, y_k ∈ R^d and v_k is a vector of independent N(0,1) random variables. We assume that D_k is a nonsingular d × d matrix.
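To make the model concrete, the following minimal MATLAB sketch simulates the dynamics (1) and (2) for a small time-invariant system; the dimensions and matrix values are illustrative assumptions only, not part of the model specification above.

% Minimal sketch: simulate x_{k+1} = A x_k + B w_{k+1}, y_k = C x_k + D v_k
% (illustrative time-invariant parameters; w_k ~ N(0,I_m), v_k ~ N(0,I_d))
m = 2; d = 1; T = 200;
A = [0.9 0.1; 0 0.8];
B = 0.1*eye(m);                 % nonsingular and symmetric (Assumption 2.1)
C = [1 0];
D = 0.05;                       % nonsingular d x d
x = zeros(m,T); y = zeros(d,T);
x(:,1) = B*randn(m,1);          % x_0 ~ N(0, B^2)
y(:,1) = C*x(:,1) + D*randn(d,1);
for k = 2:T
  x(:,k) = A*x(:,k-1) + B*randn(m,1);
  y(:,k) = C*x(:,k) + D*randn(d,1);
end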
Remark: We assume B_k to be a covariance matrix and hence symmetric for notational convenience. The results in this paper also hold for non-symmetric B_k: simply replace B_k² by B_kB_k' and B_k^{-2} by (B_kB_k')^{-1} below.

3 Measure Change Construction and Dynamics
We shall adapt the techniques in [10] and show how the dynamics (1) and (2) can be modelled starting with an initial reference probability measure P̄.
Suppose on a probability space (Ω, F, P̄) we are given two sequences of independent, identically distributed random variables x_k ∈ R^m, y_k ∈ R^d. Under the probability measure P̄, the x_k are a sequence of independent m-dimensional N(0, I_m) random variables, and the y_k are a sequence of independent d-dimensional N(0, I_d) random variables. Here I_m (resp. I_d) represents the m × m (resp. d × d) identity matrix.
For x ∈ R^m and y ∈ R^d write

ψ(x) = (2π)^{-m/2} exp(−x'x/2), φ(y) = (2π)^{-d/2} exp(−y'y/2)  (3)
Define the sigma-fields

G_k = σ{x_0, x_1, ..., x_k, y_0, y_1, ..., y_k}, Y_k = σ{y_0, y_1, ..., y_k}  (4)

Thus {G_k} is the complete filtration generated by the x and y sequences and {Y_k} is the complete filtration generated by the observations.
For any matrix B let |B| denote its determinant. Write

λ_0 = φ(D_0^{-1}(y_0 − C_0x_0)) / (|D_0| φ(y_0))

and for l ≥ 1

λ_l = [ψ(B_l^{-1}(x_l − A_lx_{l-1})) φ(D_l^{-1}(y_l − C_lx_l))] / [|B_l| ψ(x_l) |D_l| φ(y_l)]
For k ≥ 0 set

Λ_k = ∏_{l=0}^k λ_l  (5)

A new probability measure P can be defined on (Ω, ⋁_k G_k) by setting the restriction to G_k of the Radon-Nikodym derivative of P with respect to P̄ as

dP/dP̄ |_{G_k} = Λ_k  (6)

Definition 3.1 For l = 0, 1, ..., define

v_l = D_l^{-1}(y_l − C_lx_l)

For l = 1, 2, ..., define

w_l = B_l^{-1}(x_l − A_lx_{l-1})
Lemma 3.2 Under the measure P, v_l and w_l are sequences of independent N(0, I_d) and N(0, I_m) random variables, respectively.
PROOF. Suppose f : R^d → R and g : R^m → R are arbitrary measurable "test" functions. Then with E (resp. Ē) denoting expectation under P (resp. P̄),

E{g(w_k) f(v_k) | G_{k-1}} = Ē{Λ_k g(w_k) f(v_k) | G_{k-1}} / Ē{Λ_k | G_{k-1}}  (7)

using a version of Bayes' theorem [10]. Now Λ_{k-1} is G_{k-1}-measurable, therefore it cancels from numerator and denominator, leaving

E{g(w_k) f(v_k) | G_{k-1}} = Ē{λ_k g(w_k) f(v_k) | G_{k-1}} / Ē{λ_k | G_{k-1}}  (8)

Notice that the inner conditional expectation over x_k integrates the Gaussian density to one. Hence

Ē{λ_k | G_{k-1}} = (1/|B_k|) ∫_{R^m} ψ(B_k^{-1}(x_k − A_kx_{k-1})) dx_k = 1

Consequently,

E{g(w_k) f(v_k) | G_{k-1}} = Ē{λ_k g(B_k^{-1}(x_k − A_kx_{k-1})) f(D_k^{-1}(y_k − C_kx_k)) | G_{k-1}} = ∫_{R^m} ∫_{R^d} g(w) ψ(w) f(v) φ(v) dv dw  (9)

so that under P, w_k and v_k are independent N(0, I_m) and N(0, I_d) sequences. □
4 Recursive Estimates

For i, j ∈ {1, ..., m}, n ∈ {1, ..., d} and M = 0, 1, 2, define the time sums

H_k^{ij(0)} = Σ_{l=0}^k ⟨x_l, e_i⟩⟨x_l, e_j⟩, H_k^{ij(1)} = Σ_{l=1}^k ⟨x_l, e_i⟩⟨x_{l-1}, e_j⟩, H_k^{ij(2)} = Σ_{l=1}^k ⟨x_{l-1}, e_i⟩⟨x_{l-1}, e_j⟩, J_k^{in} = Σ_{l=0}^k ⟨x_l, e_i⟩⟨y_l, e_n⟩  (10)

where e_i denotes the unit vector with 1 in the i-th position. Define the un-normalized densities

β_k^{ij(M)}(x) dx = Ē{Λ_k H_k^{ij(M)} I(x_k ∈ dx) | Y_k}, γ_k^{in}(x) dx = Ē{Λ_k J_k^{in} I(x_k ∈ dx) | Y_k}, α_k(x) dx = Ē{Λ_k I(x_k ∈ dx) | Y_k}  (11)

Then for any measurable "test" function g,

Ē{Λ_k H_k^{ij(M)} g(x_k) | Y_k} = ∫_{R^m} β_k^{ij(M)}(x) g(x) dx, M = 0, 1, 2  (12)
The following theorem gives recursive expressions for the β_k^{ij(M)}, γ_k^{in} and α_k. The recursions are derived under the measure P̄, where {x_l} and {y_l}, l ∈ Z_+, are independent sequences of random variables.
Theorem 4.1 For k ∈ Z_+, the un-normalized densities β_k^{ij(M)}, M = 0, 1, 2, γ_k^{in} and α_k defined in (11) are given by the following recursions:

β_k^{ij(0)}(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] [ ∫_{R^m} β_{k-1}^{ij(0)}(z) ψ(B_k^{-1}(x − A_kz)) dz + ⟨x, e_i⟩⟨x, e_j⟩ ∫_{R^m} α_{k-1}(z) ψ(B_k^{-1}(x − A_kz)) dz ]  (13)

β_k^{ij(1)}(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] [ ∫_{R^m} β_{k-1}^{ij(1)}(z) ψ(B_k^{-1}(x − A_kz)) dz + ⟨x, e_i⟩ ∫_{R^m} ⟨z, e_j⟩ α_{k-1}(z) ψ(B_k^{-1}(x − A_kz)) dz ]  (14)

β_k^{ij(2)}(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] [ ∫_{R^m} β_{k-1}^{ij(2)}(z) ψ(B_k^{-1}(x − A_kz)) dz + ∫_{R^m} ⟨z, e_i⟩⟨z, e_j⟩ α_{k-1}(z) ψ(B_k^{-1}(x − A_kz)) dz ]  (15)

γ_k^{in}(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] [ ∫_{R^m} γ_{k-1}^{in}(z) ψ(B_k^{-1}(x − A_kz)) dz + ⟨x, e_i⟩⟨y_k, e_n⟩ ∫_{R^m} α_{k-1}(z) ψ(B_k^{-1}(x − A_kz)) dz ]  (16)

α_k(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] ∫_{R^m} α_{k-1}(z) ψ(B_k^{-1}(x − A_kz)) dz  (17)

PROOF. We prove (13). The proofs of (14), (15), (16) and (17) are very similar and hence omitted.
Since H_k^{ij(0)} = H_{k-1}^{ij(0)} + ⟨x_k, e_i⟩⟨x_k, e_j⟩, using (5) we have

Ē{Λ_k H_k^{ij(0)} g(x_k) | Y_k} = Ē{ Λ_{k-1} (H_{k-1}^{ij(0)} + ⟨x_k, e_i⟩⟨x_k, e_j⟩) [φ(D_k^{-1}(y_k − C_kx_k)) / (|D_k|φ(y_k))] [ψ(B_k^{-1}(x_k − A_kx_{k-1})) / (|B_k|ψ(x_k))] g(x_k) | Y_k }

= [1/(|B_k||D_k|φ(y_k))] [ ∫∫ β_{k-1}^{ij(0)}(z) φ(D_k^{-1}(y_k − C_kx)) ψ(B_k^{-1}(x − A_kz)) g(x) dx dz + ∫∫ α_{k-1}(z) φ(D_k^{-1}(y_k − C_kx)) ψ(B_k^{-1}(x − A_kz)) ⟨x, e_i⟩⟨x, e_j⟩ g(x) dx dz ]  (18)

where the second equality follows from the independence of the x_k's and y_k's under P̄. Since g is an arbitrary Borel test function, equating the RHS of (12) with (18) proves (13).
□
Remarks:
1. By virtue of (17), we can rewrite (13) and (16) as:

β_k^{ij(0)}(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] ∫_{R^m} β_{k-1}^{ij(0)}(z) ψ(B_k^{-1}(x − A_kz)) dz + ⟨x, e_i⟩⟨x, e_j⟩ α_k(x)  (19)

γ_k^{in}(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] ∫_{R^m} γ_{k-1}^{in}(z) ψ(B_k^{-1}(x − A_kz)) dz + ⟨x, e_i⟩⟨y_k, e_n⟩ α_k(x)  (20)

2. The above theorem does not require v_l and w_l to be Gaussian. The recursions (17), (13) and (16) hold for arbitrary densities ψ and φ, as long as φ is strictly positive. We use the Gaussian assumption to derive the finite-dimensional filters in Sec. 5.
3. Initial conditions: Note that at k = 0, we have for any arbitrary Borel test function g(x)

Ē{Λ_0 g(x_0) | Y_0} = ∫_{R^m} [φ(D_0^{-1}(y_0 − C_0x)) / (|D_0| φ(y_0))] ψ(x) g(x) dx  (21)

Equating (11) and (21) yields

α_0(x) = [φ(D_0^{-1}(y_0 − C_0x)) / (|D_0| φ(y_0))] ψ(x)  (22)

Similarly, the initial conditions for β_0^{ij(M)}, M = 0, 1, 2, and γ_0^{in} are

β_0^{ij(0)}(x) = ⟨x, e_i⟩⟨x, e_j⟩ α_0(x), β_0^{ij(1)}(x) = 0, β_0^{ij(2)}(x) = 0, γ_0^{in}(x) = ⟨x, e_i⟩⟨y_0, e_n⟩ α_0(x)  (23)

5 Finite Dimensional Filters
In this section finite-dimensional filters are derived for H_k^{ij(M)}, M = 0, 1, 2, and J_k^{in} defined in (10). In particular, we characterize the densities β_k^{ij(M)} and γ_k^{in} in terms of a finite number of sufficient statistics. Then recursions are derived for these statistics.
Define the conditional mean and covariance matrix of x_k, respectively, as μ_k = E{x_k | Y_k} and R_k = E{(x_k − μ_k)(x_k − μ_k)' | Y_k}.
The linearity of (1) and (2) implies that α_k(x) is an un-normalized normal density with mean and variance given by the standard Kalman filter equations.

Theorem 5.1 (Kalman Filter) For k = 0, 1, ..., α_k is an un-normalized Gaussian density with mean μ_k and variance R_k given via the standard Kalman filter equations

μ_k = R_k B_k^{-2} A_k σ_k^{-1} R_{k-1}^{-1} μ_{k-1} + R_k C_k' (D_k D_k')^{-1} y_k  (24)

R_k = [ (A_k R_{k-1} A_k' + B_k²)^{-1} + C_k' (D_k D_k')^{-1} C_k ]^{-1}  (25)

where μ_k ∈ R^m, R_k ∈ R^{m×m} and

σ_k = A_k' B_k^{-2} A_k + R_{k-1}^{-1}  (26)

Here μ_k = E{x_k | Y_k} and R_k = E{(x_k − μ_k)(x_k − μ_k)' | Y_k}.

PROOF. See [9]. □

Remark: Consequently, α_k has the form

α_k(x) = ᾱ_k (2π)^{-m/2} |R_k|^{-1/2} exp(−(x − μ_k)' R_k^{-1} (x − μ_k)/2)  (27)

where ᾱ_k = ∫_{R^m} α_k(x) dx.
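As a sanity check on the information-form equations (24)-(26), the following MATLAB fragment (the test matrices are arbitrary assumptions) verifies for one time step that they coincide with the familiar predictor-corrector form of the Kalman filter used later in (70):

% One-step comparison of (24)-(26) with the standard Kalman update
m = 2; d = 1;
A = [0.9 0.1; 0 0.8]; B = 0.1*eye(m); C = [1 0]; D = 0.05;
Rprev = 0.2*eye(m); muprev = [0.3; -0.1]; yk = 0.25;
sig = A'*inv(B^2)*A + inv(Rprev);                            % (26)
Rk  = inv(inv(A*Rprev*A' + B^2) + C'*inv(D*D')*C);           % (25)
muk = Rk*(inv(B^2)*A*inv(sig)*inv(Rprev)*muprev + C'*inv(D*D')*yk);  % (24)
Rpred = B^2 + A*Rprev*A';
K   = Rpred*C'*inv(C*Rpred*C' + D*D');
mu2 = A*muprev + K*(yk - C*A*muprev);
R2  = Rpred - K*C*Rpred;
disp([norm(muk - mu2), norm(Rk - R2)])   % both ~ 0 up to rounding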
Due to the presence of the quadratic term ⟨x, e_i⟩⟨x, e_j⟩, the density β_k^{ij(0)} in (19) is not Gaussian. Is it possible to characterize the density β_k^{ij(0)} in terms of a finite number of sufficient statistics? Amazingly enough, the answer is "yes". As we prove below, it is possible to express β_k^{ij(0)}(x) as a quadratic expression in x multiplied by α_k(x) for all k. The important conclusion then is that by updating the coefficients of the quadratic expression, together with the Kalman filter above, we have finite-dimensional filters for computing H_k^{ij(0)}. A similar result also holds for β_k^{ij(1)}, β_k^{ij(2)} and γ_k^{in}.
Theorems 5.2 and 5.3 that follow derive finite-dimensional sufficient statistics for the densities β_k^{ij(M)}, M = 0, 1, 2, and γ_k^{in}.

Theorem 5.2 At time k, the density β_k^{ij(M)}(x) (initialized according to (23)) is completely defined by the 5 statistics a_k^{ij(M)}, b_k^{ij(M)}, d_k^{ij(M)}, R_k and μ_k as follows:

β_k^{ij(M)}(x) = [a_k^{ij(M)} + b_k^{ij(M)'} x + x' d_k^{ij(M)} x] α_k(x)  (28)

where a_k^{ij(M)} ∈ R, b_k^{ij(M)} ∈ R^m and d_k^{ij(M)} ∈ R^{m×m} is a symmetric matrix with elements d_k^{ij(M)}(p,q), p = 1, ..., m, q = 1, ..., m.
Furthermore, a_k^{ij(M)}, b_k^{ij(M)} and d_k^{ij(M)} are given by the following recursions:

a_{k+1}^{ij(M)} = a_k^{ij(M)} + b_k^{ij(M)'} σ_{k+1}^{-1} R_k^{-1} μ_k + Σ_{p=1}^m Σ_{q=1}^m d_k^{ij(M)}(p,q) σ_{k+1}^{-1}(p,q) + μ_k' R_k^{-1} σ_{k+1}^{-1} d_k^{ij(M)} σ_{k+1}^{-1} R_k^{-1} μ_k, a_0^{ij(M)} = 0  (29)

b_{k+1}^{ij(0)} = B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} (b_k^{ij(0)} + 2 d_k^{ij(0)} σ_{k+1}^{-1} R_k^{-1} μ_k), b_0^{ij(0)} = 0  (30)

d_{k+1}^{ij(0)} = B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} d_k^{ij(0)} σ_{k+1}^{-1} A_{k+1}' B_{k+1}^{-2} + (e_i e_j' + e_j e_i')/2, d_0^{ij(0)} = (e_i e_j' + e_j e_i')/2  (31)

b_{k+1}^{ij(1)} = B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} (b_k^{ij(1)} + 2 d_k^{ij(1)} σ_{k+1}^{-1} R_k^{-1} μ_k) + e_i e_j' σ_{k+1}^{-1} R_k^{-1} μ_k, b_0^{ij(1)} = 0  (32)

d_{k+1}^{ij(1)} = B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} d_k^{ij(1)} σ_{k+1}^{-1} A_{k+1}' B_{k+1}^{-2} + (e_i e_j' σ_{k+1}^{-1} A_{k+1}' B_{k+1}^{-2} + B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} e_j e_i')/2, d_0^{ij(1)} = 0  (33)

b_{k+1}^{ij(2)} = B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} (b_k^{ij(2)} + (2 d_k^{ij(2)} + e_i e_j' + e_j e_i') σ_{k+1}^{-1} R_k^{-1} μ_k), b_0^{ij(2)} = 0  (34)

d_{k+1}^{ij(2)} = B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} (d_k^{ij(2)} + (e_i e_j' + e_j e_i')/2) σ_{k+1}^{-1} A_{k+1}' B_{k+1}^{-2}, d_0^{ij(2)} = 0  (35)

where σ_k is defined in (26) and μ_k, R_k are obtained from the Kalman filter (24) and (25). (For M = 2, the right-hand side of (29) contains the additional terms e_i' σ_{k+1}^{-1} e_j + (e_i' σ_{k+1}^{-1} R_k^{-1} μ_k)(e_j' σ_{k+1}^{-1} R_k^{-1} μ_k), which arise from the new term in (15).)
PROOF. We only prove the theorem for M = 0; the proofs for M = 1, 2 are very similar and hence omitted.
We prove (28) by induction. From (23), at time k = 0, β_0^{ij(0)}(x) is of the form (28) with a_0^{ij(0)} = 0, b_0^{ij(0)} = 0 and d_0^{ij(0)} = (e_i e_j' + e_j e_i')/2.
For convenience we drop the superscripts ij(0) in a_k^{ij(0)}, b_k^{ij(0)}, d_k^{ij(0)} and β_k^{ij(0)}. Assume that (28) holds at time k. Then at time k+1, using (28) and the recursion (13), we have

β_{k+1}(x) = [φ(D_{k+1}^{-1}(y_{k+1} − C_{k+1}x)) / (|B_{k+1}||D_{k+1}|φ(y_{k+1}))] [ ∫_{R^m} ψ(B_{k+1}^{-1}(x − A_{k+1}z)) (a_k + b_k'z + z' d_k z) α_k(z) dz + ⟨x, e_i⟩⟨x, e_j⟩ ∫_{R^m} ψ(B_{k+1}^{-1}(x − A_{k+1}z)) α_k(z) dz ]  (36)
Let us concentrate on the first term on the RHS, which we shall denote I_1. Substituting the Gaussian form (27) for α_k(z) and collecting the exponentials gives

I_1 = K_1(x) ∫_{R^m} exp(−(z' σ_{k+1} z − 2 δ_{k+1}' z)/2) (a_k + b_k'z + z' d_k z) dz  (37)

where σ_{k+1} is defined in (26), δ_{k+1} = R_k^{-1} μ_k + A_{k+1}' B_{k+1}^{-2} x, and K_1(x) collects all factors that do not depend on z:

K_1(x) = [ᾱ_k (2π)^{-m/2} |R_k|^{-1/2} φ(D_{k+1}^{-1}(y_{k+1} − C_{k+1}x)) / (|B_{k+1}||D_{k+1}|φ(y_{k+1}))] exp(−(x' B_{k+1}^{-2} x + μ_k' R_k^{-1} μ_k)/2)  (38)

Completing the "square" in the exponential term in (37) yields

I_1 = K_1(x) exp(δ_{k+1}' σ_{k+1}^{-1} δ_{k+1}/2) ∫_{R^m} exp(−(z − σ_{k+1}^{-1}δ_{k+1})' σ_{k+1} (z − σ_{k+1}^{-1}δ_{k+1})/2) (a_k + b_k'z + z' d_k z) dz  (39)

Now consider the integral in (39):

∫_{R^m} (a_k + b_k'z + z' d_k z) exp(−(z − σ_{k+1}^{-1}δ_{k+1})' σ_{k+1} (z − σ_{k+1}^{-1}δ_{k+1})/2) dz = (2π)^{m/2} |σ_{k+1}|^{-1/2} (a_k + b_k' E{z} + E{z' d_k z})  (40)

since the exp(·) term is an un-normalized Gaussian density in z with normalization constant (2π)^{m/2} |σ_{k+1}|^{-1/2}. So

E{z} = σ_{k+1}^{-1} δ_{k+1}  (41)

E{z' d_k z} = E{(z − E{z})' d_k (z − E{z})} + E{z}' d_k E{z} = Σ_{p,q} d_k(p,q) σ_{k+1}^{-1}(p,q) + δ_{k+1}' σ_{k+1}^{-1} d_k σ_{k+1}^{-1} δ_{k+1}  (42)

Therefore from (39), (40), (41), (42) and (36) we have

β_{k+1}(x) = α_{k+1}(x) [ a_k + b_k' σ_{k+1}^{-1} δ_{k+1} + Σ_{p,q} d_k(p,q) σ_{k+1}^{-1}(p,q) + δ_{k+1}' σ_{k+1}^{-1} d_k σ_{k+1}^{-1} δ_{k+1} + ⟨x, e_i⟩⟨x, e_j⟩ ]  (43)

Substituting for δ_{k+1} (which is affine in x) in (43) we obtain

β_{k+1}(x) = (a_{k+1} + b_{k+1}'x + x' d_{k+1} x) α_{k+1}(x)

where a_{k+1}, b_{k+1} and d_{k+1} are given by (29), (30) and (31). □
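A minimal MATLAB sketch of one step of the M = 0 recursions (29)-(31) for a fixed pair (i,j); the matrices and the current Kalman quantities are illustrative assumptions, and B is taken nonsingular as in Assumption 2.1:

% One step of (29)-(31) for fixed (i,j)
m = 2; i = 1; j = 2;
A = [0.9 0.1; 0 0.8]; B = 0.1*eye(m);
R = 0.2*eye(m); mu = [0.3; -0.1];       % current Kalman covariance and mean
I = eye(m);
a = 0; b = zeros(m,1);
d = (I(:,i)*I(:,j)' + I(:,j)*I(:,i)')/2;          % initial condition from (23)
siginv = inv(A'*inv(B^2)*A + inv(R));             % sigma_{k+1}^{-1}, from (26)
u = siginv*inv(R)*mu;                             % recurring vector
anew = a + b'*u + sum(sum(d.*siginv)) + u'*d*u;               % (29)
bnew = inv(B^2)*A*siginv*(b + 2*d*u);                         % (30)
dnew = inv(B^2)*A*siginv*d*siginv*A'*inv(B^2) ...
       + (I(:,i)*I(:,j)' + I(:,j)*I(:,i)')/2;                 % (31)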
The proof of the following theorem is very similar and hence omitted.

Theorem 5.3 The density γ_k^{in}(x) is completely determined by the 4 statistics â_k^{in} ∈ R, b̂_k^{in} ∈ R^m, μ_k ∈ R^m and R_k ∈ R^{m×m} as follows:

γ_k^{in}(x) = [â_k^{in} + b̂_k^{in'} x] α_k(x), γ_0^{in}(x) = ⟨x, e_i⟩⟨y_0, e_n⟩ α_0(x)  (44)

where â_k^{in} and b̂_k^{in} are given by the following recursions:

â_{k+1}^{in} = â_k^{in} + b̂_k^{in'} σ_{k+1}^{-1} R_k^{-1} μ_k, â_0^{in} = 0  (45)

b̂_{k+1}^{in} = B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} b̂_k^{in} + e_i ⟨y_{k+1}, e_n⟩, b̂_0^{in} = e_i ⟨y_0, e_n⟩  (46)
Having characterized the densities β_k^{ij(M)}(x), M = 0, 1, 2, and γ_k^{in}(x) by their finite sufficient statistics, we now derive finite-dimensional filters for H_k^{ij(M)} and J_k^{in}.
Theorem 5.4 Finite-dimensional filters for H_k^{ij(M)}, M = 0, 1, 2, and J_k^{in} are given by

E{H_k^{ij(M)} | Y_k} = a_k^{ij(M)} + b_k^{ij(M)'} μ_k + Σ_{p,q} d_k^{ij(M)}(p,q) R_k(p,q) + μ_k' d_k^{ij(M)} μ_k  (47)

E{J_k^{in} | Y_k} = â_k^{in} + b̂_k^{in'} μ_k  (48)

PROOF. Using the abstract Bayes rule (7) we have

E{H_k^{ij(M)} | Y_k} = Ē{Λ_k H_k^{ij(M)} | Y_k} / Ē{Λ_k | Y_k} = (1/K̄) ∫_{R^m} β_k^{ij(M)}(x) dx  (49)

where the constant K̄ = ∫_{R^m} α_k(x) dx. But since α_k(x) is an un-normalized Gaussian density, from (28)

∫_{R^m} β_k^{ij(M)}(x) dx = K̄ E{a_k^{ij(M)} + b_k^{ij(M)'} x + x' d_k^{ij(M)} x}  (50)

where the expectation is with respect to the normalized density α_k(x)/K̄, i.e., a Gaussian with mean μ_k and covariance R_k. Substituting in (49) proves the theorem.
The proof of (48) is similar and hence omitted. □
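In code, the read-out (47) is a one-line combination of the five statistics; a minimal MATLAB illustration with placeholder values for the statistics and the Kalman quantities:

% Read-out (47): filtered estimate of H_k^{ij(M)} from a, b, d, mu, R
m = 2;
a = 0.4; b = [0.1; 0.2]; d = 0.5*eye(m);   % illustrative filter statistics
mu = [0.3; -0.1]; R = 0.2*eye(m);          % illustrative Kalman mean/covariance
Hfilt = a + b'*mu + sum(sum(d.*R)) + mu'*d*mu;
% correspondingly, (48) reads: Jfilt = ahat + bhat'*mu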
6 General Filter for Higher Order Moments
Theorem 5.4 gives finite-dimensional filters for the time sum of the states J_k^{in} and the time sum of the square of the states H_k^{ij(M)}. In this section we show that finite-dimensional filters exist for the time sum of any arbitrary integral power of the states.

Assumption 6.1 For notational simplicity, in this section we assume that the state and observation processes are scalar valued, i.e., m = d = 1 in (1) and (2).

Let H_k^{(p)} be the time sum of the p-th power of the state¹:

H_k^{(p)} = Σ_{l=0}^k x_l^p  (51)

Our aim is to derive a finite-dimensional filter for H_k^{(p)}.
Define the un-normalized density

β_k(x) dx = Ē{Λ_k H_k^{(p)} I(x_k ∈ dx) | Y_k}  (52)

Our first step is to obtain a recursion for β_k(x). By using a proof similar to Theorem 4.1, we can show

β_k(x) = [φ(D_k^{-1}(y_k − C_kx)) / (|B_k||D_k|φ(y_k))] [ ∫_R β_{k-1}(z) ψ(B_k^{-1}(x − A_kz)) dz + x^p ∫_R α_{k-1}(z) ψ(B_k^{-1}(x − A_kz)) dz ]
Our task now is to characterize β_k(x) in terms of finite sufficient statistics. Recall that for p = 0 the Kalman filter state and covariance are sufficient statistics, as shown in Theorem 5.1. Also, for p = 1 and 2, Theorems 5.3 and 5.2 give finite-dimensional sufficient statistics. We now show that β_k can be characterized in terms of finite-dimensional statistics for any p ∈ Z_+.
Theorem 6.2 At time k, the density β_k(x) in (52) is completely defined by the p+3 statistics a_k(0), a_k(1), ..., a_k(p), R_k and μ_k as follows:

β_k(x) = [ Σ_{i=0}^p a_k(i) x^i ] α_k(x)  (53)

where

a_{k+1}(n) = Σ_{i=n}^p Σ_{j=n}^i a_k(i) C(j,n) η_{ij} (R_k^{-1}μ_k)^{j−n} (A_{k+1}B_{k+1}^{-2})^n, 0 ≤ n < p

a_{k+1}(p) = 1 + a_k(p) η_{pp} (A_{k+1}B_{k+1}^{-2})^p  (54)

and, with C(i,j) denoting the binomial coefficient,

η_{ij} = C(i,j) · 1·3·…·(i−j−1) · σ_{k+1}^{-(i+j)/2} if i − j is even, i ≥ j

η_{ij} = 0 if i − j is odd, i > j  (55)
PROOF. As in Theorem 5.2 we give an inductive proof. At k = 0, β_0(x) = x^p α_0(x) and so satisfies (53).

¹These new definitions for H_k in (51) and β_k(x) in (52) are only used in this section.

Assume that (53) holds at time k. Then at time k+1, using similar arguments to Theorem 5.2, we have

β_{k+1}(x) = [φ(D_{k+1}^{-1}(y_{k+1} − C_{k+1}x)) / (|B_{k+1}||D_{k+1}|φ(y_{k+1}))] [ ∫_R (Σ_{i=0}^p a_k(i) z^i) α_k(z) ψ(B_{k+1}^{-1}(x − A_{k+1}z)) dz + x^p ∫_R α_k(z) ψ(B_{k+1}^{-1}(x − A_{k+1}z)) dz ]  (56)
The first term on the RHS of the above equation is

I_1 = K_1(x) exp(δ_{k+1}²/(2σ_{k+1})) ∫_R exp(−σ_{k+1}(z − δ_{k+1}/σ_{k+1})²/2) Σ_{i=0}^p a_k(i) z^i dz  (57)

The integral in the above equation is

(2π)^{1/2} σ_{k+1}^{-1/2} Σ_{i=0}^p a_k(i) E{z^i} = (2π)^{1/2} σ_{k+1}^{-1/2} Σ_{i=0}^p a_k(i) Σ_{j=0}^i C(i,j) E{z}^j E{(z − E{z})^{i−j}}  (58)

Now recall from (41) that E{z} is affine in x:

E{z} = σ_{k+1}^{-1}(R_k^{-1}μ_k + A_{k+1}B_{k+1}^{-2}x)  (59)

Also E{(z − E{z})^{i−j}} is independent of x. Indeed ([11], pg. 111),

E{(z − E{z})^{i−j}} = 1·3·…·(i−j−1) σ_{k+1}^{-(i−j)/2} if i − j is even, i > j; = 0 if i − j is odd  (60)

Thus

β_{k+1}(x) = α_{k+1}(x) [ Σ_{n=0}^p Σ_{i=n}^p Σ_{j=n}^i a_k(i) C(j,n) η_{ij} (R_k^{-1}μ_k)^{j−n} (A_{k+1}B_{k+1}^{-2})^n x^n + x^p ]  (61)

Eq. (61) is of the form (53) with a_{k+1}(i), i = 0, ..., p given by (54). □
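The Gaussian central-moment identity invoked in (60) is easy to confirm numerically; a short MATLAB check (the variance and sample size are arbitrary):

% For z ~ N(mz, s2): E{(z-mz)^j} = 1*3*...*(j-1)*s2^(j/2) for even j
s2 = 0.7;
z = sqrt(s2)*randn(1,1e6);
[mean(z.^4), 3*s2^2]      % fourth moment vs 1*3*s2^2
[mean(z.^6), 15*s2^3]     % sixth moment vs 1*3*5*s2^3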
7 Singular State Noise
The filters derived in Theorems 5.1, 5.2 and 5.3 have one major problem: they require B_k to be invertible. In practice (e.g., see Sec. 8), B_k is often not invertible.
In this section we use a simple transformation that expresses our filters in terms of the inverse of the predicted Kalman covariance matrix. This inverse exists even if B_k is singular, as long as a certain uniform controllability condition holds. Both the uniform controllability condition and the transformation we use are well known in the Kalman filter literature ([14], Chapter 7).
Define the Kalman predicted state covariance as R_{k|k-1} = E{(x_k − A_kμ_{k-1})(x_k − A_kμ_{k-1})' | Y_{k-1}}. It is straightforward to show that

R_{k|k-1} = B_k² + A_k R_{k-1} A_k'  (62)
Our first step is to provide a sufficient condition for R_{k|k-1} to be non-singular.

Definition 7.1 (Chapter 7, [14]) The state space model (1), (2) is said to be uniformly completely controllable if there exist a positive integer N_1 and positive constants α, β such that

α I ≤ C(k, k − N_1) ≤ β I for all k ≥ N_1  (63)

Here

C(k, k − N_1) = Σ_{l=k−N_1}^{k−1} Φ(k, l+1) B_l B_l' Φ'(k, l+1)  (64)

Φ(k_2, k_1) = A_{k_2} A_{k_2−1} ··· A_{k_1+1} if k_2 > k_1; Φ(k_2, k_1) = I if k_2 = k_1  (65)

Lemma 7.2 If the dynamical system (1), (2) is uniformly completely controllable and R_0 > 0, then R_k and R_{k|k-1} are positive definite matrices (and hence nonsingular) for all k ≥ N_1.

PROOF. See [14], pg. 238, Lemma 7.3. □
Our aim now is to re-express the filters in Sec. 5 in terms of R_{k|k-1}. The following lemma will be used in the sequel.
Lemma 7.3 Assume R_{k|k-1}^{-1} exists. Then with σ_k defined in (26),

σ_k^{-1} = R_{k-1} − R_{k-1} A_k' R_{k|k-1}^{-1} A_k R_{k-1}  (66)

B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} = R_{k+1|k}^{-1} A_{k+1} R_k  (67)

PROOF. Straightforward use of the matrix inversion lemma on (26) yields

σ_k^{-1} = R_{k-1} − R_{k-1} A_k' [B_k² + A_k R_{k-1} A_k']^{-1} A_k R_{k-1}  (68)

Substituting (62) in (68) proves (66). To prove (67), first note that

B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} = B_{k+1}^{-2} A_{k+1} R_k − B_{k+1}^{-2} A_{k+1} R_k A_{k+1}' R_{k+1|k}^{-1} A_{k+1} R_k = B_{k+1}^{-2} [R_{k+1|k} − A_{k+1} R_k A_{k+1}'] R_{k+1|k}^{-1} A_{k+1} R_k

because A_{k+1} R_k A_{k+1}' = R_{k+1|k} − B_{k+1}² from (62). So

B_{k+1}^{-2} A_{k+1} σ_{k+1}^{-1} = B_{k+1}^{-2} B_{k+1}² R_{k+1|k}^{-1} A_{k+1} R_k = R_{k+1|k}^{-1} A_{k+1} R_k  (69)
□
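Identities (66) and (67) can be verified numerically; a MATLAB sketch with arbitrary nonsingular test matrices (the values are assumptions of this check only):

% Verify (66): sig^{-1} = R - R A' Rpred^{-1} A R, and
% (67): B^{-2} A sig^{-1} = Rpred^{-1} A R, with Rpred = B^2 + A R A'
m = 3;
A = randn(m);
B = 0.5*eye(m) + 0.1*diag(1:m);        % nonsingular, symmetric
R = eye(m) + 0.2*ones(m);              % positive definite
sig = A'*inv(B^2)*A + inv(R);
Rpred = B^2 + A*R*A';
disp(norm(inv(sig) - (R - R*A'*inv(Rpred)*A*R)))      % ~ 0
disp(norm(inv(B^2)*A*inv(sig) - inv(Rpred)*A*R))      % ~ 0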
Applying the above lemma to the filters derived in Sec. 5, we now express them in terms of R_{k|k-1} instead of B_k. The following theorem summarizes the main result of this paper in terms of the new finite-dimensional filters:
Theorem 7.4 Assume that the linear dynamical system given by (1) and (2) is uniformly completely controllable, i.e., (63) holds. At time k, the density β_k^{ij(M)}(x) (defined in (11) and initialized according to (23)) is completely defined by the 5 statistics a_k^{ij(M)}, b_k^{ij(M)}, d_k^{ij(M)}, R_k and μ_k as follows:

β_k^{ij(M)}(x) = [a_k^{ij(M)} + b_k^{ij(M)'} x + x' d_k^{ij(M)} x] α_k(x), M = 0, 1, 2

where a_k^{ij(M)} ∈ R, b_k^{ij(M)} ∈ R^m, d_k^{ij(M)} ∈ R^{m×m} is a symmetric matrix with elements d_k^{ij(M)}(p,q), p, q = 1, ..., m, and α_k(x) is an un-normalized Gaussian density with mean μ_k ∈ R^m and covariance R_k ∈ R^{m×m}.

Also the density γ_k^{in}(x) (defined in (11)) is completely determined by the 4 statistics â_k^{in}, b̂_k^{in}, μ_k and R_k as follows:

γ_k^{in}(x) = [â_k^{in} + b̂_k^{in'} x] α_k(x), γ_0^{in}(x) = ⟨x, e_i⟩⟨y_0, e_n⟩ α_0(x)

where â_k^{in} ∈ R, b̂_k^{in} ∈ R^m.
Furthermore, μ_k, R_k, a_k^{ij(M)}, b_k^{ij(M)}, d_k^{ij(M)}, M = 0, 1, 2, â_k^{in} and b̂_k^{in} are given by the following recursions (where σ_{k+1}^{-1} is given by (66)):

Kalman Filter Equations:

μ_k = A_k μ_{k-1} + R_{k|k-1} C_k' [C_k R_{k|k-1} C_k' + D_k D_k']^{-1} (y_k − C_k A_k μ_{k-1})
R_k = R_{k|k-1} − R_{k|k-1} C_k' [C_k R_{k|k-1} C_k' + D_k D_k']^{-1} C_k R_{k|k-1}
R_{k|k-1} = B_k² + A_k R_{k-1} A_k'  (70)

Finite-dimensional filters for a_k^{ij(M)}, b_k^{ij(M)}, d_k^{ij(M)}, M = 0, 1, 2, â_k^{in} and b̂_k^{in}:

a_{k+1}^{ij(M)} = a_k^{ij(M)} + b_k^{ij(M)'} σ_{k+1}^{-1} R_k^{-1} μ_k + Σ_{p,q} d_k^{ij(M)}(p,q) σ_{k+1}^{-1}(p,q) + μ_k' R_k^{-1} σ_{k+1}^{-1} d_k^{ij(M)} σ_{k+1}^{-1} R_k^{-1} μ_k, a_0^{ij(M)} = 0  (71)

b_{k+1}^{ij(0)} = R_{k+1|k}^{-1} A_{k+1} R_k (b_k^{ij(0)} + 2 d_k^{ij(0)} σ_{k+1}^{-1} R_k^{-1} μ_k), b_0^{ij(0)} = 0_{m×1}  (72)

d_{k+1}^{ij(0)} = R_{k+1|k}^{-1} A_{k+1} R_k d_k^{ij(0)} R_k A_{k+1}' R_{k+1|k}^{-1} + (e_i e_j' + e_j e_i')/2, d_0^{ij(0)} = (e_i e_j' + e_j e_i')/2  (73)

b_{k+1}^{ij(1)} = R_{k+1|k}^{-1} A_{k+1} R_k (b_k^{ij(1)} + 2 d_k^{ij(1)} σ_{k+1}^{-1} R_k^{-1} μ_k) + e_i e_j' σ_{k+1}^{-1} R_k^{-1} μ_k, b_0^{ij(1)} = 0_{m×1}  (74)

d_{k+1}^{ij(1)} = R_{k+1|k}^{-1} A_{k+1} R_k d_k^{ij(1)} R_k A_{k+1}' R_{k+1|k}^{-1} + (e_i e_j' R_k A_{k+1}' R_{k+1|k}^{-1} + R_{k+1|k}^{-1} A_{k+1} R_k e_j e_i')/2, d_0^{ij(1)} = 0_{m×m}  (75)

b_{k+1}^{ij(2)} = R_{k+1|k}^{-1} A_{k+1} R_k (b_k^{ij(2)} + (2 d_k^{ij(2)} + e_i e_j' + e_j e_i') σ_{k+1}^{-1} R_k^{-1} μ_k), b_0^{ij(2)} = 0_{m×1}  (76)

d_{k+1}^{ij(2)} = R_{k+1|k}^{-1} A_{k+1} R_k (d_k^{ij(2)} + (e_i e_j' + e_j e_i')/2) R_k A_{k+1}' R_{k+1|k}^{-1}, d_0^{ij(2)} = 0_{m×m}  (77)

â_{k+1}^{in} = â_k^{in} + b̂_k^{in'} σ_{k+1}^{-1} R_k^{-1} μ_k, â_0^{in} = 0  (78)

b̂_{k+1}^{in} = R_{k+1|k}^{-1} A_{k+1} R_k b̂_k^{in} + e_i ⟨y_{k+1}, e_n⟩, b̂_0^{in} = e_i ⟨y_0, e_n⟩  (79)
Finally, finite-dimensional filters for H_k^{ij(M)} and J_k^{in} (defined in (10)) are given by (47) and (48).
PROOF. First consider the Kalman filter equations (24) and (25). Using Lemma 7.3 on (25) gives

R_k = (R_{k|k-1}^{-1} + C_k' (D_k D_k')^{-1} C_k)^{-1}  (80)

Using the matrix inversion lemma on (80) and applying (67) to the first term on the RHS of (24) yields the "standard" Kalman filter equations (70).
Similarly, applying (67) to the RHS of (30), (31) and (46) yields (72), (73) and (79); the remaining recursions follow in the same way. □
Singular State Noise

Subject to the uniform complete controllability condition (63), the filtering equations in Theorem 7.4 hold even if the matrices B_{k+1} are singular. To see this, consider the following limiting argument (a numerical sketch of steps 1-3 is given after the list):

1. Add ε w ~ N(0,1) noise to each component of x_{k+1}. This is done by replacing B_{k+1} in (1) with the non-singular matrix B_{k+1}^ε = B_{k+1} + ε I_m, where ε ∈ R_+. Denote the resulting state process as x_{k+1}^ε.
2. Define R_{k+1|k}^ε as in (62) with B_{k+1} replaced by B_{k+1}^ε. Express the filters in terms of R_{k+1|k}^ε as in Theorem 7.4.
3. As ε → 0, R_{k+1|k}^ε → R_{k+1|k}.
4. Then using the bounded conditional convergence theorem (pg. 214, [15]), the conditional estimates of x_k^ε, H_k^{ij(M)}(x^ε) and J_k^{in}(x^ε) converge to the conditional estimates of x_k, H_k^{ij(M)}(x) and J_k^{in}(x), respectively.
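A hedged MATLAB sketch of steps 1-3 of this argument (the matrices and the sequence of ε values are illustrative):

% Singular B: regularize to B_eps = B + eps*I and form R_{k+1|k}
m = 2;
A = [0.9 0.1; 0 0.8];
B = [0.1 0; 0 0];                      % singular state-noise matrix
R = 0.2*eye(m);
for eps = [1e-2 1e-4 1e-6]
  Beps = B + eps*eye(m);               % nonsingular regularization (step 1)
  Rpredeps = Beps^2 + A*R*A';          % as in (62) (step 2)
  disp(norm(Rpredeps - (B^2 + A*R*A')))   % -> 0 as eps -> 0 (step 3)
end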
8 Filtered EM Algorithm for Gaussian State-Space Models
The aim of this section is to derive a filter-based EM algorithm for computing maximum likelihood parameter estimates of a linear Gaussian state space system. The finite-dimensional filters of Section 7 are used in implementing the E-step, resulting in a filter-based EM algorithm. Consider the time-invariant version of the state space model (1), (2):

x_{k+1} = A x_k + B w_{k+1}  (81)
y_k = C x_k + D v_k  (82)

Our aim is to compute ML estimates of the parameters θ = (A, B², C, DD') given the observation sequence y_0, y_1, ..., y_T. (Identifiability and consistency in special parametrized cases are discussed at the end of this section.) We do this via the EM algorithm.
EM Algorithm
Suppose we have observations {y_1, ..., y_T} available, where T is a fixed positive integer. Let {P_θ, θ ∈ Θ} be a family of probability measures on (Ω, F), all absolutely continuous with respect to a fixed probability measure P_0. The likelihood function for computing an estimate of the parameter θ, based on the information available in Y_T, is

L(θ) = E_0{dP_θ/dP_0 | Y_T}

and the maximum likelihood estimate (MLE) is defined by

θ̂ ∈ argmax_{θ∈Θ} L(θ)

The EM algorithm is an iterative numerical method for computing the MLE. Let θ_0 be the initial parameter estimate. The EM algorithm generates a sequence of parameter estimates as follows. Each iteration of the EM algorithm consists of two steps:

Step 1 (E-step): Set θ̃ = θ_j and compute Q(θ, θ̃), where

Q(θ, θ̃) = E_{θ̃}[log (dP_θ/dP_{θ̃}) | Y_T]

Step 2 (M-step): Find θ_{j+1} ∈ argmax_{θ∈Θ} Q(θ, θ̃).

The sequence generated {θ_j, j ≥ 0} gives non-decreasing values of L(θ_j), with equality if and only if θ_{j+1} = θ_j.
It is shown in the appendix that

Q(θ, θ̃) = −T log|B| − (T+1) log|D| − (1/2) E{ Σ_{l=1}^T (x_l − Ax_{l-1})' B^{-2} (x_l − Ax_{l-1}) | Y_T } − (1/2) E{ Σ_{l=0}^T (y_l − Cx_l)' (DD')^{-1} (y_l − Cx_l) | Y_T } + E{R(θ̃) | Y_T}  (83)

where R(θ̃) does not involve θ.
To implement the M-step we set the derivatives ∂Q/∂θ = 0. This yields (using the identity ∂log|M|/∂M = M^{-1} for any non-singular matrix M):

Â = H_T^{(1)} [H_T^{(2)}]^{-1}  (84)

Ĉ = J_T' [H_T^{(0)}]^{-1}  (85)

B̂² = (1/T) [ Σ_{l=1}^T E{x_l x_l' | Y_T} − Â H_T^{(1)'} − H_T^{(1)} Â' + Â H_T^{(2)} Â' ]  (86)

D̂D̂' = (1/(T+1)) [ Σ_{l=0}^T y_l y_l' − (J_T' Ĉ' + Ĉ J_T) + Ĉ H_T^{(0)} Ĉ' ]  (87)

where for M = 0, 1, 2, H_T^{(M)} ∈ R^{m×m} denotes the matrix with elements H_T^{ij(M)} = E{H_T^{ij(M)} | Y_T}, i, j ∈ {1, ..., m}. Also J_T ∈ R^{m×d} denotes the matrix with elements J_T^{in} = E{J_T^{in} | Y_T}, i = 1, ..., m, n = 1, ..., d. The terms H_T^{(M)} and J_T are computed using (47), (48) together with the filters in Theorem 7.4. Thus we have a filter-based EM algorithm (a sketch of these updates in code follows).
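A minimal MATLAB sketch of the updates (84), (85) and (87); H0, H1, H2 and J stand for H_T^{(0)}, H_T^{(1)}, H_T^{(2)} and J_T, and the numerical inputs are placeholders so the fragment runs on its own:

% M-step sketch; in practice H0, H1, H2, J come from the filters of Theorem 7.4
m = 2; d = 1; T = 100;
y = randn(d, T+1);                     % stand-in observations y_0..y_T
H0 = 5*eye(m); H1 = 2*eye(m); H2 = 4*eye(m); J = 0.5*ones(m,d);
Anew = H1*inv(H2);                                         % (84)
Cnew = J'*inv(H0);                                         % (85)
Syy  = y*y';                                               % sum_l y_l y_l'
DDnew = (Syy - (J'*Cnew' + Cnew*J) + Cnew*H0*Cnew')/(T+1); % (87)
% (86) is analogous, using the filtered state-square sum together with Anew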
Example: Errors in Variables Time Series
We use the filtered EM algorithm to estimate the parameters of the errors-in-variables time series example considered in [2] and [3].
Consider the scalar-valued AR(p) process s_k, k ∈ Z_+, defined as

s_k = Σ_{i=1}^p a_i s_{k-i} + w_k, w_k ~ N(0, σ_w²)  (88)

where w_k is a white Gaussian process. Assume that we observe s_k indirectly via the scalar process

y_k = s_k + ε_k, ε_k ~ N(0, σ_ε²)  (89)

where ε_k is a white Gaussian process independent of w. The aim is to compute the ML estimate of the parameter vector θ = (a_1, ..., a_p, σ_w², σ_ε²).
We first re-express (88) and (89) in state space form (81), (82) with d = 1, m = p+1, a = [a_1, ..., a_p]:

x_k = (s_k, s_{k-1}, ..., s_{k-p})', A = [a, 0; I_p, 0_{p×1}], B = diag(σ_w, 0, ..., 0), C = [1, 0, ..., 0], D = σ_ε
Using a similar procedure as above, it can be shown [2] that

Q(θ, θ̃) = −T log σ_w − (1/(2σ_w²)) E{ Σ_{l=1}^T (s_l − Σ_{i=1}^p a_i s_{l-i})² | Y_T } − (T+1) log σ_ε − (1/(2σ_ε²)) Σ_{l=0}^T E{(y_l − s_l)² | Y_T} + E{R(θ̃) | Y_T}  (90)

where R(θ̃) does not involve θ. So the M-step yields

â = Ē^{-1} F̄, where Ē(i,j) = H_T^{(i+1)(j+1)(0)} and F̄(i) = H_T^{1(i+1)(0)}, i, j = 1, ..., p  (91)

σ̂_w² = (1/T) [H_T^{11(0)} − F̄' â]  (92)

σ̂_ε² = (1/(T+1)) [ Σ_{l=0}^T y_l² − 2 J_T^{11} + H_T^{11(0)} ]  (93)
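Appendix B lists a complete MATLAB implementation of this example; there, the AR update (91) reduces to solving a p × p linear system built from the filtered matrix H_T^{(0)}, as in the following condensed sketch (H is a placeholder value for that matrix):

% AR update (91): ahat = E \ F with E(i,j) = H(i+1,j+1), F(i) = H(1,i+1)
p = 2; m = p + 1;
H = [3.0 1.2 0.8; 1.2 3.1 1.1; 0.8 1.1 3.2];   % illustrative H_T^{(0)}
E = H(2:m, 2:m);
F = H(1, 2:m)';
ahat = E \ F                                    % new AR coefficient estimates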
As shown in [2], Theorem 1 in [12] implies that the sequence of model estimates {θ_j}, j ∈ Z_+, from the EM algorithm is such that the sequence of likelihoods {L(θ_j)}, j ∈ Z_+, is monotonically increasing. It is straightforward to compactify the Euclidean parameter space Θ. Since L(θ) is continuous on this compact space, it is bounded from above. Therefore the sequence {L(θ_j)} converges to some L*. Since the function Q(θ, θ_j) is continuous both in θ and θ_j, by Theorem 2 in [13], L* is a stationary value of L. To make sure that L* is a maximum value of the likelihood, it is necessary to try different initial values θ_0. Weak consistency of the maximum likelihood estimate is proved in [2] for this particular model (see also Chapter 4 in [16]). Similar parametrized models are used in [1] and [6] and can be estimated via our filter-based EM algorithm.
9 Parallel Implementation of Filters
In this section we discuss the computational complexity of our filters and the resulting filter-based EM algorithm. In particular, we describe why our filter-based algorithm is suitable for parallel implementation, and propose a systolic processor implementation of the algorithm.
Sequential Complexity
We evaluate the computational cost and memory requirements of our filter-based algorithm and compare them with the standard smoother-based EM algorithm.
Computational Cost: The filter-based E-step requires computation at each time k of E{H_k^{ij(M)} | Y_k}, M = 0, 1, 2, and E{J_k^{in} | Y_k} for all pairs (i, j).
• a_{k+1}^{ij(M)}: Consider the RHS of the update equation (71). The computational costs for each (i, j) pair at each time instant k are:
2nd term: O(m) multiplications (inner product of two m-vectors)
3rd term: O(m²) multiplications (matrix-vector multiplication)
4th term: O(m²) multiplications (matrix-vector multiplication)
Since there are m² (i, j) pairs, the total complexity at each time instant is O(m⁴).
• Similarly, the total complexity for evaluating b_{k+1}^{ij(M)} for all m² (i, j) pairs is O(m⁴) multiplications.
• Evaluating d_{k+1}^{ij} for each (i, j) pair requires m × m matrix-matrix multiplications. This involves O(m³) complexity. So the total complexity for all m² (i, j) pairs is O(m⁵) at each time instant.
In comparison, the Kalman smoother-based E-step in [2], [3] requires O(m³) complexity at each time instant to compute E{H_T^{ij(M)} | Y_T}, M = 0, 1, 2, and E{J_T^{in} | Y_T} for all pairs (i, j).
Memory Requirements: In our filter-based EM algorithm, only the filtered variables at each time instant need to be stored to compute the variables at the next time instant. They can then be discarded. Hence the memory required is independent of the number of observations T. In comparison, the Kalman smoother-based EM scheme in [2], [3] requires O(m²T) memory, since all the Kalman filter covariance matrices R_k, k ≤ T, need to be stored before smoothed covariance matrices can be computed; see Eq. (2.12) in [2]. This also involves significant memory read-write overhead costs.
Parallel Implementation on Systolic Array Architecture
The following properties of our algorithm make it suitable for vector-processor or systolic-processor implementation:
1. The computation of a_k^{ij(M)}, b_k^{ij(M)} and d_k^{ij(M)} for any given pair (i, j) is independent of the computation for any other pair (i', j') for all time k = 0, 1, .... So all the m² (i, j) components of these variables can be computed in parallel on m² processors. Similarly, computation of all (i, n) components of â_k^{in} and b̂_k^{in} are mutually independent and can be done in parallel.
2. The recursions for a_k^{ij}, b_k^{ij}, d_k^{ij} and â_k^{in} do not explicitly involve the observations. They only involve the Kalman filter variables μ_{k-1}, σ_k, R_{k|k-1}. Moreover, d_k only involves R_k (and so R_{k|k-1}), which itself is independent of the observations and can be computed off-line for a given parameter set θ.
All the processor blocks used above are required to do a synchronous matrix-vector multiplication at every time instant k. Now an N × N matrix can be multiplied by an N-vector in N time units on an N-processor systolic array; see [17], pp. 216-220 for details. (It can also be done in unit time on N² processors.)
If τ is the time required for this matrix-vector multiplication, then for a T-point data sequence, our filtered algorithm requires a total of τT time units per EM iteration. In comparison, a parallel implementation of the forward-backward smoother-based EM algorithm requires 2τT time units per EM iteration, because we need a minimum of τT time units to compute the forward variables and another τT units for the backward variables. For large T and a large number of EM iterations, this saving in time is quite considerable.
In addition, unlike our filtering method, which has negligible memory requirements, the forward-backward algorithm requires significant memory read-write overhead costs, requiring m²T memory locations to be accessed for the stored forward variables while computing the backward variables.
Finally, we may add that our algorithm can be easily implemented in a Single Instruction Multiple Data (SIMD) mode on a supercomputer in the vectorization mode, or on the Connection Machine using FORTRAN 8X. Typically, with m = 10, our filters require of the order of 10⁴ matrix-vector multiplications per time instant. That is, we need a total of 10000 processor units on a Connection Machine, which typically has 2^16 = 65536 processors.
10 Conclusions and Future work
We have presented a new class of finite-dimensional filters for linear Gauss-Markov models that includes the Kalman filter as a special case. These filters were then used to derive a filter-based Expectation Maximization algorithm for computing maximum likelihood estimates of the parameters. It is possible to derive the new filters in continuous time using similar techniques; this is the subject of a companion paper [18]. Also, in practical applications it is worthwhile developing risk-sensitive versions of the filters, i.e., where the estimates are computed to minimize an exponential cost function rather than a quadratic cost function.
References
[1] R.H. Shumway and D.S. Stoffer, "An approach to time series smoothing and forecasting using the EM algorithm", J. Time Series Analysis, Vol. 3, No. 4, pp. 253-264, 1982.
[2] D. Ghosh, "Maximum likelihood estimation of the dynamic shock-error model", Journal of Econometrics, Vol. 41, No. 1, pp. 121-143, May 1989.
[3] E. Weinstein, A.V. Oppenheim, M. Feder and J.R. Buck, "Iterative and sequential algorithms for multisensor signal enhancement", IEEE Trans. Signal Processing, Vol. 42, No. 4, pp. 846-859, April 1994.
[4] V. Krishnamurthy, "On-line estimation of dynamic shock-error models", IEEE Trans. Automatic Control, Vol. 35, No. 5, pp. 1129-1134, 1994.
[5] N.S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice-Hall, 1984.
[6] I. Ziskind and D. Hertz, "Maximum likelihood localization of narrow-band autoregressive sources via the EM algorithm", IEEE Trans. Signal Processing, Vol. 41, No. 8, pp. 2719-2723, August 1993.
[7] L. Breiman, Probability, Classics in Applied Mathematics, Vol. 7, SIAM, Philadelphia, 1992.
[8] R.J. Elliott, "Exact adaptive filters for Markov chains observed in Gaussian noise", Automatica, Vol. 30, No. 9, pp. 1399-1408, September 1994.
[9] L. Aggoun, R.J. Elliott and J.B. Moore, "A measure change derivation of continuous state Baum-Welch estimators", 1995.
[10] R.J. Elliott, L. Aggoun and J.B. Moore, Hidden Markov Models: Estimation and Control, Springer-Verlag, 1995.
[11] A. Papoulis, Probability, Random Variables and Stochastic Processes, McGraw-Hill, 1984.
[12] A.P. Dempster, N.M. Laird and D.B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm", Journal of the Royal Statistical Society B, 39, pp. 1-38, 1977.
[13] C.F.J. Wu, "On the convergence properties of the EM algorithm", The Annals of Statistics, 11, pp. 95-103, 1983.
[14] A.H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, 1970.
[15] P. Billingsley, Probability and Measure, Second Edition, John Wiley, NY, 1986.
[16] E.J. Hannan and M. Deistler, The Statistical Theory of Linear Systems, Wiley, 1988.
[17] E.V. Krishnamurthy, Parallel Processing: Principles and Practice, Addison-Wesley, 1989.
[18] R.J. Elliott and V. Krishnamurthy, "New finite dimensional filters for estimation of continuous-time linear Gaussian systems", submitted to SIAM Journal on Control and Optimization, 1995.
A Derivation of Q(θ, θ̃) in (83)
Consider the time-invariant state space model given by (81), (82), with θ = (A, B, C, D) denoting a possible set of parameters.
We have shown in Sec. 3 how, starting from a measure P̄ under which the x_l and y_l are independent and normal, we can construct the measure P = P(θ) such that under P = P(θ) the x and y sequences satisfy the dynamics (1) and (2). In fact

dP(θ)/dP̄ |_{G_k} = Λ_k(θ)
Suppose θ̃ = (Ã, B̃, C̃, D̃) is a second set of parameters. Then

dP(θ̃)/dP̄ |_{G_k} = Λ_k(θ̃)

To change from, say, one set of parameters θ̃ to θ, we must introduce the densities

Λ_k(θ)/Λ_k(θ̃) = ∏_{l=0}^k γ_l

where

γ_0 = [|D̃| φ(D^{-1}(y_0 − Cx_0))] / [|D| φ(D̃^{-1}(y_0 − C̃x_0))]

and for l ≥ 1

γ_l = [|B̃| ψ(B^{-1}(x_l − Ax_{l-1})) |D̃| φ(D^{-1}(y_l − Cx_l))] / [|B| ψ(B̃^{-1}(x_l − Ãx_{l-1})) |D| φ(D̃^{-1}(y_l − C̃x_l))]
The parameters of our model will be changed from θ̃ to θ if we set

dP(θ)/dP(θ̃) |_{G_k} = ∏_{l=0}^k γ_l

In this case,

log dP(θ)/dP(θ̃) = −k log|B| − (k+1) log|D| − (1/2) Σ_{l=1}^k (x_l − Ax_{l-1})' B^{-2} (x_l − Ax_{l-1}) − (1/2) Σ_{l=0}^k (y_l − Cx_l)' D^{-2} (y_l − Cx_l) + R(θ̃)

where R(θ̃) does not involve any of the parameters θ.
Then evaluating Q(θ, θ̃) = E{log dP(θ)/dP(θ̃) | Y_T} for a fixed positive integer T yields (83).
Appendix B
%%%%%%%%%%%%%%%%
%% Matlab source code implementation of Filter-based EM algorithm for
%% multi-sensor signal enhancement
%%%%%%%%%%%%%%%%
%% Important variables:
%% most variables are named the same as in the patent document.
%% Exceptions are:
%% est : parameter estimates
%%
%%%%%%%%%%%%%%%%
%% External functions used (source code listed below main program):
%% R.m, R_set.m, a.m, a_set.m, b.m, b_set.m, d.m, d_set.m, e.m, mu.m,
%% mu_set.m, x.m, x_set.m, y.m, y_set.m
%%%%%%%%%%%%%%%%%%
% data length
global T
T = 1000;
% passes
global PASSES
PASSES = 10;
% AR coefficients
arc = [.8];
% order of AR = p
p = length(arc);
global m
m = p+1;
% process noise std
s = .1;
% measurement noise std
sv = .05;
A = [arc, 0; eye(p), zeros(p,1)];
B = [s, zeros(1,p); zeros(p,1), zeros(p,p)];
BB = B*B';
C = [1, zeros(1,p); ones(1,p), 0];
D = sv*eye(2,2);
DD = D*D';
% initial x
x0 = [s*randn; zeros(p,1)];
% define x
global x_
x_ = zeros(m,T);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% read data y (defines the global y_ used by y.m)
load y;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% define Kalman Filter variables
global mu_ R_
mu_ = zeros(m,T);
R_ = zeros(m,T*m);   % R_{k|k}
% a
global a_
a_ = zeros(1,T*m*m);
% b
global b_
b_ = zeros(m,T*m*m);
% d
global d_
d_ = zeros(m,T*m*m*m);
% initial parameter estimate: perturb the true AR coefficients
A = [arc - (0.5 - rand(1,p))*.5, 0; eye(p), zeros(p,1)];
est(:,1) = A(1,1:p)';
for pass = 1:PASSES
  pass   % display current EM pass
  % Initialise Kalman Filter
  mu_set(1, y(1));
  R_set(1, ones(m) + eye(m)*1);
  % Initialise New Filter
  for i = 1:m
    for j = 1:m
      a_set(1, i, j, 0);
    end
  end
  for i = 1:m
    for j = 1:m
      b_set(1, i, j, zeros(m,1));
    end
  end
  for i = 1:m
    for j = 1:m
      d_set(1, i, j, (e(i)*e(j)' + e(j)*e(i)')/2);
    end
  end
  % Kalman Filter & New finite dimensional Filters
  for k = 2:T
    % Eq.70
    Rk = BB + A*R(k-1)*A';
    % Eq.66
    InvSig = R(k-1) - R(k-1)*A'*inv(Rk)*A*R(k-1);
    % Eq.70
    mu_set(k, A*mu(k-1) + Rk*C'*inv(C*Rk*C' + DD)*(y(k) - C*A*mu(k-1)));
    R_set(k, Rk - Rk*C'*inv(C*Rk*C' + DD)*C*Rk);
    InvR = inv(R(k-1));
    % Eq.71
    for i = 1:m
      for j = 1:m
        a_set(k, i, j, a(k-1,i,j) + b(k-1,i,j)'*InvSig*InvR*mu(k-1) ...
          + sum(sum(d(k-1,i,j).*InvSig)) ...
          + mu(k-1)'*InvR*InvSig*d(k-1,i,j)*InvSig*InvR*mu(k-1));
        % Eq.72
        b_set(k, i, j, inv(Rk)*A*R(k-1)*(b(k-1,i,j) + 2*d(k-1,i,j)*InvSig*InvR*mu(k-1)));
        % Eq.73
        d_set(k, i, j, inv(Rk)*A*R(k-1)*d(k-1,i,j)*R(k-1)*A'*inv(Rk) ...
          + 1/2*(e(i)*e(j)' + e(j)*e(i)'));
      end
    end
  end
  H = zeros(m,m);
  % Eq.47
  for i = 1:m
    for j = 1:m
      H(i,j) = a(T,i,j) + b(T,i,j)'*mu(T) + sum(sum(d(T,i,j).*R(T))) + mu(T)'*d(T,i,j)*mu(T);
    end
  end
  for i = 1:m-1
    for j = i+1:m
      H(j,i) = H(i,j);
    end
  end
  E = zeros(p,p);
  F = zeros(p,1);
  for i = 1:p
    for j = 1:p
      E(i,j) = H(i+1,j+1);
    end
  end
  for i = 1:p
    F(i,1) = H(1,i+1);
  end
  G = inv(E)*F;
  A = [G', 0; eye(p), zeros(p,1)];
  %%%%%%%%%%%%%%%%
  %% est denotes the parameter estimates
  %%%%%%%%%%%%%%%
  est(:,pass+1) = G;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Matlab source code listing of functions used in main program
% for Filter-based EM algorithm
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% R.m
%%%%%%%%%%%%%
function [y] = R(t)
global R_ m;
y = R_(:, (t-1)*m+1:t*m);
%%%%%%%%%%%%%
% R_set.m
%%%%%%%%%%%%
function [] = R_set(t,z)
global R_ m;
R_(:, (t-1)*m+1:t*m) = z;
%%%%%%%%%%%%
% a.m
%%%%%%%%%%%%
function [y] = a(t,i,j)
global a_ m T;
y = a_(1, (t-1)*m*m + (i-1)*m + j);
%%%%%%%%%%%%%
% a_set.m
%%%%%%%%%%%%%
function [] = a_set(t,i,j,z)
global a_ T m;
a_(1, (t-1)*m*m + (i-1)*m + j) = z;
%%%%%%%%%%%%%
% b.m
%%%%%%%%%%%%
function [y] = b(t,i,j)
global b_ m T;
y = b_(:, (t-1)*m*m + (i-1)*m + j);
%%%%%%%%%%%
% b_set.m
%%%%%%%%%%%
function [] = b_set(t,i,j,z)
global b_ T m;
b_(:, (t-1)*m*m + (i-1)*m + j) = z;
%%%%%%%%%%
% d.m
%%%%%%%%%%
function [y] = d(t,i,j)
global d_ m T;
y = d_(:, (t-1)*m*m*m + (i-1)*m*m + (j-1)*m + 1 : (t-1)*m*m*m + (i-1)*m*m + j*m);
%%%%%%%%%%
% d_set.m
%%%%%%%%%%
function [] = d_set(t,i,j,z)
global d_ T m;
d_(:, (t-1)*m*m*m + (i-1)*m*m + (j-1)*m + 1 : (t-1)*m*m*m + (i-1)*m*m + j*m) = z;
%%%%%%%%%%
% e.m
%%%%%%%%%%
function [y] = e(i)
global m;
y = zeros(m,1);
y(i) = 1;
%%%%%%%%%%%%
% mu.m
%%%%%%%%%%%
function [y] = mu(t)
global mu_;
y = mu_(:,t);
%%%%%%%%%%%
% mu_set.m
%%%%%%%%%%%
function [] = mu_set(t,z)
global mu_;
mu_(1:length(z), t) = z;
%%%%%%%%%%
% x.m
%%%%%%%%%%
function [y] = x(t)
global x_;
y = x_(:,t);
%%%%%%%%%
% x_set.m
%%%%%%%%%
function [] = x_set(t,z)
global x_;
x_(:,t) = z;
%%%%%%%%%%
% y.m
%%%%%%%%%%
function [z] = y(t)
global y_;
z = y_(:,t);
%%%%%%%%%%%
% y_set.m
%%%%%%%%%%%
function [] = y_set(t,z)
global y_;
y_(:,t) = z;

APPENDIX C
The following model for the spot price of oil, S, was proposed by Eduardo Schwartz in his Presidential Address to the American Finance Association in January 1997:

dS_t = (μ − δ_t) S_t dt + σ_1 S_t dz_1(t)  (1)

Here z_1 is a standard Brownian motion, and δ_t represents the "convenience yield" (which models the value of holding amounts of the commodity). In fact this follows similar stochastic dynamics of the form

dδ_t = κ(α − δ_t) dt + σ_2 dz_2(t)  (2)

Here z_2 is a second standard Brownian motion with dz_1(t) dz_2(t) = ρ dt.
It is convenient to consider the logarithm of the spot price, x_t = ln S_t. Then x satisfies

dx_t = (μ − δ_t − σ_1²/2) dt + σ_1 dz_1(t)  (3)
If r is the risk-free interest rate (taken to be constant here) and λ is the market price of convenience yield risk (also assumed constant), S and δ follow similar processes under an equivalent martingale measure.
However, it is equations (2) and (3) which give discrete dynamics for the state vector (x_t, δ_t)' as:

(x_t, δ_t)' = c_t + Q (x_{t-1}, δ_{t-1})' + η_t  (4)

Here

c_t = ((μ − σ_1²/2)Δt, καΔt)'

and

Q = the 2×2 matrix [1, −Δt; 0, 1 − κΔt]
The futures price for oil delivery at time T > 0 is given by:

F(S, δ, T) = S exp(−δ (1 − e^{−κT})/κ + A(T))

where

A(T) = (r − α̂ + σ_2²/(2κ²) − σ_1σ_2ρ/κ) T + (σ_2²/4)(1 − e^{−2κT})/κ³ + (α̂κ + σ_1σ_2ρ − σ_2²/κ)(1 − e^{−κT})/κ²

with α̂ = α − λ/κ. Here S is the spot price today (T = 0) and δ is the value of the convenience yield today (T = 0).

Consequently,

ln F(S, δ, T) = ln S − δ (1 − e^{−κT})/κ + A(T)  (5)
It is these futures prices, for various dates T, which are given in the market. That is, for different dates T_1, T_2, ..., T_N we have observations

y_t^1 = ln F(S_t, δ_t, t + T_1), ..., y_t^N = ln F(S_t, δ_t, t + T_N)

These observations give the right hand side of (5), plus some "noise" term ε_t ∈ R^N, where ε_t = (ε_t^1, ε_t^2, ..., ε_t^N)', t = 0, 1, 2, ..., is a sequence of independent Gaussian random variables with

E[ε_t] = 0 ∈ R^N and var ε_t = E[ε_t ε_t'] = H ∈ R^{N×N}
The observation equation (5), plus the ε_t noise on the right side, therefore has the form:

y_t = d_t + Z_t (x_t, δ_t)' + ε_t for t = 1, 2, ..., T  (6)

where

y_t = (ln F(S_t, δ_t, t + T_1), ..., ln F(S_t, δ_t, t + T_N))'

are the futures prices at time t, for delivery at times t + T_1, t + T_2, ..., t + T_N, and

d_t = (A(T_1), ..., A(T_N))', Z_t = the N×2 matrix whose n-th row is (1, −(1 − e^{−κT_n})/κ)
The model in summary has dynamics (4) for the "signal" (x_t, δ_t):

(x_t, δ_t)' = c_t + Q (x_{t-1}, δ_{t-1})' + η_t  (7)

and dynamics (5) for the observations:

y_t = d + Z (x_t, δ_t)' + ε_t  (8)

In spite of Schwartz's notation, c, Q, d and Z do not depend on t. They do include Δt, the time increment of fixed size.
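A short MATLAB simulation of the discretized dynamics (7); all parameter values below are illustrative assumptions:

% Simulate (x_t, delta_t)' = c + Q (x_{t-1}, delta_{t-1})' + eta_t
mu0 = 0.1; kappa = 1.5; alpha = 0.05; s1 = 0.3; s2 = 0.2; rho = 0.4;
dt = 1/52; T = 260;
c = [(mu0 - s1^2/2)*dt; kappa*alpha*dt];
Q = [1, -dt; 0, 1 - kappa*dt];
Sig = [s1^2, rho*s1*s2; rho*s1*s2, s2^2]*dt;   % covariance of eta_t
L = chol(Sig, 'lower');
X = zeros(2,T); X(:,1) = [log(20); alpha];     % initial (x_0, delta_0)
for t = 2:T
  X(:,t) = c + Q*X(:,t-1) + L*randn(2,1);
end
S = exp(X(1,:));                               % spot price path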
Equations (7) and (8) are of the form where the Kalman filter can be applied. This considers linear dynamics for the signal

x_{t+1} = Ā + A x_t + B ω_{t+1}, Ā ∈ R^n  (9)

and observations

y_t = C̄ + C x_t + D v_t, t = 0, 1, ...  (10)

One observes y_t, t = 0, 1, ..., T, ..., and wishes to make the best estimate of x_t. This is the quantity

x̂_t = E[x_t | y_0, ..., y_t]
In fact x̂_t is also a Gaussian random variable, with conditional mean

μ_t = E[x_t | y_0, ..., y_t]

and variance

R_t = E[(x_t − μ_t)(x_t − μ_t)' | y_0, ..., y_t]

In fact, the formulas are better written in terms of the one-step predictions:

μ_{k+1|k} = Ā + A μ_k

and

R_{k+1|k} = E[(x_{k+1} − μ_{k+1|k})(x_{k+1} − μ_{k+1|k})' | y_0, ..., y_k]

Then

R_{k+1|k} = BB' + A R_k A'
The Kalman filter then gives recursive up-dates:

μ_{k+1} = Ā + A μ_k + R_{k+1|k} C' (C R_{k+1|k} C' + DD')^{-1} (y_{k+1} − C̄ − C Ā − C A μ_k)  (11)

R_{k+1} = R_{k+1|k} − R_{k+1|k} C' (C R_{k+1|k} C' + DD')^{-1} C R_{k+1|k}  (12)

As stated, μ_k = E[x_k | y_0, ..., y_k] is the conditional mean, or best estimate, of x_k given y_0, ..., y_k. Similarly, R_k = E[(x_k − μ_k)(x_k − μ_k)' | y_0, ..., y_k].
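A minimal MATLAB sketch of the affine-model Kalman recursion (11)-(12); the matrices and the stand-in observations are placeholders only:

% Kalman filter for x_{t+1} = Abar + A x_t + B w, y_t = Cbar + C x_t + D v
n = 2; d = 3; T = 50;
Abar = [0.01; 0.02]; A = [1 -0.02; 0 0.97]; B = 0.05*eye(n);
Cbar = 0.1*ones(d,1); C = [ones(d,1), -[0.2; 0.5; 0.9]]; D = 0.02*eye(d);
y = randn(d,T);                          % stand-in observations
mu = zeros(n,1); R = eye(n);
for k = 1:T
  Rpred = B*B' + A*R*A';                 % predicted covariance
  K = Rpred*C'/(C*Rpred*C' + D*D');      % gain
  mu = Abar + A*mu + K*(y(:,k) - Cbar - C*(Abar + A*mu));   % (11)
  R  = Rpred - K*C*Rpred;                                    % (12)
end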
However, to implement the Kalman filter, knowledge of the parameters A, Ā, B, C, C̄, D is required.
The filter-based EM process, using the finite-dimensional filter, can advantageously be used to estimate these parameters, as set out below.
Consider, for 1 ≤ i, j ≤ m, 1 ≤ n ≤ d and M = 0, 1, 2, recursions for the statistics a_k^{ij(M)}, b_k^{ij(M)}, d_k^{ij(M)}, â_k^{in}, b̂_k^{in} and u_k, v_k, û_k, v̂_k. These are the recursions of Theorem 7.4, modified to account for the affine terms Ā and C̄. The d recursions are unchanged; for example,

d_{k+1}^{ij(0)} = R_{k+1|k}^{-1} A R_k d_k^{ij(0)} R_k A' R_{k+1|k}^{-1} + (e_i e_j' + e_j e_i')/2, d_0^{ij(0)} = (e_i e_j' + e_j e_i')/2

while the a and b recursions acquire additional terms involving Ā (through the predicted mean Ā + A μ_k), and the recursions for â_k^{in} and b̂_k^{in} acquire terms involving C̄ (through the centred observation y_k − C̄). The additional statistics u_k, v_k and û_k, v̂_k track the time sums of the state and of the observations, which enter because of the constant terms.
The expectation values are defined as

H̄_T^{(M)} = E[H_T^{(M)} | Y_T], J̄_T = E[J_T | Y_T], etc.

Then for M = 0, 1, 2:

E[H_k^{ij(M)} | Y_k] = a_k^{ij(M)} + b_k^{ij(M)'} μ_k + tr(d_k^{ij(M)} R_k) + μ_k' d_k^{ij(M)} μ_k

E[J_k^{in} | Y_k] = â_k^{in} + b̂_k^{in'} μ_k, E[L_k | Y_k] = û_k + v̂_k' μ_k
These equations represent the recursive finite-dimensional filters for estimating the matrices and vectors H_k^{(M)}, M = 0, 1, 2, J_k, L_k and L̂_k given the observations.
The revised estimates for the parameters A, B, C, D, Ā, C̄ are then computed (once given y_0, ..., y_T) from the filtered quantities H_T^{(M)}, J_T, L_T and L̂_T, by maximizing Q as in the M-step of Section 8, extended to include the affine terms Ā and C̄.
Given observations y_0, ..., y_T, the parameters are initialized and the above is used to re-estimate the parameters one at a time. With the same data y_0, ..., y_T this process is iterated until convergence is observed.

Claims

CLAIMS:
1. A signal processing method for determining parameters of a linear Gaussian system to remove a Gaussian noise component, comprising: processing received data by a finite-dimensional filter to obtain expectation data; processing said expectation data to derive parameter data representative of said parameters.
2. A signal processing method as claimed in claim 1 , wherein said expectation data represents the sum over time of the state of a signal defined by said linear Gaussian system and the sum over time of the covariance of the state of said signal.
3. A signal processing method as claimed in claim 2, wherein said received data is time series data.
4. A signal processing method as claimed in claim 3, wherein said received data processing step is executed iteratively on said time series data in one direction in time.
5. A signal processing method as claimed in claim 4, wherein said direction comprises a forward pass of said time series data.
6. A signal processing method as claimed in claim 5, wherein said processing steps are executed in real time as said time series data is received, and said parameter data is updated recursively by said expectation data processing step.
7. A signal processing method as claimed in claim 1 or 5, wherein said finite dimensional filter obtains mean and covariance state data for said signal using Kalman filtering, and generates in parallel a plurality of coefficient data from said mean and covariance state data, and said coefficient data is used to generate said expectation data.
8. A finite-dimensional filter for processing received data to obtain state data for deriving parameter data of a linear Gaussian system associated with said received data.
9. A finite-dimensional filter as claimed in claim 8, including:
a plurality of finite-dimensional filters for generating coefficient data in parallel; and a module for generating said state data from said coefficient data, said state data representing the sum over time of the state of a signal defined by said linear Gaussian system and the sum over time of the covariance of the state of said signal.
10. A finite-dimensional filter as claimed in claim 9, wherein said received data is time series data.
11. A finite-dimensional filter as claimed in claim 10, wherein said time series data is processed iteratively and in one direction in time.
12. A finite-dimensional filter as claimed in claim 11 , wherein said direction comprises a forward pass of said time series data.
13. A finite-dimensional filter as claimed in claim 12, wherein said received data includes mean and covariance state data obtained from a Kalman filter which operates on received data representing said signal.
14. A finite-dimensional filter as claimed in claim 13, wherein said received data is processed in real time.
15. A finite-dimensional filter as claimed in claim 14, wherein said state data is used by an expectation and maximisation process to derive said parameter data which is representative of the parameters of said Gaussian system.
16. A speech signal processing method for determining parameters of a linear Gaussian system representing speech received by at least two sensors, comprising: processing the signals received by the sensors using an expectation and maximisation process to obtain estimates of the parameters representing the system, said process including using a finite-dimensional filter to obtain state data representative of the speech signal, said state data being used to derive the parameter estimates.
17. A signal processing method for determining parameters of a plurality of narrowband signal sources, comprising: receiving signals from said sources at a central location; processing signals using a finite-dimensional filter to obtain expectation data representative of the state of a signal for each source; and processing said expectation data to derive parameter data representative of said parameters, including signal direction of arrival and signal bandwidth for each source.
18. A signal processing method for determining parameters of a linear Gaussian system representing a financial system for generating predictive financial data, comprising: processing observed financial data for the financial system using an expectation and maximisation process to obtain estimates of the parameters representing the system, said process including using a finite-dimensional filter to obtain state data representative of the financial data, said state data being used to derive the parameter estimates.
PCT/AU1997/000519 1996-08-16 1997-08-15 Signal processing method using a finite-dimensional filter WO1998008167A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU37627/97A AU3762797A (en) 1996-08-16 1997-08-15 Signal processing method using a finite-dimensional filter

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AUPO1701 1996-08-16
AUPO1701A AUPO170196A0 (en) 1996-08-16 1996-08-16 A finite-dimensional filter

Publications (1)

Publication Number Publication Date
WO1998008167A1

Family

ID=3796014

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU1997/000519 WO1998008167A1 (en) 1996-08-16 1997-08-15 Signal processing method using a finite-dimensional filter

Country Status (2)

Country Link
AU (1) AUPO170196A0 (en)
WO (1) WO1998008167A1 (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0216580A2 (en) * 1985-09-25 1987-04-01 Sony Corporation Infinite impulse response filters
US5187673A (en) * 1991-02-05 1993-02-16 Edward L. Carver, Jr. Method and apparatus for determining the distribution of constituent subpopulations within a population of particles having overlapping subpopulations
GB2253081A (en) * 1991-02-20 1992-08-26 Toshiba Kk Security-exchange decision-making support
EP0540216A1 (en) * 1991-11-01 1993-05-05 Gec-Marconi Limited Filter
DE4220524A1 (en) * 1992-06-23 1992-10-22 Matzner Rolf Dipl Ing Separate estimation of power in two superimposed stochastic processes - by sampling and filtering to identify inputs for processing to identify separate signal and noise components
WO1994028542A1 (en) * 1993-05-26 1994-12-08 Telefonaktiebolaget Lm Ericsson Discriminating between stationary and non-stationary signals
EP0660301A1 (en) * 1993-12-20 1995-06-28 Hughes Aircraft Company Removal of swirl artifacts from celp based speech coders

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
IEEE TRANSACTIONS ON SIGNAL PROCESSING, Vol. 43, No. 2, February 1995, HAYKIN and LIANG LI, "Nonlinear Adaptive Prediction of Nonstationary Signals", pages 526-535. *
IEEE TRANSACTIONS ON SIGNAL PROCESSING, Vol. 44, No. 3, March 1996, BISHOP and DJURIC, "Model Order Selection of Damped Sinusoids in Noise by Predictive Densities", pages 611-619. *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2363557A (en) * 2000-06-16 2001-12-19 At & T Lab Cambridge Ltd Method of extracting a signal from a contaminated signal

Also Published As

Publication number Publication date
AUPO170196A0 (en) 1996-09-12

Similar Documents

Publication Publication Date Title
Elliott et al. New finite-dimensional filters for parameter estimation of discrete-time linear Gaussian models
Xia et al. Adaptive fading Kalman filter with an application
US5812992A (en) Method and system for training a neural network with adaptive weight updating and adaptive pruning in principal component space
Paleologu et al. Adaptive filtering for the identification of bilinear forms
Sulyman et al. Convergence and steady-state analysis of a variable step-size NLMS algorithm
Ding et al. Convergence analysis of estimation algorithms for dual-rate stochastic systems
US5987444A (en) Robust neutral systems
WO2007053831A2 (en) Optimum nonlinear correntropy filter
An et al. Maximum likelihood based multi‐innovation stochastic gradient identification algorithms for bilinear stochastic systems with ARMA noise
Herrera et al. Estimating full lipschitz constants of deep neural networks
Andrianova et al. Robust performance analysis of linear discrete-time systems in presence of colored noise
US6487524B1 (en) Methods and apparatus for designing a system using the tensor convolution block toeplitz-preconditioned conjugate gradient (TCBT-PCG) method
Oku et al. A recursive 4SID from the input‐output point of view
WO1998008167A1 (en) Signal processing method using a finite-dimensional filter
Ishida et al. Multikernel adaptive filters with multiple dictionaries and regularization
CN114614797B (en) Adaptive filtering method and system based on generalized maximum asymmetric correlation entropy criterion
Rontogiannis et al. On inverse factorization adaptive least-squares algorithms
Glowinski et al. Fast operator-splitting algorithms for variational imaging models: Some recent developments
Tandon et al. Partial-Update L∞-Norm Based Algorithms
Elliott et al. On‐line almost‐sure parameter estimation for partially observed discrete‐time linear systems with known noise characteristics
Nerrand et al. Neural network training schemes for non-linear adaptive filtering and modelling.
Thomas et al. Generalized swept approximate message passing based kalman filtering for dynamic sparse Bayesian learning
KRZYŻAK et al. On identification of block orientated systems by non-parametric techniques
Ding et al. Identification of multivariable systems based on finite impulse response models with flexible orders
Billings et al. Rational model data smoothers and identification algorithms

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1998510168

Format of ref document f/p: F

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase