WO1999023576A1

WO1999023576A1 - Method and device for classification of a first time sequence and a second time sequence

Info

Publication number: WO1999023576A1
Application number: PCT/DE1998/003184
Authority: WO
Inventors: Gustavo Deco; Christian Schittenkopf; Rosaria Silipo
Original assignee: Siemens Aktiengesellschaft
Priority date: 1997-11-04
Filing date: 1998-10-30
Publication date: 1999-05-14
Also published as: JP2001522094A; EP1027663A1

Abstract

Time sequences are examined with respect to their statistical dependence. Neural networks are trained in such a way that the probability densities of the time sequences are modelled by the respective neural network. The neural networks are used to determine surrogate time sequences, and measures are determined for the statistical dependence of the samples of the time sequences and of the samples of the surrogate time sequences. These measures are compared with each other, and the dynamics underlying each time sequence are examined as to whether they are described by a Markov process of order $(n_1, \ldots, n_N)$.

Description


Method and device for classifying a first time series and at least a second time series

The invention relates to the classification of a first time series and at least one second time series.

In a wide variety of applications, it is helpful to analyze dynamic systems with regard to their statistical dependency in order to predict the course of a measurement signal.

A given measurement signal x has any number of samples $x_t$, which are sampled with a step size w (see FIG. 2). It is important to determine linear and nonlinear statistical dependencies between the sample values $x_t$. Depending on a predetermined number v of past samples, which are analyzed with regard to their statistical dependency, the information obtained by the analysis is used to predict a number z of future values.
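The sampling and windowing described above can be sketched as follows. This is a minimal illustration, not part of the original disclosure; the function names `sample_signal` and `past_future_pairs` and the sine test signal are assumptions chosen for the example.

```python
import math


def sample_signal(f, w, count):
    """Sample the continuous signal f at t = 0, w, 2w, ... (count samples)."""
    return [f(j * w) for j in range(count)]


def past_future_pairs(x, v, z):
    """Pair each window of v past samples with the z samples that follow it."""
    pairs = []
    for t in range(v, len(x) - z + 1):
        past = x[t - v:t]      # the v samples before time index t
        future = x[t:t + z]    # the z samples to be predicted
        pairs.append((past, future))
    return pairs


# Hypothetical example signal: a sine wave sampled with step size w = 0.5.
x = sample_signal(math.sin, w=0.5, count=10)
pairs = past_future_pairs(x, v=3, z=2)
```

Each pair holds the v past samples that are analyzed and the z future values that a predictor would have to reproduce.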

In the one-dimensional case, i.e. for a single time series $\{x_t\}$, the concept of information flow is known from [1] and [3] for analyzing the statistical dependency. In such a method, the loss of information between past values and values that lie a predeterminable number of steps in the future is determined. From the course of the information flow, the statistical dependence between the sample values and the values that lie in the future can be inferred.

The determination of a surrogate time series for a given time series, i.e. for a number of samples of a signal, and the basics of surrogates can be found in [2] and [6]. A surrogate for a given time series is to be understood as a time series that shares certain statistical properties with the given time series.

An overview of various statistical estimators, for example neural networks, kernel estimators, etc. can be found in [4] and [8].

In the following, a cumulant is to be understood as a coefficient of the Taylor expansion of the logarithm of the Fourier transform of a probability density. Basics about cumulants can be found in [5]. The expansion of characteristic functions in cumulants is also described in [5].

[6] describes the training of a neural network according to the maximum likelihood principle.

In the following, a Markov process of order m is to be understood as a time-discrete random process in which a future value depends only on the values that lie m steps in the past.
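To make the definition concrete, the following sketch simulates a nonlinear Markov process of order m = 2. The map `step`, the noise level, and the starting values are hypothetical choices for illustration only and do not come from the document.

```python
import random


def simulate_markov(step, initial, length, noise_std, rng):
    """Generate a series in which each new value depends only on the
    m = len(initial) most recent values plus Gaussian noise."""
    m = len(initial)
    x = list(initial)
    while len(x) < length:
        past = x[-m:]  # only the last m values enter the update
        x.append(step(past) + rng.gauss(0.0, noise_std))
    return x


def step(past):
    # Hypothetical nonlinear map of order 2 (an assumption for this example).
    return 0.6 * past[-1] - 0.3 * past[-2] ** 2


rng = random.Random(0)
series = simulate_markov(step, [0.1, 0.2], 200, 0.05, rng)
```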

The rank of a time series is further understood to mean the order of the samples of a time series according to the size of the samples.
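The rank ordering can be sketched as below, together with one possible reading of the "modified Gaussian time series" mentioned in claims 5 and 15 (normally distributed random numbers arranged according to the rank of the time series). The helper names `ranks` and `gaussianize` are assumptions for this example.

```python
import random


def ranks(x):
    """Return rank[i], the position of x[i] when the samples are sorted
    by size (0 = smallest sample)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    rank = [0] * len(x)
    for position, index in enumerate(order):
        rank[index] = position
    return rank


def gaussianize(x, rng):
    """Draw len(x) normally distributed numbers and arrange them so that
    they have the same rank order as x."""
    g = sorted(rng.gauss(0.0, 1.0) for _ in x)
    return [g[r] for r in ranks(x)]


x = [0.3, -1.2, 2.5, 0.9]
print(ranks(x))  # [1, 0, 3, 2]
y = gaussianize(x, random.Random(1))
```

The result `y` has Gaussian amplitudes but the same ordering of large and small values as `x`.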

A method for classifying a time series is known from [9], in which a predeterminable number of surrogates is determined for the time series. For the time series and for the surrogates, nonlinear correlations between the values of the time series and the values of the surrogates are determined using a cumulant-based method. The time series is classified based on the nonlinear correlations.

Another method for classifying a time series is known from [10]. In this method, a dynamic system is modeled by its probability density. A neural network is trained on the probabilities of a nonlinear Markov process of order m according to the maximum likelihood principle.

The invention is based on the problem of creating a method and a device with which a classification of a plurality of time series with regard to their statistical dependency of the samples is possible.

The problem is solved by the method according to claim 1 and by the device according to claim 11.

In the method, a nonlinear Markov process is modeled for a first time series by a first statistical estimator. A nonlinear Markov process for the second time series is modeled by a second statistical estimator. At least one surrogate time series is formed for the first time series using the first statistical estimator. At least one surrogate time series is formed for the second time series using the second statistical estimator. A first measure for the statistical dependency of the samples of the first time series and the samples of the second time series is formed for a predetermined number of future samples. Furthermore, a second measure for the statistical dependence of the values of the surrogate time series on one another is formed for a predetermined number of samples lying in the future. A difference measure is formed from the first measure and the second measure. The classification is such that

- if the difference measure is smaller than a predetermined threshold value, the first time series and the second time series are assigned to a first group,

- otherwise the first time series and the second time series are assigned to a second group.

The device has a processor unit which is set up in such a way that a nonlinear Markov process is modeled for a first time series by a first statistical estimator. A nonlinear Markov process for the second time series is modeled by a second statistical estimator. At least one surrogate time series is formed for the first time series using the first statistical estimator. At least one surrogate time series is formed for the second time series using the second statistical estimator. A first measure for the statistical dependency of the samples of the first time series and the samples of the second time series is formed for a predetermined number of future samples. Furthermore, a second measure for the statistical dependence of the values of the surrogate time series on one another is formed for a predetermined number of samples lying in the future. A difference measure is formed from the first measure and the second measure. The classification is such that

- if the difference measure is smaller than a predetermined threshold value, the first time series and the second time series are assigned to a first group,

- otherwise the first time series and the second time series are assigned to a second group.
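The final classification step can be sketched as a simple threshold comparison. How the difference measure is formed from the two measures is not spelled out at this point, so the absolute difference used here is an assumption, as are the function name and the example values.

```python
def classify(first_measure, second_measure, threshold):
    """Return 1 (first group) if the difference measure is below the
    threshold, otherwise 2 (second group)."""
    # Assumption: the difference measure is the absolute difference.
    difference = abs(first_measure - second_measure)
    return 1 if difference < threshold else 2


group_close = classify(0.50, 0.45, threshold=0.1)
group_far = classify(0.90, 0.20, threshold=0.1)
```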

The invention makes it possible for the first time to establish statistical dependency between multidimensional time series, i.e. several time series.

Further developments of the invention result from the dependent claims.

In a further development, it is advantageous to use a nonlinear neural network as a statistical estimator, since a neural network is very well suited for estimating probability densities. The invention can be used in various fields of application. Statistical dependencies between measured signals of an electroencephalogram (EEG) or an electrocardiogram (EKG) can be determined.

The invention can also be used very advantageously for analyzing a financial market, the course of the signal in this case, for example, describing the course of a share or a foreign exchange rate.

An embodiment of the invention is shown in the figures and is explained in more detail below.

FIG. 1 shows a sketch in which the individual elements of the invention are shown;

FIG. 2 shows a sketch of the course of a measurement signal f, which is converted into a time series $\{x_t\}$ by sampling with a step size w;

FIG. 3 shows a sketch of a computer with which the invention is carried out.

A first time series $\{x_t\}$ and a second time series $\{y_t\}$ each have a predeterminable number of samples $x_t$, $y_t$ of a signal, in particular an electrical signal.

The signals are measured by a measuring device MG (see FIG. 3) and fed to a computer R. The method described below is carried out in the computer and the results are fed to a means for further processing WV.

For each time series $\{x_t\}$, $\{y_t\}$, a nonlinear Markov process of order $n_x$, $n_y$ is modeled using a neural network $NN_x$, $NN_y$. With the nonlinear Markov process of the respective order, the information flow of the first time series $\{x_t\}$ and of the second time series $\{y_t\}$ is approximated.

For any number N of time series that can be taken into account in the method, a time series is identified by the following designation:

$$ \{x_t\}_k, \quad k = 1, \ldots, N \tag{1} $$

For each time series $\{x_t\}_k$, a measure of the statistical dependence of a value $\{x_{t+r}\}_k$ (a value that lies r steps in the future) is formed. Here, $n_j$ past samples $(j = 1, \ldots, N)$ are taken into account.

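The characteristic of the system is expressed by the following difference between probability density functions $p(\cdot)$:

$$ p\bigl(\{x_t\}_1, \ldots, \{x_{t-n_1+1}\}_1, \{x_t\}_2, \ldots, \{x_{t-n_2+1}\}_2, \ldots, \{x_t\}_N, \ldots, \{x_{t-n_N+1}\}_N, \{x_{t+r}\}_k\bigr) - p\bigl(\{x_t\}_1, \ldots, \{x_{t-n_1+1}\}_1, \{x_t\}_2, \ldots, \{x_{t-n_2+1}\}_2, \ldots, \{x_t\}_N, \ldots, \{x_{t-n_N+1}\}_N\bigr) \cdot p\bigl(\{x_{t+r}\}_k\bigr) \tag{2} $$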

In the event that the statistical dependencies vanish within r steps into the future, rule (2) yields:

$$ p\bigl(\{x_t\}_1, \ldots, \{x_{t-n_1+1}\}_1, \{x_t\}_2, \ldots, \{x_{t-n_2+1}\}_2, \ldots, \{x_t\}_N, \ldots, \{x_{t-n_N+1}\}_N, \{x_{t+r}\}_k\bigr) = p\bigl(\{x_t\}_1, \ldots, \{x_{t-n_1+1}\}_1, \{x_t\}_2, \ldots, \{x_{t-n_2+1}\}_2, \ldots, \{x_t\}_N, \ldots, \{x_{t-n_N+1}\}_N\bigr) \cdot p\bigl(\{x_{t+r}\}_k\bigr) \tag{3} $$

Each neural network $NN_x$, $NN_y$ is trained to approximate a nonlinear Markov process of order $n_x$, $n_y$ according to the maximum likelihood principle, whose learning rule maximizes the product of the probabilities. The respective neural network $NN_x$, $NN_y$ is thus intended to estimate the conditional probability

$$ p\bigl(x_t \mid x_{t-1}, \ldots, x_{t-n_x}, y_{t-1}, \ldots, y_{t-n_y}\bigr) $$

$$ p\bigl(y_t \mid x_{t-1}, \ldots, x_{t-n_x}, y_{t-1}, \ldots, y_{t-n_y}\bigr) $$

To perform the analysis, higher-order cumulants are used instead of the probability densities. By transforming rule (3) into Fourier space, characteristic functions $\Phi_1^{k,r}$, $\Phi_2^{k,r}$, $\Phi_3^{k,r}$ are formed, which represent the Fourier transforms of the probability densities of rule (3).

Rule (3) becomes:

$$ \ln \Phi_1^{k,r} = \ln \Phi_2^{k,r} + \ln \Phi_3^{k,r} \tag{5} $$

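where, with the (s+1)-dimensional vector

$$ v^{k,r} = \bigl(\{x_{t-n_1+1}\}_1, \ldots, \{x_t\}_1, \{x_{t-n_2+1}\}_2, \ldots, \{x_t\}_2, \ldots, \{x_{t-n_N+1}\}_N, \ldots, \{x_t\}_N, \{x_{t+r}\}_k\bigr) \tag{6} $$

and

$$ s = \sum_{j=1}^{N} n_j \tag{7} $$

the following applies:

$$ \Phi_1^{k,r}(K_1, \ldots, K_s, K_{s+1}) = \int \exp\Bigl(i \sum_{j=1}^{s+1} v_j^{k,r} K_j\Bigr) \, p\bigl(v_1^{k,r}, \ldots, v_{s+1}^{k,r}\bigr) \, dv_1^{k,r} \cdots dv_{s+1}^{k,r} \tag{8} $$

$$ \Phi_2^{k,r}(K_1, \ldots, K_s) = \int \exp\Bigl(i \sum_{j=1}^{s} v_j^{k,r} K_j\Bigr) \, p\bigl(v_1^{k,r}, \ldots, v_s^{k,r}\bigr) \, dv_1^{k,r} \cdots dv_s^{k,r} \tag{9} $$

$$ \Phi_3^{k,r}(K_{s+1}) = \int \exp\bigl(i \, v_{s+1}^{k,r} K_{s+1}\bigr) \, p\bigl(v_{s+1}^{k,r}\bigr) \, dv_{s+1}^{k,r} \tag{10} $$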

$\Phi$ denotes the Fourier transform of the probability densities, and $K_j$ denotes the variables of the function $\Phi(\ldots, K_j, \ldots)$ in Fourier space. Here, $i = \sqrt{-1}$.

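If one expands the characteristic functions in cumulants, as described in [5], it follows that

$$ \ln \Phi_1^{k,r} = \sum_{j=1}^{\infty} \frac{i^j}{j!} \sum_{i_1, \ldots, i_j = 1}^{s+1} \kappa^{(k)}_{i_1 \ldots i_j} K_{i_1} \cdots K_{i_j} \tag{11} $$

$$ \ln \Phi_2^{k,r} = \sum_{j=1}^{\infty} \frac{i^j}{j!} \sum_{i_1, \ldots, i_j = 1}^{s} \kappa^{(k)}_{i_1 \ldots i_j} K_{i_1} \cdots K_{i_j} \tag{12} $$

$$ \ln \Phi_3^{k,r} = \sum_{j=1}^{\infty} \frac{i^j}{j!} \sum_{i_1, \ldots, i_j = s+1} \kappa^{(k)}_{i_1 \ldots i_j} K_{i_1} \cdots K_{i_j} \tag{13} $$

The respective cumulant is denoted by $\kappa^{(k)}_{i_1 \ldots i_j}$. Inserting the expansion into rule (5) results in the following condition:

$$ \sum_{j=1}^{\infty} \frac{i^j}{j!} \sum_{\substack{i_1, \ldots, i_j = 1 \\ B}}^{s+1} \kappa^{(k)}_{i_1 \ldots i_j} K_{i_1} \cdots K_{i_j} = 0 \tag{14} $$

with the restriction

$$ B \equiv \bigl(\exists\, i_a : i_a = s+1\bigr) \wedge \neg\bigl(\forall\, i_a : i_a = s+1\bigr) \tag{15} $$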

This condition applies in the event that all kinds of statistical dependencies vanish within r steps into the future.

In the case of statistical independence between the sample values, rule (14) can be simplified to the following rule:

$$ \kappa^{(k)}_{i_1 \ldots i_j} = 0 \quad \forall\, i_1, \ldots, i_j \in \{1, \ldots, s+1\} \tag{16} $$
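The role of higher-order cumulants can be illustrated with sample estimates. The sketch below is an illustration under stated assumptions, not the estimator of the document: it uses the fact that for centered variables the second- and third-order joint cumulants equal the corresponding central moments, and it uses synthetic Gaussian data. The function names are hypothetical.

```python
import random


def centered(values):
    """Subtract the sample mean from a list of samples."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cumulant2(a, b):
    """Second-order joint cumulant (covariance) of two centered sample lists."""
    return sum(p * q for p, q in zip(a, b)) / len(a)


def cumulant3(a, b, c):
    """Third-order joint cumulant of three centered sample lists
    (equal to the third-order central moment)."""
    return sum(p * q * r for p, q, r in zip(a, b, c)) / len(a)


rng = random.Random(42)
n = 20000
u = centered([rng.gauss(0.0, 1.0) for _ in range(n)])
v = centered([rng.gauss(0.0, 1.0) for _ in range(n)])  # independent of u
w = centered([value ** 2 for value in u])              # nonlinear function of u

linear_uv = cumulant2(u, v)        # near 0: independent
linear_uw = cumulant2(u, w)        # near 0: no linear correlation
nonlinear_uw = cumulant3(u, u, w)  # clearly nonzero: nonlinear dependency
```

Here `w` is uncorrelated with `u` at second order, yet the third-order cumulant reveals the dependency, which is why cumulants beyond the covariance are needed to detect nonlinear dependencies.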

A measure is formed for the statistical dependency between the sample values $v_1^{k,r}, \ldots, v_{s+1}^{k,r}$ of the respective time series.

The statistical dependency of the samples of the time series $\{x_t\}_k$ that lie r steps in the future is given by a measure $m_{kk}(r)$.

The statistical dependency of samples of different time series on each other is described by a further measure $m_{kj}(r)$.

The measure $m_{kj}(r)$ represents a cumulant-based characterization of the information flow of the dynamic system from which the time series $\{x_t\}_k$ are determined, and quantifies the dependencies of the value $\{x_{t+r}\}_k$ on the $n_j$ past values of the time series $\{x_t\}_j$.

In the event that a further measure $m_{kj}(r)$ assumes the value 0, the samples of the respective time series are statistically independent of one another. Growing positive values of the respective further measure $m_{kj}(r)$ indicate an increasing statistical dependency among the samples of the respective time series.

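The measure $m_{kk}(r)$ is formed according to the following rule:

$$ m_{kk}(r) = \sum_{\ldots} \bigl(\kappa^{(k)}_{\ldots}\bigr)^2 + \ldots + \sum_{\ldots} \bigl(\kappa^{(k)}_{\ldots}\bigr)^2 $$

with $k = 1, \ldots, N$,

$$ L(k) = \sum_{j=1}^{k-1} n_j \tag{19} $$

and

$$ U(k) = \sum_{j=1}^{k} n_j \tag{20} $$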

To model the Markov process of order $(n_1, \ldots, n_N)$, a two-layer feedforward neural network is trained to approximate the probability densities

$$ p\bigl(\{x_{t+1}\}_k \mid \{x_t\}_1, \ldots, \{x_{t-n_1+1}\}_1, \ldots, \{x_t\}_N, \ldots, \{x_{t-n_N+1}\}_N\bigr). $$

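According to the following rules, a neural network is trained for each time series $\{x_t\}_k$ in such a way that the neural network estimates the respective probability density of the Markov process of order $(n_1, \ldots, n_N)$:

$$ p\bigl(\{x_{t+1}\}_k \mid \{x_t\}_1, \ldots, \{x_{t-n_1+1}\}_1, \ldots, \{x_t\}_N, \ldots, \{x_{t-n_N+1}\}_N\bigr) = \sum_{h=1}^{H} u_h^k \, \frac{1}{\sqrt{2\pi (\sigma_h^k)^2}} \exp\left(-\frac{\bigl(\{x_{t+1}\}_k - \mu_h^k\bigr)^2}{2 (\sigma_h^k)^2}\right), \quad k = 1, \ldots, N \tag{22} $$

$$ u_h^k = \ldots \tag{23} $$

$$ \mu_h^k = \ldots \tag{24} $$

$$ (\sigma_h^k)^2 = \ldots \tag{25} $$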

Here, H denotes the number of Gaussian functions, I the number of hidden neurons in the respective neural network, and $v_{hi}$, $w$ the parameters of the respective neural network. L(k) is defined according to rule (19). The conditional probability density is therefore described by a weighted sum of normal distributions, whose weights $u_h^k$, mean values $\mu_h^k$, and variances $(\sigma_h^k)^2$ are the output variables of three different neural networks. These networks are multilayer perceptrons (cf. rules (23) to (25)) that receive the first s components of the vector $v^{k,r}$ as an s-dimensional input variable. The training takes place according to the maximum likelihood principle, which is described in [7].
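The weighted sum of normal distributions and the maximum likelihood objective can be sketched as follows. This is a minimal illustration: the callable `params_for`, which stands in for the three networks that output weights, means, and variances, is a hypothetical placeholder, and no network training is shown.

```python
import math


def mixture_density(x, weights, means, variances):
    """Evaluate a weighted sum of H normal densities at the point x."""
    total = 0.0
    for u, mu, var in zip(weights, means, variances):
        total += u * math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return total


def negative_log_likelihood(samples, params_for):
    """Sum of -log p(next value | past) over (past, next value) pairs.

    params_for(past) is a hypothetical stand-in for the networks and must
    return (weights, means, variances) for the given past values.
    Minimizing this quantity maximizes the product of the probabilities.
    """
    total = 0.0
    for past, x_next in samples:
        weights, means, variances = params_for(past)
        total -= math.log(mixture_density(x_next, weights, means, variances))
    return total


def constant_params(past):
    # Assumption for the example: one standard normal component, independent of the past.
    return [1.0], [0.0], [1.0]


nll = negative_log_likelihood([([0.0], 0.0)], constant_params)
```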

N neural networks are trained, and N conditional probability densities for an N-dimensional Markov process of order $(n_1, \ldots, n_N)$ are estimated.

After the training is complete, the neural networks are able to generate new time series according to the Markov processes of the original time series $\{x_t\}_k$, starting with the first $n_k$ values of each time series $\{x_t\}_k$.

The first s values $\{x_1\}_1, \ldots, \{x_{n_1}\}_1, \ldots, \{x_1\}_N, \ldots, \{x_{n_N}\}_N$ are fed to the neural networks, each of which simulates a conditional probability density. According to the so-called Monte Carlo method, new values $\{\tilde{x}_1\}_k$, $k = 1, \ldots, N$, are drawn according to the respective probability densities, and these values are used as new values of a new time series $\{\tilde{x}_t\}_k$.

A new s-tuple of input values

$$ \{x_2\}_1, \ldots, \{x_{n_1}\}_1, \{\tilde{x}_1\}_1, \ldots, \{x_2\}_N, \ldots, \{x_{n_N}\}_N, \{\tilde{x}_1\}_N $$

is in turn fed to the respective neural network. This feedback of newly generated values is repeated as often as required. The new time series $\{\tilde{x}_t\}_k$ are processed in such a way that they have the same distribution as the original time series $\{x_t\}_k$, and thus form the surrogate time series $\{\tilde{x}_t\}_k$, $k = 1, \ldots, N$. By repeating this procedure, a predefinable, arbitrary number M of surrogate time series $\{\tilde{x}_t\}_k^i$, $k = 1, \ldots, N$, $i = 1, \ldots, M$, is generated (step 101).
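The feedback loop for generating surrogates can be sketched as below. Several points are assumptions made for this example: the sampler `sample_next` is a hypothetical stand-in for drawing from the trained conditional densities, all series are seeded with the first max(n_1, ..., n_N) original values so that they stay aligned in time, and the final adjustment of the surrogate distribution to that of the original series is not shown.

```python
import random


def generate_surrogates(originals, orders, sample_next, length, rng):
    """Generate one surrogate per time series by feeding drawn values back.

    originals   -- list of N original series
    orders      -- list [n_1, ..., n_N] of past values used per series
    sample_next -- hypothetical stand-in: sample_next(k, windows, rng) draws
                   the next value of series k given the past windows
    """
    start = max(orders)  # assumption: seed all series with the same number of values
    surrogates = [list(series[:start]) for series in originals]
    while len(surrogates[0]) < length:
        # Last n_j values of each surrogate (an empty window if n_j == 0).
        windows = [s[len(s) - n:] for s, n in zip(surrogates, orders)]
        new_values = [sample_next(k, windows, rng) for k in range(len(surrogates))]
        for s, value in zip(surrogates, new_values):
            s.append(value)  # feedback: the drawn value becomes part of the past
    return surrogates


def sample_next(k, windows, rng):
    # Hypothetical conditional density: Gaussian around a damped average of the past.
    values = [v for window in windows for v in window]
    mean = 0.5 * sum(values) / len(values) if values else 0.0
    return rng.gauss(mean, 0.1)


originals = [[0.1, 0.2, 0.3, 0.4], [1.0, 0.9, 0.8, 0.7]]
rng = random.Random(7)
M = 10
all_surrogates = [generate_surrogates(originals, [2, 1], sample_next, 50, rng) for _ in range(M)]
```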

The time series $\{x_t\}_k$ and the surrogate time series $\{\tilde{x}_t\}_k^i$ are subjected to a statistical test, which is described in [6] (step 102).

The null hypothesis states that the respective dynamic system can be described with sufficient accuracy by an s-dimensional Markov process of order $(n_1, \ldots, n_N)$.

The so-called Student t-test is used to test the null hypothesis. For the Student t-test, a surrogate measure $\tilde{m}_k^i(r)$ is determined for each surrogate time series. The surrogate measures $\tilde{m}_k^i(r)$ are determined in the same way as the measure $m_k(r)$ for the statistical dependency of the time series $\{x_t\}_k$. The dependence on r results from the last component of the vector $v^{k,r}$. The statistical dependencies between the last $n_k$ values of each time series $\{x_t\}_k$ and the value that lies r steps in the future are measured.

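From the surrogate measures $\tilde{m}_k^i(r)$, surrogate mean values $\mu_k(r)$ and surrogate standard deviations $\sigma_k(r)$ are determined according to the following rules:

$$ \mu_k(r) = \frac{1}{M} \sum_{i=1}^{M} \tilde{m}_k^i(r) $$

$$ \sigma_k(r) = \sqrt{\frac{1}{M-1} \sum_{i=1}^{M} \bigl(\tilde{m}_k^i(r) - \mu_k(r)\bigr)^2} $$

The Student t-test is carried out in accordance with the following rule:

$$ t_k(r) = \sqrt{\frac{M}{M+1}} \; \frac{m_k(r) - \mu_k(r)}{\sigma_k(r)} $$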

By forming a test value $t_k(r)$, a value is determined with which the measure $m_k(r)$ of the original time series $\{x_t\}_k$ is compared with the surrogate time series. This value is also referred to as the significance $t_k(r)$. If the significance $t_k(r)$ is less than 1.833 (for M = 10) for all r, for example 1 < r < 10, the null hypothesis is accepted.
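The significance test can be sketched as follows. The exact test statistic is an assumption here: the sketch uses the common form t = sqrt(M / (M + 1)) * (m - mu) / sigma with the sample standard deviation (M - 1 in the denominator), which is consistent with the critical value 1.833 for M = 10 but is not confirmed by the text. The function names and the example numbers are hypothetical.

```python
import math
import statistics


def significance(original_measure, surrogate_measures):
    """Test value comparing the measure of the original series with the
    measures of M surrogate series (assumed form, see lead-in)."""
    m_count = len(surrogate_measures)
    mu = statistics.mean(surrogate_measures)
    sigma = statistics.stdev(surrogate_measures)  # sample standard deviation
    return math.sqrt(m_count / (m_count + 1)) * (original_measure - mu) / sigma


def accept_null(original_by_r, surrogates_by_r, critical=1.833):
    """Accept the null hypothesis only if the significance stays below the
    critical value for every look-ahead r."""
    return all(
        significance(m, s) < critical
        for m, s in zip(original_by_r, surrogates_by_r)
    )


# Hypothetical measures of M = 10 surrogates for a single look-ahead r.
surrogate_measures = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 1.0, 0.9, 1.1, 1.0]
accepted = accept_null([1.05], [surrogate_measures])
rejected = accept_null([1.5], [surrogate_measures])
```

With these numbers, an original measure of 1.05 lies within the spread of the surrogates and the hypothesis is accepted, whereas 1.5 lies far outside and the hypothesis is rejected.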

Illustratively, the null hypothesis states:

The time series $\{x_t\}_k$ can be described by a multi-dimensional Markov process in which the last $n_1$ values of the time series $\{x_t\}_1$, ..., and the last $n_N$ values of the time series $\{x_t\}_N$ are taken into account. If the null hypothesis is accepted, the examined time series $\{x_t\}_k$ are classified as time series of a first group, which can be described by the Markov process of the respective order $(n_1, \ldots, n_N)$ (step 103), and the process is ended.

If the null hypothesis is rejected, the order $(n_1, \ldots, n_N)$ of the Markov process is increased and the process is repeated, starting with the training of the neural networks (step 104).

If there is no detailed knowledge of the dynamic system examined, the following start values are used: $n_1 = 1$, $n_j = 0$ for $j = 2, \ldots, N$.

The number of time series examined is arbitrary; with two time series, only two neural networks are required.

The following publications have been cited in this document:

[1] C. Schittenkopf and G. Deco, Exploring the Intrinsic Information Loss in Single-Humped Maps by Refining Multi-Symbol Partitions, Physica D, 94, pp. 57-64, 1996

[2] J. Theiler et al., Testing for Nonlinearity in Time Series: The Method of Surrogate Data, Physica D, 58, pp. 77-94, 1992

[3] G. Deco, C. Schittenkopf and B. Schürmann, Determining the Information Flow of Dynamical Systems from Continuous Probability Distributions, Physical Review Letters, 78, pp. 2345-2348, 1997

[4] B. W. Silverman, Density Estimation for Statistics and Data Analysis, Chapman and Hall, ISBN 0-412-24620-1, pp. 34-94, London, 1986

[5] C. Gardiner, Handbook of Stochastic Methods, 2nd edition, Springer Verlag, ISBN 0-387-11357-6, pp. 33-36, New York, 1985

[6] C. Schittenkopf and G. Deco, Testing Nonlinear Markovian Hypotheses in Dynamical Systems, Physica D, 104, pp. 61-74, 1997

[7] A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 1991, ISBN 0-07-100870-5, pp. 260-263

[8] C. M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, pp. 33-50, ISBN 0-19-85364-2

[9] DE 196 08 734 C1

[10] DE 196 43 918 C1

Claims


1. Method for classifying, by a computer, a first time series and a second time series, each of which has a predetermined number of samples,

- in which a nonlinear Markov process for the first time series is modeled by a first statistical estimator,

- in which a nonlinear Markov process for the second time series is modeled by a second statistical estimator,

- in which at least one surrogate time series is formed for the first time series using the first statistical estimator,

- in which at least one surrogate time series is formed for the second time series using the second statistical estimator,

- in which a first measure for the statistical dependency of the samples of the first time series and the samples of the second time series is formed for a predetermined number of future samples,

- in which a second measure for the statistical dependence of the values of the surrogate time series on one another is formed for a predetermined number of samples lying in the future,

- in which a difference measure is formed from the first measure and the second measure, and

- in which the classification is such that

- if the difference measure is smaller than a predetermined threshold value, the first time series and the second time series are assigned to a first group,

- otherwise the first time series and the second time series are assigned to a second group.

2. The method according to claim 1, in which several second time series are taken into account.

3. The method of claim 1 or 2, in which a plurality of surrogate time series are formed for the first time series and the second time series using the first statistical estimator and the second statistical estimator.

4. The method according to any one of claims 1 to 3, in which a neural network is used as a statistical estimator.

5. The method according to any one of claims 1 to 4, in which, instead of the first time series and/or the second time series and/or at least one further time series, a modified Gaussian time series is used, which is formed in that normally distributed random numbers are determined and these are sorted according to the rank of the time series.

6. The method according to any one of claims 1 to 5, wherein the first measure is formed according to the following rule:

$$ m_{kk}(r) = \sum_{\ldots} \bigl(\kappa^{(k)}_{\ldots}\bigr)^2 + \ldots + \sum_{\ldots} \bigl(\kappa^{(k)}_{\ldots}\bigr)^2 $$

with $k = 1, \ldots, N$,

$$ L(k) = \sum_{j=1}^{k-1} n_j $$

and

$$ U(k) = \sum_{j=1}^{k} n_j $$

where

- i, j, k denote positive integers,

- $\kappa$ denotes cumulants,

- $n_j$ denotes the number of values of the time series $\{x_t\}_j$ used from the past.

7. The method according to any one of claims 1 to 6, in which, in the event that the first time series and the second time series are assigned to the second group, the method is repeated iteratively until the first time series and the second time series are assigned to the first group, wherein the order of the Markov process is increased in each iteration.

8. The method according to any one of claims 1 to 7, in which the course of a financial market is described by the first time series and the second time series.

9. The method according to any one of claims 1 to 7, in which the course of an electroencephalogram is described by the first time series and the second time series.

10. The method according to any one of claims 1 to 7, in which the course of an electrocardiogram is described by the first time series and the second time series.

11. Device for classifying a first time series and at least one second time series, each of which has a predetermined number of samples, with a processor unit which is set up in such a way that

- a nonlinear Markov process for the first time series is modeled by a first statistical estimator,

- a nonlinear Markov process for the second time series is modeled by a second statistical estimator,

- at least one surrogate time series is formed for the first time series using the first statistical estimator,

- at least one surrogate time series is formed for the second time series using the second statistical estimator,

- a first measure for the statistical dependency of the samples of the first time series and the samples of the second time series is formed for a predetermined number of future samples,

- a second measure for the statistical dependence of the values of the surrogate time series on one another is formed for a predetermined number of samples lying in the future,

- a difference measure is formed from the first measure and the second measure, and

- the classification takes place in such a way that

- if the difference measure is smaller than a predetermined threshold value, the first time series and the second time series are assigned to a first group,

- otherwise the first time series and the second time series are assigned to a second group.

12. The apparatus of claim 11, wherein the processor unit is set up such that further time series are taken into account.

13. The apparatus of claim 11 or 12, wherein the processor unit is set up such that a plurality of surrogate time series are formed for each of the first time series and the second time series using the statistical estimator.

14. The device according to any one of claims 11 to 13, wherein the processor unit is set up such that a neural network is used as a statistical estimator.

15. Device according to one of claims 11 to 14, wherein the processor unit is set up in such a way that, instead of the first time series and/or the second time series and/or at least one further time series, a modified Gaussian time series is used, which is formed in that normally distributed random numbers are determined and these are sorted according to the rank of the time series.

16. The device according to one of claims 11 to 15, wherein the first measure is formed according to the following rule:

$$ m_{kk}(r) = \sum_{\ldots} \bigl(\kappa^{(k)}_{\ldots}\bigr)^2 + \ldots + \sum_{\ldots} \bigl(\kappa^{(k)}_{\ldots}\bigr)^2 $$

with $k = 1, \ldots, N$,

$$ L(k) = \sum_{j=1}^{k-1} n_j $$

and

$$ U(k) = \sum_{j=1}^{k} n_j $$

where

- i, j, k denote positive integers,

- $\kappa$ denotes cumulants,

- $n_j$ denotes the number of values of the time series $\{x_t\}_j$ used from the past.

17. The device according to one of claims 11 to 16, in which the processor unit is set up such that in the event that the first time series and the second time series are assigned to the second group, the method is repeated iteratively until the first time series and the second time series are assigned to the first group, wherein the order of the Markov process is increased in each iteration.

18. A device according to any one of claims 11 to 17, wherein the processor unit is set up such that price trends of a financial market are described by the first time series and the second time series.

19. Device according to one of claims 11 to 18, wherein the processor unit is set up in such a way that the course of an electroencephalogram is described by the first time series and the second time series.

20. Device according to one of claims 11 to 19, wherein the processor unit is set up such that the course of an electrocardiogram is described by the first time series and the second time series.