CN109088770B

CN109088770B - Electromechanical system interactive network modeling method based on self-adaptive symbol transfer entropy

Info

Publication number: CN109088770B
Application number: CN201810954284.6A
Authority: CN
Inventors: 高建民; 谢军太; 高智勇; 姜洪权; 陈琨; 冯龙飞
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2018-08-21
Filing date: 2018-08-21
Publication date: 2020-03-31
Anticipated expiration: 2038-08-21
Also published as: CN109088770A

Abstract

The invention discloses an interactive network modeling method for complex electromechanical systems in the process industry, based on adaptive symbol transfer entropy. Common parameters for time-series symbolization are obtained by multivariate phase-space reconstruction. The probability density and distribution of the original time series are estimated by adaptive kernel density estimation, and the series is partitioned into equiprobable intervals. A coarse-grained symbolic representation of the original series is obtained by determining the optimal number of symbols and the corresponding partition intervals, which improves the accuracy of the interaction information measure between variables. Transfer entropy analysis is then performed on the symbol sequences of the monitored variables and the net information transfer is calculated, yielding the basic parameters required for modeling the system interaction network, so that a network model reflecting the underlying interaction mechanism of the actual system is established. The network model provides a basis for system state evaluation, fault propagation analysis, and diagnostic decisions, thereby improving the scientific and intelligent level of decision-making for the safe and reliable operation of complex electromechanical systems in the process industry under complex working conditions.

Description

Electromechanical system interactive network modeling method based on self-adaptive symbol transfer entropy

Technical Field

The invention relates to the field of complex electromechanical system service safety state evaluation, in particular to an electromechanical system interactive network modeling method based on self-adaptive symbol transfer entropy.

Background

A process industry production system contains many kinds of production equipment and requires various auxiliary systems. Its structural units continuously exchange material, information, and energy, and its internal coupling is strong, so it is a distributed complex electromechanical system. Complex network theory is currently an important tool for studying the structure, function, and dynamic behavior of complex systems. Network modeling is an important means of modeling complex systems and is one of the earliest and most productive research directions in the field of complex networks. Among the many network modeling methods, a topic of general interest is how to make full use of the observation data of a complex system to obtain the information flow among different variables and to construct a complex network model that reflects the dynamic evolution of the system. The core of constructing such an information flow network model is the accurate measurement of the information flow between variables, which comprises two important indexes: direction and strength.

Transfer entropy is a model-free, non-parametric statistical method that measures the directed information transfer between two random processes, and it is an important method for measuring information flow in nonlinear systems. Schreiber first introduced transfer entropy to measure the information flow of nonlinear systems. Various techniques have since been used to estimate transfer entropy from system observation sequences. Symbolization, which converts an original time series into a series of discrete symbols, is the basis of symbolic time series analysis. In many cases, the degree of discretization can be quite severe. Wessel N. This global approach has drawbacks in extracting local detail and in real-time performance. Staniek improved the original transfer entropy by applying the permutation entropy symbolization method, proposed symbolic transfer entropy for the first time, and reduced the influence of noise on the observation sequence by neglecting fine structural detail. On this basis, Papana et al. introduced partial symbolic transfer entropy, analyzed the directional causal relationships among the components of a multivariate system, and provided a direct causality identification framework for non-stationary time series, but still adopted the permutation entropy symbolization method.

These improvements to the original transfer entropy have greatly promoted the application of transfer entropy to noisy data from real complex systems. However, the original symbolic transfer entropy is based on the permutation entropy symbolization principle, which uses only the rank order of the vector elements in the time-series phase space as the symbol. This is a coarse symbolization method that may lose the structural information of the original time series and thereby affect the accurate measurement of the interaction information between variables. Therefore, ensuring that the symbol sequence expresses the structural information of the original sequence as fully as possible while improving noise immunity is the core issue in research on symbolic transfer entropy and its improved variants.

Disclosure of Invention

The invention aims to provide an electromechanical system interactive network modeling method based on self-adaptive symbol transfer entropy so as to overcome the defects of the prior art.

In order to achieve the purpose, the invention adopts the following technical scheme:

A modeling method for an electromechanical system interaction network based on adaptive symbol transfer entropy, characterized in that: the common symbolization parameters of the original time series are obtained on the basis of multivariate phase-space reconstruction; the probability density and probability distribution of the original time series are estimated by adaptive kernel density estimation; the original time series is partitioned according to the equal probability division principle; the optimal number of symbols and the partition intervals are obtained by balancing the loss of structural information of the symbol sequence relative to the original time series against its noise immunity; the original time series is given a coarse-grained symbolic representation; transfer entropy analysis is performed on the resulting symbol sequences; and the net information transfer is calculated to obtain the basic parameters required for modeling the system interaction network, thereby establishing a network model that reflects the underlying interaction mechanism of the actual system.

Further, a variable set of the monitoring targets of the complex electromechanical system to be analyzed is selected in order to obtain the common symbolization parameters of the original time series. The acquired original time-series data set consists of N monitoring variables i. Each monitoring variable is denoised by the wavelet packet method to obtain the denoised time series, and the signal-to-noise ratio SNR_in of the series before denoising is calculated. The embedding dimension m and the delay time τ of each pair of monitoring variables are calculated by a multivariate phase-space reconstruction method and used as the common symbolization parameter set (m, τ) of that pair.

Further, for each monitoring variable i, the probability density function f_i(x) of the denoised monitoring data is obtained by adaptive kernel density estimation, and the probability distribution F_i(x) is obtained from the probability density function f_i(x).

Further, the obtained probability distribution F_i(x) is partitioned into equiprobable intervals according to the equal probability division principle and, in combination with the obtained common parameter set (m, τ), the symbolization parameters of each monitoring variable are determined by optimization to obtain the symbolized sequence of the time series; transfer entropy analysis is carried out on the symbol sequences of each pair of monitoring variables to obtain the net information transfer TE_i^net between each pair of monitoring variables.

Taking each monitoring variable as a node v_i ∈ V, the information transfer relation between monitoring variables as an edge e_i ∈ E, and the net information transfer TE_i^net as the edge weight w_i ∈ W, a network model M_net = (V, E, W) reflecting the underlying interaction mechanism of the complex electromechanical system is established, thereby completing the modeling of the interaction network of the complex electromechanical system in the process industry.

Furthermore, information transfer analysis is performed on the symbol sequences obtained after adaptive symbolization for each pair of variables, and the net information transfer TE_i^net between each pair of monitored variables is obtained.

The transfer entropy between the monitoring variables is expressed as:

$$TE_{Y \to X} = \sum p(\hat{x}_{i+\delta}, \hat{x}_i, \hat{y}_i) \log \frac{p(\hat{x}_{i+\delta} \mid \hat{x}_i, \hat{y}_i)}{p(\hat{x}_{i+\delta} \mid \hat{x}_i)}$$

where $TE_{Y \to X}$ is the information transfer entropy from Y to X, $\hat{x}_i$ and $\hat{y}_i$ are the i-th values of sequences X and Y after adaptive symbolization, and δ is the time delay between sequences X and Y.

Further, the information transfer entropy from X to Y, $TE_{X \to Y}$, is obtained in the same way, and the net information transfer $TE_i^{net}$ is given by:

$$TE_i^{net} = TE_{Y \to X} - TE_{X \to Y}$$

The sign of TE_i^net is taken as the direction of the directed edge in the system network model: "+" indicates that the information transfer direction is Y → X and "-" indicates that it is X → Y; TE_i^net serves as the weight w_i of the directed edge in the system network model.

Further, the parameter characterizing the anti-noise performance of symbolization is determined: the noise performance of the system is quantified by the noise factor N_F, expressed as:

$$N_F = \frac{SNR_{in}}{SNR_{out}}$$

where SNR_in is the input signal-to-noise ratio and SNR_out is the output signal-to-noise ratio;

subject to the information entropy H(q) of the symbol sequence formed by adaptive symbolization satisfying H(q) > H_L, where H_L is a given lower limit of information, the optimization target is to minimize the noise introduced by the symbolization process, that is, to minimize the noise factor N_F of the symbolized system, which yields the optimal symbol set size q_opt. The optimization model of this process is:

$$q_{opt} = \arg\min_{q} N_F(q) \quad \text{s.t.} \quad H(q) > H_L$$

The size q of the symbol set S output when the optimization ends is the size q_opt of the optimal symbol set S_opt of the symbolization process, and the resulting symbol set S_opt can be expressed as:

S_opt＝[0,1,…,i,…,q_opt-2,q_opt-1]；

The monitoring time-series samples are input into the optimization model to obtain the size q_opt of the optimal symbol set S_opt, and the set P of time-series threshold-space division points corresponding to q_opt in the optimization process is output as the optimal division point set P_opt, expressed as:

$$P_{opt} = [P_0, P_1, \ldots, P_i, \ldots, P_{q_{opt}}]$$

Once the optimal division point set P_opt is obtained, the value space of the original time series is divided into q_opt regions, where the interval from P_i to P_{i+1} is one region and the probability of each region is 1/q_opt; the symbolization function is:

$$s(x) = i, \quad P_i \le x < P_{i+1}, \quad i = 0, 1, \ldots, q_{opt} - 1$$

The original time series can be converted into a symbolized time series by the threshold function in the above formula.

Compared with the prior art, the invention has the following beneficial technical effects:

The invention provides an electromechanical system interaction network modeling method based on adaptive symbol transfer entropy. Common parameters for time-series symbolization are obtained on the basis of multivariate phase-space reconstruction; the probability density and distribution of the original time series are estimated by adaptive kernel density estimation; and the series is partitioned according to the equal probability division principle. By balancing the loss of structural information of the symbol sequence relative to the original time series against its noise immunity, the optimal number of symbols and the partition regions are selected through continued optimization, giving a coarse-grained symbolic representation of the original series and improving the accuracy of the interaction information measure between variables. On this basis, transfer entropy analysis is performed on the symbol sequences of each pair of variables and the net information transfer is calculated, yielding the basic parameters required for modeling the system interaction network, so that a network model reflecting the underlying interaction mechanism of the actual system is established. The network model provides a basis for system state evaluation, fault propagation analysis, and diagnostic decisions, thereby improving the scientific and intelligent level of decision-making for the safe and reliable operation of complex electromechanical systems in the process industry under complex working conditions.

Furthermore, each time-series pair uses the optimal embedding dimension m and time delay τ to ensure that the maximum information flow between the variable pair is detected, and the complexity of the probability calculation in the transfer entropy analysis is greatly reduced by arbitrary-base coding adapted to the size of the symbol set, together with decimal decoding, thereby improving the accuracy and efficiency of the information transfer measure between variable pairs.

Drawings

FIG. 1 is a schematic diagram of time series equiprobable symbol partitioning; fig. 1(a) is a raw signal, fig. 1(b) and 1(c) are a frequency histogram and a probability density curve based on AKDE, and fig. 1(d) is a cumulative probability density distribution curve.

FIG. 2 is a sample trend graph of Lorenz system variables X and Y;

FIG. 3 is the influence of the number of symbols on the entropy of the sequence information during the symbolization process;

FIG. 4 shows the variation of NF with the number of symbols under different SNR;

FIG. 5 is a comparison of permutation entropy symbolization and adaptive symbolization for variables X and Y, respectively; FIG. 5(a) is a diagram of monitoring variable x being encoded and decoded by using a permutation entropy symbolization method; fig. 5(b) is a diagram of adaptive coding and decoding method for the monitoring variable x, fig. 5(c) is a diagram of coding and decoding for the monitoring variable y by permutation entropy coding method, and fig. 5(d) is a diagram of adaptive coding and decoding method for the monitoring variable y;

FIG. 6 is a graph comparing the TE, STE and ASTE calculated information transfer variation trends under different sliding windows when the noise is 20 dB;

FIG. 7 is a graph of the trend of TE, STE and ASTE calculated net information transfer as a function of sequence length;

FIG. 8 is a comparison graph of the monitored variable 11 before and after wavelet de-noising; FIG. 8a is a diagram of the effect before noise reduction of the monitored variable, and FIG. 8b is a diagram of the effect after noise reduction of the monitored variable;

FIG. 9 is a comparison of signal-to-noise ratios of various variables of the compressor unit after wavelet de-noising;

fig. 10 shows TE, STE and ASTE information transfer entropy comparisons: FIG. 10(a) variable 11 as the source node; FIG. 10(b) variable 11 as the target node;

FIG. 11 is a comparison graph of net information transfer entropy obtained by three methods, TE, STE and ASTE: FIG. 11(a) variable 11 as the source node; FIG. 11(b) variable 11 as the target node;

FIG. 12 is a system interaction network model when a compressor unit is in normal service.

FIG. 13 is a flow chart of the system of the present invention.

Detailed Description

The invention is described in further detail below with reference to the accompanying drawings:

As shown in FIG. 13, the invention provides an electromechanical system interaction network modeling method based on adaptive symbol transfer entropy. The common symbolization parameters of the original time series are obtained on the basis of multivariate phase-space reconstruction; the probability density and probability distribution of the original time series are estimated by adaptive kernel density estimation; and the original time series is partitioned according to the equal probability division principle. By balancing the loss of structural information of the symbol sequence relative to the original series against its noise immunity, the optimal number of symbols and the partition regions are selected through continued optimization, which improves the accuracy of the interaction information measure between variables. Transfer entropy analysis is then performed on the symbol sequences of the original time series (for each pair of monitoring variables) and the net information transfer is calculated to obtain the basic parameters required for modeling the system interaction network, thereby establishing a network model that reflects the underlying interaction mechanism of the actual system.

The complex electromechanical system interactive network modeling method based on the self-adaptive symbol transfer entropy specifically comprises the following steps:

Step 1) Monitoring data and preprocessing. A variable set of the monitoring targets of the complex electromechanical system to be analyzed is selected in order to obtain the common symbolization parameters of the original time series. The acquired original time-series data set consists of N monitoring variables i. Each monitoring variable is denoised by the wavelet packet method to obtain the denoised time series, and the signal-to-noise ratio SNR_in of the series before denoising is calculated.

Step 2) Calculation of the common symbolization parameters of the monitoring variables based on multivariate phase-space reconstruction. The embedding dimension m and the delay time τ of each pair of monitoring variables are calculated by a multivariate phase-space reconstruction method and used as the common symbolization parameter set (m, τ) of that pair of monitoring variables.

Step 3) Adaptive kernel density estimation of the monitoring data samples. For each monitoring variable i, the probability density function f_i(x) of the denoised sample data is obtained by adaptive kernel density estimation, and the probability distribution F_i(x) is obtained from the probability density function f_i(x).

Step 4) Optimization of the symbolization parameters of each monitoring variable in a time-series pair. The probability distribution F_i(x) estimated in step 3) is partitioned into equiprobable intervals according to the equal probability division principle and, in combination with the common parameter set (m, τ) obtained in step 2), the symbolization parameters of each monitoring variable are determined by optimization to obtain the symbolized sequence of the time series. The core of determining the symbolization parameters of each monitoring variable is to determine the size of the symbol set S and the set P of threshold-space division points of the sequence.

Step 5) Information transfer analysis between the symbol sequences of each pair of monitoring variables. Transfer entropy analysis is carried out on the symbol sequences obtained by symbolizing the sample data of each pair of monitoring variables to obtain the net information transfer TE_i^net between each pair of monitoring variables.

Information transfer analysis is performed on the symbol sequences obtained after adaptive symbolization for each pair of monitoring variables, and the transfer entropy is expressed as:

$$TE_{Y \to X} = \sum p(\hat{x}_{i+\delta}, \hat{x}_i, \hat{y}_i) \log \frac{p(\hat{x}_{i+\delta} \mid \hat{x}_i, \hat{y}_i)}{p(\hat{x}_{i+\delta} \mid \hat{x}_i)}$$

where $TE_{Y \to X}$ is the information transfer entropy from Y to X, $\hat{x}_i$ and $\hat{y}_i$ are the i-th values of sequences X and Y after adaptive symbolization, and δ is the time delay between sequences X and Y. In the present invention, in order to detect the maximum information flow, the delay δ is taken as its maximum value δ_max.

Similarly, the information transfer entropy from X to Y, $TE_{X \to Y}$, can be obtained. To simplify the expression of the subsequent network model, the net information transfer $TE_i^{net}$ is calculated as:

$$TE_i^{net} = TE_{Y \to X} - TE_{X \to Y}$$

In addition, the sign of TE_i^net is taken as the direction of the directed edge in the system network model: "+" indicates that the information transfer direction is Y → X and "-" indicates that it is X → Y; TE_i^net serves as the weight w_i of the directed edge in the system network model.

The adaptive symbol transfer entropy jointly considers the number of symbols used after time-series symbolization and the resulting information loss. Each time-series pair uses the optimal embedding dimension m and time delay τ to ensure that the maximum information flow between the pair of monitoring variables is detected, and the complexity of the probability calculation in the transfer entropy analysis is greatly reduced by arbitrary-base coding adapted to the size of the symbol set, together with decimal decoding, thereby improving the accuracy and efficiency of the information transfer measure between pairs of monitoring variables.
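The transfer entropy and net-transfer calculation of step 5 can be sketched as follows. This is a minimal plug-in (frequency-count) estimator, assuming single-symbol histories, base-2 logarithms, and the sign convention net = TE(Y to X) minus TE(X to Y); the function names `transfer_entropy`, `net_transfer`, and `encode_words` are hypothetical, and `encode_words` is only one possible reading of the "arbitrary-base coding with decimal decoding" mentioned above (each m-symbol delay word becomes one base-q integer).

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target, delta=1):
    """Plug-in estimate of TE_{source -> target} in bits for two symbol sequences."""
    # Each triple is (target future, target present, source present).
    triples = [(target[i + delta], target[i], source[i])
               for i in range(len(target) - delta)]
    n = len(triples)
    n_fps = Counter(triples)
    n_ps = Counter((p, s) for _, p, s in triples)
    n_fp = Counter((f, p) for f, p, _ in triples)
    n_p = Counter(p for _, p, _ in triples)
    te = 0.0
    for (f, p, s), count in n_fps.items():
        with_source = count / n_ps[(p, s)]      # p(future | present, source)
        without_source = n_fp[(f, p)] / n_p[p]  # p(future | present)
        te += (count / n) * math.log2(with_source / without_source)
    return te

def net_transfer(x, y, delta=1):
    """TE_{Y->X} - TE_{X->Y}; a positive value means net flow from Y to X."""
    return transfer_entropy(y, x, delta) - transfer_entropy(x, y, delta)

def encode_words(symbols, q, m, tau):
    """Encode each m-symbol delay word as a single base-q integer (assumed reading)."""
    span = (m - 1) * tau
    return [sum(symbols[i + k * tau] * q ** k for k in range(m))
            for i in range(len(symbols) - span)]

# Toy check: X copies Y with a one-step lag, so information flows Y -> X.
rng = random.Random(0)
y = [rng.randint(0, 1) for _ in range(2000)]
x = [0] + y[:-1]
net = net_transfer(x, y, delta=1)
words = encode_words([1, 0, 2, 1], q=3, m=2, tau=1)
```

Encoding words as integers keeps the probability tables one-dimensional per term, so the same counting code works for any symbol-set size q and embedding dimension m.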

Step 6) Interaction network modeling of the complex electromechanical system. Taking each monitoring variable as a node v_i ∈ V, the information transfer relation between each pair of monitoring variables as an edge e_i ∈ E, and the net information transfer TE_i^net as the edge weight w_i ∈ W, a network model M_net reflecting the underlying interaction mechanism of the complex electromechanical system is established, which can be expressed as:

M_net＝(V,E,W)

in the formula, V is the set of all nodes in the network, E is the set of all edges in the network, and W is the weight set of the edges in the network, so that the process industry complex electromechanical system interactive network modeling is completed.
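A minimal sketch of assembling M_net = (V, E, W) from pairwise net transfer values is given below. It assumes the sign convention stated above ("+" means Y → X) and stores the magnitude of the net transfer as the edge weight; the `threshold` parameter for discarding negligible edges and the function name `build_network` are illustrative assumptions that do not come from the source.

```python
def build_network(names, net_te, threshold=0.0):
    """Build (V, E, W) from net transfer values.

    names:  list of monitoring variable names (the node set V).
    net_te: dict mapping a pair (x, y) to TE_net = TE_{y->x} - TE_{x->y}.
    """
    nodes = list(names)
    edges, weights = [], []
    for (x, y), value in net_te.items():
        if abs(value) <= threshold:
            continue  # no net information flow worth an edge (assumed rule)
        # Positive net transfer: flow Y -> X; negative: flow X -> Y.
        edge = (y, x) if value > 0 else (x, y)  # stored as (source, target)
        edges.append(edge)
        weights.append(abs(value))
    return nodes, edges, weights

# Illustrative values only, not results from the patent.
nodes, edges, weights = build_network(
    ["X", "Y", "Z"],
    {("X", "Y"): 0.4, ("X", "Z"): -0.2, ("Y", "Z"): 0.0},
)
```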

The most important step in the symbol-transfer entropy analysis is to perform coarse-grained symbolic representation on the original time series and perform transfer entropy analysis on the basis of the coarse-grained symbolic representation. The time series symbolization mainly comprises the steps of determining the size of a symbol set, determining an optimal subspace division point set in a reasonable value range of a monitoring variable, and determining the division space according to division points. The monitoring variable value space division method adopted in the symbolization process influences the subsequent symbol sequence analysis, so that the symbolization process of the monitoring variable needs to be adaptively adjusted by tightly depending on the distribution characteristics of the sample.

1) Symbolic equiprobable partitioning principle description

For convenience of description, probability density division is illustrated here with an actual compressor unit vibration signal; the process is shown in FIG. 1. FIG. 1(a) is the original signal, FIG. 1(b) and FIG. 1(c) are the frequency histogram and the AKDE-based probability density curve, and FIG. 1(d) is the cumulative probability density distribution curve. When the size of the symbol set is 4, according to the equal probability division principle, FIG. 1(d) marks the division points P1, P2, P3, and P4 on the cumulative probability density distribution curve, so that the value range of the signal is divided into 4 (q = 4) intervals, denoted by the symbols "0", "1", "2", and "3" respectively.
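The density and cumulative distribution curves described above can be approximated with a short adaptive kernel density estimate. The source names adaptive kernel density estimation (AKDE) but does not specify the kernel or bandwidth rule in this section, so the sketch below assumes a Gaussian kernel, a Silverman rule-of-thumb pilot bandwidth, and Abramson-style local bandwidths with sensitivity `alpha = 0.5`; the function name `adaptive_kde` is hypothetical.

```python
import math
import random
import statistics

def adaptive_kde(samples, alpha=0.5):
    """Return (pdf, cdf) callables from an adaptive Gaussian kernel estimate."""
    n = len(samples)
    # Fixed pilot bandwidth from Silverman's rule of thumb (assumed choice).
    h = 1.06 * statistics.stdev(samples) * n ** (-0.2)

    def gauss(u):
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    # Pilot density evaluated at every sample point.
    pilot = [sum(gauss((x - xj) / h) for xj in samples) / (n * h)
             for x in samples]
    # Geometric mean of the pilot densities.
    g = math.exp(sum(math.log(p) for p in pilot) / n)
    # Local bandwidths: wider where the pilot density is low.
    widths = [h * (p / g) ** (-alpha) for p in pilot]

    def pdf(x):
        return sum(gauss((x - xj) / w) / w
                   for xj, w in zip(samples, widths)) / n

    def cdf(x):
        return sum(0.5 * (1.0 + math.erf((x - xj) / (w * math.sqrt(2.0))))
                   for xj, w in zip(samples, widths)) / n

    return pdf, cdf

rng = random.Random(1)
samples = [rng.gauss(0.0, 1.0) for _ in range(200)]
pdf, cdf = adaptive_kde(samples)
```

Because the Gaussian kernel has a closed-form integral, the cumulative distribution is obtained directly from the error function instead of by numerical integration of the density.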

The optimal determination of the symbolization parameters of each monitoring variable in a time-series pair comprises the following steps:

(1) Determination of the parameter characterizing the anti-noise performance of symbolization. The noise factor N_F (Noise Factor) from the field of electronics is introduced to quantify the noise performance of the system. It is expressed as

$$N_F = \frac{SNR_{in}}{SNR_{out}}$$

where SNR_in is the input signal-to-noise ratio and SNR_out is the output signal-to-noise ratio. The noise factor characterizes how much the noise performance of the system is degraded. The smaller the value, the better; a larger value means that more noise is mixed in during transmission, which reflects the non-ideality of the device or channel characteristics.
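As a small worked illustration of the noise factor, the sketch below computes a signal-to-noise ratio in decibels from a clean and a noisy sequence and then the ratio SNR_in / SNR_out. It assumes that the SNR values are reported in dB and must be converted to linear power ratios before dividing; the source does not state which convention it uses, and the function names are hypothetical.

```python
import math

def snr_db(clean, noisy):
    """Signal-to-noise ratio in dB, treating (noisy - clean) as the noise."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((v - c) ** 2 for c, v in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)

def noise_factor(snr_in_db, snr_out_db):
    """N_F = SNR_in / SNR_out, with both ratios converted from dB to linear."""
    return 10.0 ** ((snr_in_db - snr_out_db) / 10.0)

example_snr = snr_db([1.0, -1.0, 1.0, -1.0], [1.1, -0.9, 1.1, -0.9])
unchanged = noise_factor(20.0, 20.0)   # processing added no noise
degraded = noise_factor(20.0, 17.0)    # output SNR dropped by 3 dB
```

A value of 1 means the processing stage added no noise, and values above 1 mean the output is noisier than the input, which is why the optimization below minimizes N_F.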

(2) Establishment of the symbolization parameter optimization model.

Subject to the information entropy H(q) of the symbol sequence formed by adaptive symbolization satisfying H(q) > H_L, where H_L is a given lower limit of information, the invention takes minimizing the noise introduced by the symbolization process as the optimization target, that is, minimizing the noise factor N_F of the symbolized system, which yields the optimal symbol set size q_opt. The optimization model of this process is

$$q_{opt} = \arg\min_{q} N_F(q) \quad \text{s.t.} \quad H(q) > H_L$$

The size q of the symbol set S output when the optimization ends is the size q_opt of the optimal symbol set S_opt of the symbolization process, and the resulting symbol set S_opt can be expressed as:

S_opt＝[0,1,…,i,…,q_opt-2,q_opt-1]
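The constrained selection of q_opt can be sketched as a simple search over candidate symbol-set sizes. The source does not describe the search procedure, so an exhaustive scan is assumed here; `select_q_opt`, the candidate list, and the toy noise-factor table are all hypothetical and serve only to show the mechanics of "minimize N_F subject to H(q) > H_L".

```python
import math

def select_q_opt(candidates, noise_factor_of, entropy_of, h_lower):
    """Return the q with the smallest noise factor among those with H(q) > H_L."""
    feasible = [q for q in candidates if entropy_of(q) > h_lower]
    if not feasible:
        raise ValueError("no symbol-set size satisfies H(q) > H_L")
    return min(feasible, key=noise_factor_of)

# Made-up noise factors for illustration only (not data from the patent).
toy_noise_factor = {2: 3.0, 4: 1.8, 8: 1.2, 12: 1.1, 16: 1.4}

q_opt = select_q_opt(
    candidates=sorted(toy_noise_factor),
    noise_factor_of=toy_noise_factor.get,
    # Equiprobable symbols have entropy log2(q) bits.
    entropy_of=math.log2,
    h_lower=2.5,
)
symbol_set = list(range(q_opt))  # S_opt = [0, 1, ..., q_opt - 1]
```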

(3) Optimal symbolization of the monitored time series.

The monitoring time-series samples are input into the optimization model to obtain the size q_opt of the optimal symbol set S_opt, and the set P of time-series threshold-space division points corresponding to q_opt in the optimization process is output as the optimal division point set P_opt, which can be expressed as

$$P_{opt} = [P_0, P_1, \ldots, P_i, \ldots, P_{q_{opt}}]$$

Once the optimal division point set P_opt is obtained, the value space of the original time series is divided into q_opt regions. Since each region is continuous, it is determined by the interval between adjacent division points: the interval from P_i to P_{i+1} is one region, and the probability of each region is 1/q_opt.

The symbolization function is expressed as

$$s(x) = i, \quad P_i \le x < P_{i+1}, \quad i = 0, 1, \ldots, q_{opt} - 1$$

The original time series can be converted into a symbolized time series by the threshold function in the above formula, which lays the foundation for accurate and fast calculation of the transfer entropy between time series.
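The threshold-based symbolization described above can be sketched as follows. For brevity the division points are taken from empirical quantiles of the samples instead of from the AKDE-based distribution F_i(x); that substitution, and the function names `equiprobable_points` and `symbolize`, are assumptions of this sketch. Only the interior division points are stored, since the outermost ones are the range limits.

```python
from bisect import bisect_right

def equiprobable_points(samples, q):
    """Interior division points placed at the empirical quantiles i/q."""
    ordered = sorted(samples)
    n = len(ordered)
    return [ordered[(i * n) // q] for i in range(1, q)]

def symbolize(samples, points):
    """Map each value to the index of its interval, a symbol in 0..q-1."""
    # A value equal to a division point falls into the upper interval.
    return [bisect_right(points, x) for x in samples]

data = [0.3, 2.5, 1.1, 0.9, 3.7, 2.0, 1.6, 3.1]
points = equiprobable_points(data, q=4)   # [1.1, 2.0, 3.1]
symbols = symbolize(data, points)         # [0, 2, 1, 0, 3, 2, 1, 3]
```

In this example each of the four symbols occurs exactly twice, which is the equal-probability property (1/q per region) that the partition is designed to give.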

Simulation sequence analysis of Lorenz chaotic system

The typical nonlinear chaotic Lorenz system is used to illustrate the whole process of applying the algorithm to dynamic interaction network modeling of a complex system. First, the equations, parameters, and test sample data of the simulated Lorenz system are described. Next, the proposed ASTE method is compared with the existing TE and STE methods through a noise addition experiment to illustrate the applicability and advantages of the proposed method in a noisy environment. Finally, with the system parameters fixed, different initial conditions are selected to generate simulation sequences, a dynamic interaction network model of the Lorenz system is constructed with the ASTE method, and the ability of the method to characterize the sensitivity of the system to initial values is verified through the differences between the network structures.

(1) Lorenz system and its parameter description

The Lorenz equations are three deterministic first-order nonlinear differential equations established in 1963 by the American meteorologist Lorenz, who derived them from the study of convection experiments in order to investigate climate change. The three equations are classical equations in the field of chaos, and the Lorenz system is also the first continuous dynamical system shown to exhibit a strange attractor. Because the meaning of its variables is well defined and its equations are simple, the Lorenz system plays a very important role in the simulation analysis of complex systems.

The Lorenz equations are expressed as follows:

$$\frac{dX}{dt} = \sigma (Y - X), \qquad \frac{dY}{dt} = X (r - Z) - Y, \qquad \frac{dZ}{dt} = XY - bZ$$

The parameters σ = 10, r = 28, and b = 8/3 are positive real constants; with these values the system is in a chaotic state, which determines the chaotic evolution process and characteristics of the system. A set of initial values for the monitored variables X, Y, Z is given as X = 1, Y = 0, and Z = 1. The equations are then integrated by the fourth-order Runge-Kutta method with a time step of 0.01, giving simulation sequences of length 35000 for the 3 monitoring variables.
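The simulation described above can be reproduced with a few lines of standard-library Python. This sketch assumes the classical Lorenz form with σ = 10, r = 28, b = 8/3, the initial state (1, 0, 1), and a fixed step of 0.01; the function names are hypothetical, and the trajectory includes the initial state as its first sample.

```python
def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (r - z) - y, x * y - b * z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(n_samples, state=(1.0, 0.0, 1.0), dt=0.01):
    """Return n_samples states, starting with the initial state."""
    trajectory = [state]
    for _ in range(n_samples - 1):
        state = rk4_step(lorenz, state, dt)
        trajectory.append(state)
    return trajectory

trajectory = simulate(35000)
x_series = [s[0] for s in trajectory]
y_series = [s[1] for s in trajectory]
```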

(2) Test sample data selection

The first 2000 points of the Lorenz system monitoring variables X and Y are selected as the analysis samples for the anti-noise performance analysis of the algorithm. The trend of the sequences is shown in FIG. 2.

(3) Adaptive symbolization process for time series pairs

Modeling the system network requires analyzing the interaction between each pair of monitoring variables. Specifically, the phase space of each variable is reconstructed to determine the common symbolization parameters of each monitoring variable sample sequence; the optimal symbol set size and the threshold-space division points of each pair of monitoring variables are then obtained by the improved method based on adaptive kernel density estimation, and the threshold space is divided and symbolized.
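For reference, the delay embedding that underlies phase-space reconstruction can be sketched as below. The source does not state how the optimal (m, τ) are selected, so this sketch only shows how delay vectors are formed once an embedding dimension `m` and delay `tau` are given; the function name `delay_embed` is hypothetical.

```python
def delay_embed(series, m, tau):
    """Return the m-dimensional delay vectors of a scalar series."""
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for the requested (m, tau)")
    return [tuple(series[i + k * tau] for k in range(m))
            for i in range(n_vectors)]

vectors = delay_embed([0, 1, 2, 3, 4, 5, 6], m=3, tau=2)
# vectors == [(0, 2, 4), (1, 3, 5), (2, 4, 6)]
```

Using one common (m, τ) for both sequences of a pair keeps their delay vectors aligned in time, which the pairwise transfer entropy calculation relies on.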

(a) Information entropy of symbol sequences

To investigate the effect of the number of symbols on symbolization, the information entropy of three types of sequences is compared: the original time series, the symbol sequence, and the encoded-and-decoded sequence. H_orig, H_sym, and H_ED are their respective information entropies. FIG. 3 shows the information entropy curves of the symbolization process of X for different numbers of symbols.

The information entropy trend in FIG. 3 shows that the information entropy of the symbol sequence tends to increase as the number of symbols increases. Therefore, to achieve the optimal division of the threshold space of a monitoring variable, the size q of the symbol set S cannot be determined from the information entropy alone and must be determined in combination with other constraints.
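The monotonic growth of the symbol-sequence entropy with the number of symbols can be checked on any series. The sketch below assumes a rank-based equiprobable symbolization (a stand-in for the AKDE-based partition) and an arbitrary synthetic test signal; both are illustrative choices, not the patent's data. With equal symbol counts the entropy equals log2(q), so it can only grow with q and cannot by itself single out an optimal q.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy of a symbol sequence in bits."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def rank_symbols(series, q):
    """Equiprobable symbolization by rank: symbol = floor(rank * q / n)."""
    n = len(series)
    order = sorted(range(n), key=lambda i: series[i])
    symbols = [0] * n
    for rank, idx in enumerate(order):
        symbols[idx] = rank * q // n
    return symbols

# Arbitrary synthetic signal, used only to exercise the functions.
series = [math.sin(0.05 * i) + 0.3 * math.cos(0.013 * i) for i in range(960)]
entropies = {q: shannon_entropy(rank_symbols(series, q)) for q in (2, 4, 8, 16)}
```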

(b) Noise measurement of symbol process

Symbolizing a time series may change the structure of the sequence, which is equivalent to introducing new noise. To measure the noise that may be introduced by symbolization, the invention uses the noise factor to describe the anti-noise capability of the symbolization process.

The variation of the noise factor shown in FIG. 4 indicates that, for signal-to-noise ratios of 10 dB, 20 dB, and 30 dB and a symbol set size q between 2 and 21, the noise factor follows a bathtub curve; as the noise intensity increases, the noise factor decreases and the bottom of the bathtub tends to flatten. In each case the noise factor reaches its minimum when the number of symbols is 10 to 12, which shows that the noise intensity has little influence on the reasonable division of the threshold space of the monitoring variables and that the symbolization process has strong consistency and anti-noise capability.

Fig. 5 compares adaptive symbolization with permutation entropy symbolization for the variables X and Y, in terms of the encoding and decoding results of the symbol sequences produced by the two time-series symbolization methods: (a) and (c) show the monitoring variables x and y, respectively, encoded and decoded with the permutation entropy symbolization method; (b) and (d) show the monitoring variables x and y, respectively, encoded and decoded with the adaptive symbolization method.

As shown in fig. 5, compared with the permutation entropy symbolization method, the sequence obtained with the adaptive symbolization method clearly expresses the basic structural features of the original time sequence more accurately.

(4) Anti-noise performance analysis of algorithms

(a) Analysis of influence of specific noise on different symbol transfer entropy methods

After adaptive symbolization, the time sequence becomes a symbol sequence, and calculating the transfer entropy of the symbol sequences is the basic work of system interactive network modeling. To verify the superiority of the method provided by the invention, the interaction between the Lorenz variables X, Y and Z is analyzed, with the time sequence length kept consistent, using the traditional transfer entropy (TE), the symbol transfer entropy (STE) and the ASTE method of the invention; Gaussian white noise at signal-to-noise ratios of 10dB, 20dB and 30dB is added in turn to the selected X and Y sample sequences to analyze the anti-noise performance of the algorithms.
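The noise-injection step of such a test can be sketched as follows. This is only an illustration under stated assumptions: the Lorenz parameters (σ=10, ρ=28, β=8/3), the initial state, the step size and the RK4 integrator are the classic textbook choices and are not taken from the patent, and the helper names `lorenz_x`, `add_white_noise` and `measured_snr_db` are hypothetical.

```python
import math
import random

def lorenz_x(n, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz system with RK4 and return n samples of the X variable."""
    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def step(s):
        k1 = deriv(s)
        k2 = deriv(tuple(a + 0.5 * dt * k for a, k in zip(s, k1)))
        k3 = deriv(tuple(a + 0.5 * dt * k for a, k in zip(s, k2)))
        k4 = deriv(tuple(a + dt * k for a, k in zip(s, k3)))
        return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                     for a, p, q, r, w in zip(s, k1, k2, k3, k4))

    state, out = (1.0, 1.0, 1.0), []
    for _ in range(n):
        state = step(state)
        out.append(state[0])
    return out

def add_white_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled so that the signal-to-noise ratio is snr_db."""
    power = sum(v * v for v in signal) / len(signal)
    sigma_n = math.sqrt(power / 10.0 ** (snr_db / 10.0))
    return [v + rng.gauss(0.0, sigma_n) for v in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR (dB) of a noisy sequence relative to its clean version."""
    p_signal = sum(v * v for v in clean) / len(clean)
    p_noise = sum((a - b) ** 2 for a, b in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(p_signal / p_noise)

rng = random.Random(0)
x = lorenz_x(2500)
noisy = {snr: add_white_noise(x, snr, rng) for snr in (10, 20, 30)}
for snr, seq in noisy.items():
    print(snr, round(measured_snr_db(x, seq), 2))
```

Each noisy copy can then be passed through the TE, STE and ASTE calculations in turn so that the three methods are compared on identical inputs.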

As shown in fig. 6, comparing the results calculated by the existing transfer entropy method and symbol transfer entropy method shows that each method can detect the interaction between the monitored variables in the Lorenz system. In terms of the amount of information transferred, however, the result obtained by the ASTE method is clearly higher than those obtained by STE and TE, which shows that the ASTE method markedly increases the detected information transfer. In terms of the curve trends, the curves obtained by the ASTE and STE methods are basically consistent, and both ASTE and the traditional STE algorithm perform slightly better than TE; their noise resistance and stability are clearly improved, mainly because both methods filter the data through symbolization. Therefore, the ASTE method is better than the existing TE and STE algorithms in information detection, noise immunity and stability, which shows that the symbolization optimization algorithm based on adaptive kernel density estimation improves the anti-noise performance, so that the noise reduction in the symbolization process achieves the optimal effect.

(b) Analysis of influence of sequence length on adaptive symbol transfer entropy method

To further explore the effect of the time series length on the proposed algorithm, experiments were performed using time series data of different lengths. Fig. 7 compares the symbol transfer entropy for different time series lengths under a fixed noise condition.

As is apparent from fig. 7, the longer the monitoring variable sample, the larger the calculated symbol transfer entropy. Therefore, to ensure balance among the variables, the length of the sample participating in each calculation, i.e. the width of the sliding window, must be reasonably determined; here a sliding window length of 2500 samples is used for X and Y in the ASTE analysis.

Interactive network modeling and analysis of actual compressor sets

The modeling method of the invention is explained in detail using monitoring data generated during normal service of a compressor unit at a coal chemical enterprise.

The compressor unit monitoring variables used in the example verification of the present invention are shown in table 1:

TABLE 1 Monitoring variables of the compressor unit

No.	Variable name	Description	Unit
1	A_ALI7650	Liquid level of steam condenser	%
2	A_ALI7651	Liquid level of steam condenser	%
3	A_API7654	Exhaust pressure of steam turbine	MPa
4	A_API7655	Steam extraction pressure of steam turbine	MPa
5	A_API7658	Steam turbine inlet pressure	MPa
6	A_ATI7650A	Temperature of negative thrust pad of steam turbine thrust bearing	℃
7	A_ATI7651	Temperature of positive thrust pad of steam turbine thrust bearing	℃
8	A_ATI7653	Steam turbine exhaust side support bearing temperature	℃
9	A_ATI7654	Inlet temperature of steam turbine	℃
10	A_ATI7655	Steam extraction temperature of steam turbine	℃
11	A_ATI7656	Exhaust temperature of steam turbine	℃
12	A_RXI7650	Shaft vibration of steam turbine	μm
13	A_RXI7651	Shaft vibration of steam turbine	μm
14	A_RXI7652	Shaft vibration of steam turbine	μm
15	A_RXI7653	Shaft vibration of steam turbine	μm
16	A_RZI7650	Steam turbine shaft displacement	mm

To eliminate the influence of magnitude differences and noise in the data acquired by the DCS on the analysis of the system state, the original data are preprocessed by normalization and noise reduction.

According to the network modeling method proposed in the invention, the interaction between each pair of variables needs to be analyzed. The modeling of the system interactive network mainly comprises first reconstructing the phase space of each monitoring variable, determining the common and independent symbolization parameters, symbolizing the time sequences, and on this basis analyzing the transfer entropy of each pair of monitoring variables.

The method mainly comprises the following steps:

(1) Solving the common parameters for adaptive symbolization of the state monitoring sequences

The parameters for the phase space reconstruction of each monitored variable are determined using a mutual information method. In order to embed the monitoring variables of each variable pair into reconstructed phase spaces of the same dimension, and to ensure that each monitoring variable is unfolded without distortion, common reconstruction parameters are used, where m and τ are the common embedding dimension and time delay, respectively. Table 2 lists the multivariate reconstruction parameters m and τ for the variable pairs formed by the monitored variable 11 and the other monitored variables:

TABLE 2 Common reconstruction parameters of the variable pairs formed by the monitored variable 11 and the other monitored variables

It can be seen that the reconstruction parameters of each pair of monitored variables in table 2 are different, which reflects the multi-dimensionality of the multivariable when expressing the system state, and can make up for the deficiency of a single variable when describing the system state.
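How a variable pair is embedded with one shared parameter set (m, τ) can be sketched as below. This is a minimal illustration of standard time-delay embedding; the function name `delay_embed` and the numeric values are placeholders and are not taken from Table 2.

```python
def delay_embed(series, m, tau):
    """Return the delay vectors (x_i, x_{i+tau}, ..., x_{i+(m-1)tau}) of a sequence."""
    count = len(series) - (m - 1) * tau
    if count <= 0:
        raise ValueError("series too short for the requested m and tau")
    return [tuple(series[i + k * tau] for k in range(m)) for i in range(count)]

# Two placeholder sequences standing in for a pair of monitored variables.
x = [0.1 * i for i in range(10)]
y = [float(i * i) for i in range(10)]

# Both variables of the pair are embedded with the SAME (m, tau),
# so their reconstructed phase spaces have equal dimension and length.
m, tau = 3, 2
X = delay_embed(x, m, tau)
Y = delay_embed(y, m, tau)
print(len(X), Y[0], Y[-1])
```

Using one (m, τ) per pair keeps the two reconstructed trajectories aligned index by index, which is what the later pairwise transfer entropy analysis relies on.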

(2) Monitoring noise estimation of time series

The noise contained in the monitoring sequence not only degrades the quality of the signal, but also severely affects the effectiveness of the related processing algorithms, so noise estimation is very important for all types of signal processing. In a practical system, noise in the monitoring data is inevitable, and information loss must be balanced against noise resistance in the adaptive symbolization of the time series. The noise in the raw data must therefore be adequately estimated before the data are used, thereby providing a basis for the optimal determination of the symbolization parameters of the monitored sequence.

A wavelet packet denoising method is adopted. The denoising process comprises two steps, decomposition and reconstruction: ① after an appropriate wavelet basis function and number of decomposition levels are selected for each variable, soft thresholding is applied to the wavelet detail coefficients of each decomposition level using the fixed threshold method; ② the approximation coefficients of the last level and the detail coefficients of all levels are reconstructed to obtain the denoised time sequences of the variables, as shown in fig. 8a and 8b.
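The decompose, threshold and reconstruct cycle can be sketched in a simplified form. The sketch below is a stand-in, not the wavelet packet procedure of the invention: it assumes a plain multi-level Haar wavelet transform, a noise level estimated from the median absolute finest-level detail coefficient, and the universal threshold σ·sqrt(2·ln N) as the fixed threshold. All function names and the synthetic test signal are hypothetical.

```python
import math
import random

def haar_dwt(signal):
    """One Haar analysis level: returns (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from one level of coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out

def soft(value, thr):
    """Soft thresholding: shrink the coefficient toward zero by thr."""
    return math.copysign(max(abs(value) - thr, 0.0), value)

def denoise(signal, levels=3):
    """Haar soft-threshold denoising; len(signal) must be divisible by 2**levels."""
    approx, details = list(signal), []
    for _ in range(levels):                      # decomposition
        approx, d = haar_dwt(approx)
        details.append(d)                        # details[0] is the finest level
    finest = sorted(abs(c) for c in details[0])
    sigma = finest[len(finest) // 2] / 0.6745    # robust noise estimate (assumption)
    thr = sigma * math.sqrt(2.0 * math.log(len(signal)))  # fixed (universal) threshold
    for d in reversed(details):                  # reconstruction, coarsest level first
        approx = haar_idwt(approx, [soft(c, thr) for c in d])
    return approx

def snr_db(reference, estimate):
    """SNR (dB) of an estimate relative to a known clean reference."""
    p_sig = sum(r * r for r in reference) / len(reference)
    p_err = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return 10.0 * math.log10(p_sig / p_err)

random.seed(2)
n = 1024
clean = [math.sin(2 * math.pi * i / 256) for i in range(n)]   # synthetic test signal
noisy = [c + random.gauss(0.0, 0.3) for c in clean]
denoised = denoise(noisy, levels=3)
print(round(snr_db(clean, noisy), 1), round(snr_db(clean, denoised), 1))
```

On real monitoring data no clean reference exists, so the signal-to-noise ratio would have to be estimated from the difference between the raw and denoised sequences rather than computed as in `snr_db` above.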

As can be seen from fig. 8, after the normalized data is denoised, the overall trend of the signal does not change, and part of the high frequency noise is filtered out. It can be seen that the sequence structure is clearer after noise reduction, and the signal-to-noise ratio after noise reduction of each variable is shown in fig. 9.

It can be seen from fig. 9 that, in the actual monitoring sequences, each variable contains a different level of noise. The noise evaluation result of the original sequence provides a basis for the symbolization of the time sequence. Specifically, the optimal number of symbols and the corresponding threshold intervals are obtained by optimization for each variable monitoring sequence in the symbolization process, so that the symbol sequence generated from the division result best reflects the structure of the denoised sequence.

(3) Optimal symbol set and variable threshold space partitioning

The time series adaptive symbolization method provided by the invention is applied to symbolize the monitoring sequences of 16 monitoring variables related in the embodiment of the invention, and the size of the symbol set of each variable is shown in table 3.

TABLE 3 Symbol set size of each variable after adaptive symbolization

Variable number	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
Number of symbols	60	61	48	7	16	126	160	91	200	200	121	36	39	32	31	2

As can be seen from table 3, the number of symbols reflects the structural complexity of the time series to some extent. The more complex the structure of the time series, the more symbols are needed, whereas the simpler the structure of the time series, the fewer symbols are needed. After the optimal symbol set is obtained, a corresponding variable division point set can be obtained, so that the time sequence is symbolized.
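Once the symbol set size q is fixed, obtaining the division points and the symbol sequence can be sketched as follows. The sketch rests on assumptions that are not confirmed by the text: it uses an Abramson-style adaptive Gaussian kernel density estimate with a Silverman rule-of-thumb pilot bandwidth and sensitivity α = 0.5, solves F(P_k) = k/q by bisection, and runs on a synthetic sequence. The function names are hypothetical, and the optimization of q itself is not shown.

```python
import math
import random
from bisect import bisect_right

def adaptive_kde_cdf(samples, alpha=0.5):
    """Return the CDF F(x) of an adaptive Gaussian KDE with local bandwidths."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    h = 1.06 * std * n ** (-0.2)                     # pilot bandwidth (assumption)

    def pilot(x):
        total = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
        return total / (n * h * math.sqrt(2.0 * math.pi))

    dens = [pilot(s) for s in samples]
    g = math.exp(sum(math.log(d) for d in dens) / n)  # geometric mean of pilot density
    widths = [h * (d / g) ** (-alpha) for d in dens]  # wider kernels in sparse regions

    def cdf(x):
        return sum(0.5 * (1.0 + math.erf((x - s) / (w * math.sqrt(2.0))))
                   for s, w in zip(samples, widths)) / n
    return cdf

def division_points(samples, q, tol=1e-6):
    """Solve F(P_k) = k/q, k = 1..q-1, so that every region has probability 1/q."""
    cdf = adaptive_kde_cdf(samples)
    span = max(samples) - min(samples)
    points = []
    for k in range(1, q):
        lo, hi = min(samples) - span, max(samples) + span
        while hi - lo > tol:                          # bisection on the monotone CDF
            mid = 0.5 * (lo + hi)
            if cdf(mid) < k / q:
                lo = mid
            else:
                hi = mid
        points.append(0.5 * (lo + hi))
    return points

def symbolize(series, points):
    """Map each sample to the index of the region between consecutive points."""
    return [bisect_right(points, v) for v in series]

random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(300)]      # synthetic stand-in sequence
P = division_points(x, q=4)
s = symbolize(x, P)
print([round(p, 3) for p in P], [s.count(k) for k in range(4)])
```

The symbol counts come out close to, but not exactly, n/q, because the division points are taken from the smoothed distribution rather than from the raw sample ranks.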

(4) Symbol transfer entropy calculation

To verify the superiority of the method proposed by the invention, the interaction between variable 11, which is most closely related to the fault, and the other variables was analyzed using the traditional transfer entropy, the symbol transfer entropy, and the symbol transfer entropy improved by the invention based on adaptive kernel density estimation. That is, the information transfer entropy was calculated with the traditional TE, STE and ASTE methods, with node 11 taken as the source node and as the target node respectively, as shown in fig. 10(a) and 10(b).

To directly observe the asymmetry of the interaction between nodes, the net information transfer TE_net between nodes is calculated according to equation 8 in section 2.2, with node 11 taken as the source node and as the target node respectively, as shown in fig. 11.

From fig. 11(a) and 11(b), it can be seen that each method can detect the interaction between variable 11 and the other variables. Comparing the traditional TE, STE and ASTE methods, the values obtained by the STE and ASTE methods are significantly larger than those of the transfer entropy (TE), which indicates that the symbol transfer entropy methods have a certain anti-noise performance. The information transfer entropy between node 11 and the other network nodes calculated by the ASTE method is larger than the results of the traditional TE and STE, which shows that the ASTE method improves the effectiveness of the information transfer measure.
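The calculation of the transfer entropy between two symbol sequences, and of the net transfer, can be sketched as follows. The sketch uses the usual plug-in (frequency count) estimate of transfer entropy with history length 1, and it assumes that the net transfer is the difference of the two directed values; equation 8 of section 2.2 is not reproduced here, so that definition is an assumption. The names and the synthetic sequences are hypothetical.

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target, delta=1):
    """Plug-in transfer entropy (bits) from source to target symbol sequences."""
    n = len(target) - delta
    triples, future_now, now_src, now = Counter(), Counter(), Counter(), Counter()
    for i in range(n):
        t_future, t_now, s_now = target[i + delta], target[i], source[i]
        triples[(t_future, t_now, s_now)] += 1
        future_now[(t_future, t_now)] += 1
        now_src[(t_now, s_now)] += 1
        now[t_now] += 1
    te = 0.0
    for (t_future, t_now, s_now), c in triples.items():
        p_joint = c / n
        p_given_both = c / now_src[(t_now, s_now)]             # p(future | target, source)
        p_given_self = future_now[(t_future, t_now)] / now[t_now]  # p(future | target)
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te

# Synthetic symbol sequences: y copies x with a one-step lag, so x drives y.
rng = random.Random(3)
x = [rng.randrange(4) for _ in range(5000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)     # information transferred from x to y
te_yx = transfer_entropy(y, x)     # information transferred from y to x
net_xy = te_xy - te_yx             # assumed definition of the net transfer
print(round(te_xy, 3), round(te_yx, 3), round(net_xy, 3))
```

A clearly positive `net_xy` marks x as the dominant source in the pair, which is the kind of asymmetry that the net information transfer is meant to expose.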

(5) Construction of complex electromechanical system service interaction network model

The ASTE between each pair of variables is calculated through the above process, so that the weight and direction of the edges between each network node and the other nodes are determined and used to construct the network model of the system; the constructed network model is shown in fig. 12.
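Turning pairwise results into a weighted directed network of nodes, edges and weights can be sketched as below. The direction rule (the edge points from the variable with the larger outgoing transfer), the omission of pairs with zero net transfer, and all numeric values are assumptions made for illustration; `build_network` is a hypothetical helper.

```python
def build_network(te):
    """Build (V, E, W) from te[(i, j)], the transfer entropy from node i to node j."""
    nodes = sorted({n for pair in te for n in pair})
    edges, weights = [], {}
    for a in nodes:
        for b in nodes:
            if a < b:                          # visit each unordered pair once
                net = te[(a, b)] - te[(b, a)]
                if net > 0:
                    edge = (a, b)              # a is the net information source
                elif net < 0:
                    edge = (b, a)              # b is the net information source
                else:
                    continue                   # no net transfer: no directed edge
                edges.append(edge)
                weights[edge] = abs(net)       # edge weight = net information transfer
    return nodes, edges, weights

# Made-up pairwise values for three nodes (not results from the compressor unit).
te = {(1, 2): 0.5, (2, 1): 0.2,
      (1, 3): 0.1, (3, 1): 0.4,
      (2, 3): 0.3, (3, 2): 0.3}
V, E, W = build_network(te)
print(V, E, W)
```

The three returned collections correspond to the node set, the edge set and the edge weight set of the network model.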

As can be seen from the network model in fig. 12, the interaction between the nodes in the established network model resembles a strongly connected network, which indicates that the parts of the actual system are closely related and work in coordination during operation, and that the working mechanism is complex. Among all nodes, nodes such as 1, 2, 3, 12, 13, 14 and 15 have strong interactions with other nodes, while the interactions of the remaining nodes are relatively weak, which indicates that these variables are representative in characterizing system faults.

The validity of the information model is verified by comparison with the physical structure and meaning of each variable. According to the description of each variable, variables 1, 2 and 3 reflect the heat exchange performance of the steam turbine condenser, and variables 12, 13, 14 and 15 are the main monitoring quantities for the vibration of the steam turbine bearing; these variable parameters are an important basis for feedback regulation and control in the normal operation of the steam turbine. Therefore, the information model established by the method conforms to the known system operation mechanism, and reflects and reconstructs the actual relationships among the components of the system.

Claims

1. An electromechanical system interactive network modeling method based on self-adaptive symbol transfer entropy, characterized in that: symbolization public parameters of the original time sequence are obtained on the basis of multivariate spatial reconstruction; the probability density and probability distribution of the original time sequence are estimated by an adaptive kernel density estimation method; the original time sequence is divided according to the equal probability division principle; the optimal number of symbols and division regions are obtained by balancing the loss of structural information of the original time sequence in the symbol sequence against its noise immunity, and the original time sequence is given a coarse-grained symbolic representation; transfer entropy analysis is then carried out on the symbol sequence of the original time sequence and the net information transfer quantity is calculated, so as to obtain the basic parameters required for modeling the system interactive network and establish a network model reflecting the bottom interaction mechanism of the actual system; the method specifically comprises the following steps:

step 1), selecting a variable set of the monitoring target of the complex electromechanical system to be analyzed and acquiring symbolization public parameters of the original time sequence, wherein the acquired original time sequence data set consists of N monitoring variables i; denoising the monitoring variables by a wavelet packet method to obtain denoised time sequences, and calculating the signal-to-noise ratio SNR_in of the sequences before denoising;

step 2), calculating the embedding dimension m and the delay time τ of each pair of monitoring variables by a multivariate phase space reconstruction method, and taking them as the symbolization public parameter set (m, τ) of each pair of monitoring variables i;

step 3), for each monitoring variable i, obtaining the probability density function f_i(x) of the denoised monitoring variable sample data by an adaptive kernel density estimation method, and obtaining the probability distribution F_i(x) from the probability density function f_i(x);

step 4), performing equal probability division on the probability distribution F_i(x) estimated in step 3) using the equal probability division principle, and determining the symbolization parameters of each monitoring variable by optimizing the public parameter set (m, τ) obtained in step 2), so as to obtain the symbol sequence of the time sequence; the core of determining the symbolization parameters of each monitoring variable is determining the size of the symbol set S and the threshold space division point set P of the sequence;

step 5), carrying out transfer entropy analysis on the symbol sequences obtained by symbolizing the sample data of each pair of monitoring variables, to obtain the net information transfer quantity between each pair of monitoring variables;

information transfer analysis is carried out on each pair of monitoring variables using the symbol sequences obtained after adaptive symbolization, and the transfer entropy is expressed as shown in the formula;

in the formula, the quantity on the left-hand side is the information transfer entropy from Y to X, the symbol terms are the i-th values of the sequences X and Y after adaptive symbolization, and δ is the time delay between the sequences X and Y;

step 6), taking a certain monitoring variable therein as a node v_i ∈ V, the information transfer relation between each pair of monitoring variables as an edge e_i ∈ E, and the net information transfer quantity as the weight w_i ∈ W of the edge, establishing a network model reflecting the bottom interaction mechanism of the complex electromechanical system, wherein the network model M_net can be expressed as:

M_net＝(V,E,W)

in the formula, V is the set of all nodes in the network, E is the set of all edges in the network, and W is the weight set of the edges in the network, so that the process industry complex electromechanical system interactive network modeling is completed;

determination of the characterization parameter of the anti-noise performance of symbolization: the noise performance of the system is quantitatively characterized by the noise factor N_F, whose expression is as follows:

after the optimization process is finished, the size q of the symbol set S is output, namely the size q_opt of the optimal symbol set S_opt of the symbolization process; the resulting symbol set S_opt can be expressed as:

S_opt＝[0,1,…,i,…,q_opt-2,q_opt-1]；

the optimal threshold space division point set P_opt then divides the space of the original time sequence into q_opt regions, wherein the interval from division point P_i to division point P_{i+1} is one divided region and the probability of each region is 1/q_opt; the symbolization function is expressed as follows: