CN111950772A - Time series online prediction method based on multi-information perception - Google Patents

Time series online prediction method based on multi-information perception

Info

Publication number
CN111950772A
CN111950772A CN202010709391.XA
Authority
CN
China
Prior art keywords
data
time
hidden layer
similarity
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010709391.XA
Other languages
Chinese (zh)
Inventor
刘震
王昊
宋红霞
程玉华
白利兵
陈凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202010709391.XA
Publication of CN111950772A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention discloses a time-series online prediction method based on multi-information perception. The method first initializes an offline training set and a network model with offline data. For the input data at the current time it then performs data-similarity perception, computing the point similarity and the region similarity, from which the input-data similarity weight, the hidden-layer output weight, and the prediction error are derived. The prediction-error weight is then computed from the input-data similarity weight and the prediction error at the current time. Finally, the online training set is updated and online prediction of the time series is carried out, so that the prediction is completed quickly and accurately.

Description

Time series online prediction method based on multi-information perception
Technical Field
The invention belongs to the technical field of data-driven reliability prediction, and particularly relates to a time series online prediction method based on multi-information perception.
Background
The extreme learning machine (ELM) algorithm is built on the structure of the single-hidden-layer feedforward neural network (SLFN) and on the theory of the Moore-Penrose generalized inverse. With the development of machine learning, and in particular with the change in the state of the data being learned, early static-data learning has evolved into dynamic streaming-data learning, and the learning mode of the extreme learning machine has likewise expanded from offline static learning to online real-time learning. In offline static learning, the accuracy and effectiveness of ELM results depend to a great extent on the selection of the training set. For time-varying situations, however, such as stock price estimation, real-estate price estimation, weather forecasting, gesture recognition and face recognition, a training set selected once on the basis of experience or other methods cannot effectively meet the requirements. Time-varying behavior mainly manifests itself as real-time changes in the data distribution characteristics or in the variation trend. There are two kinds of solution: offline repeated learning and online real-time learning. Offline repeated learning is represented by the incremental extreme learning machine (I-ELM), which improves accuracy by changing the hidden-layer structure of the SLFN and thus increasing the computational effort. The increased computation inevitably increases the computation time, which is unacceptable for streaming-data learning or for applications with limited time intervals. In contrast, online learning algorithms, represented by the online sequential extreme learning machine (OS-ELM), are favored by researchers for their excellent timeliness. The OS-ELM extends the learning process from the training-set stage to the whole streaming-data process, thereby largely removing the dependence of the extreme learning machine on the training set. In practical applications, researchers have carried out a great deal of work to address the timeliness of streaming data.
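For orientation, the following is a minimal sketch of batch ELM training on an SLFN in Python/NumPy (not taken from the patent; the function names, sigmoid activation, and toy data are illustrative): hidden-layer coefficients are drawn at random and the output weights follow in closed form from the Moore-Penrose pseudo-inverse.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=None):
    """Batch ELM: random hidden layer + Moore-Penrose pseudo-inverse output weights."""
    if rng is None:
        rng = np.random.default_rng(0)
    W = rng.standard_normal((X.shape[1], n_hidden))   # input-to-hidden weights
    b = rng.standard_normal(n_hidden)                 # hidden-layer biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))            # sigmoid hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                      # output weights in closed form
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# toy usage: one-step-ahead prediction of a sine series
t = np.linspace(0, 20, 400)
series = np.sin(t)
X, T = series[:-1].reshape(-1, 1), series[1:].reshape(-1, 1)
W, b, beta = elm_train(X, T, n_hidden=30)
pred = elm_predict(X, W, b, beta)
print("training RMSE:", np.sqrt(np.mean((pred - T) ** 2)))
```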
In the aspect of time series prediction, the online extreme learning machine and the derived algorithm thereof still have the following two problems:
(1) the global perception of time-series data characteristics and trend changes is insufficient: the comparison of data characteristics remains confined to adjacent data points;
(2) in the offline learning stage the learning ability of the extreme learning machine is fixed by the training data set, and in the online learning stage the SLFN network structure and the hidden-layer coefficients remain unchanged, so the online prediction is strongly affected by changes in the streaming data; in particular, when the streaming data undergoes a genuine abrupt change, the model lacks fast reinforced-learning and response capability.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to provide a time-series online prediction method based on multi-information perception, in which prediction accuracy is improved through an online extreme learning machine combining multi-information perception with limited incremental learning.
In order to achieve the above object, the present invention provides a method for predicting time series online based on multi-information perception, which is characterized by comprising the following steps:
(1) initialization of offline training set and network model by offline data
(1.1) Select offline data to form an offline training set
[training-set definition given as a formula image in the original, not reproduced]
where N_0 is the offline training-set sample size, x_i is the input data at time i, and t_i is the output data at time i; R^n and R^m denote real-number domains of dimension n and m, respectively;
(1.2) Randomly generate the hidden-layer coefficients of the SLFN network
[hidden-layer coefficient set given as a formula image in the original, not reproduced]
where w_i denotes the coefficients between the input layer and the hidden layer and b_i the hidden-layer coefficient for index i; R denotes the real-number field. Obtain the initial hidden-layer output matrix H_0 and the initial hidden-layer output weight β_0 through pseudo-inverse calculation, completing the initial offline prediction. Finally, use the offline training set to calculate the optimal number of hidden-layer nodes L_0 under the condition of minimum training error.
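A possible sketch of this initialization step, under the assumption of a sigmoid activation and a simple search over candidate node counts (the helper names and the search loop are hypothetical; the patent does not prescribe this exact procedure):

```python
import numpy as np

def hidden_output(X, W, b):
    """Sigmoid hidden-layer output matrix H of an SLFN."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def init_offline(X0, T0, max_nodes=50, seed=1):
    """Choose L0 by minimum training error; return coefficients, H0 and beta0."""
    rng = np.random.default_rng(seed)
    best = None
    for L in range(1, max_nodes + 1):
        W = rng.standard_normal((X0.shape[1], L))  # candidate hidden coefficients
        b = rng.standard_normal(L)
        H0 = hidden_output(X0, W, b)
        beta0 = np.linalg.pinv(H0) @ T0            # initial output weight by pseudo-inverse
        err = np.sqrt(np.mean((H0 @ beta0 - T0) ** 2))
        if best is None or err < best[0]:
            best = (err, L, W, b, H0, beta0)
    _, L0, W, b, H0, beta0 = best
    return L0, W, b, H0, beta0
```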
(2) Calculating the point similarity and the region similarity
Let the input data at the current time i be [(x_i)] {x_i ∈ R^n}, let the adjacent-point input data at time j be [(x_j)] {x_j ∈ R^n, j = i−1}, and let the neighborhood input data be [(x_{q−ℓ+1}, x_{q−ℓ+2}, ..., x_{q−1}, x_q)] {x_q ∈ R^n, q = i−1, ℓ > 1}, where ℓ denotes the neighborhood time span;
using the Euclidean-distance similarity formula, calculate the point similarity γ_ij = γ(x_i, x_j) and the region similarity γ_iq = γ(x_i, E(x_{q−ℓ+1}, ..., x_q));
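A sketch of the two similarity measures follows; the similarity form 1/(1 + d) for the Euclidean distance d and the choice of the neighborhood mean for E(·) are assumptions, since the exact expressions are not reproduced in this text.

```python
import numpy as np

def euclid_similarity(a, b):
    """Similarity derived from the Euclidean distance (assumed form 1/(1 + d))."""
    return 1.0 / (1.0 + np.linalg.norm(np.asarray(a) - np.asarray(b)))

def point_and_region_similarity(x_i, history, span):
    """history: array of past inputs ordered in time; span: neighborhood time span."""
    x_j = history[-1]                             # adjacent point, time j = i - 1
    region = np.mean(history[-span:], axis=0)     # E(.) over the neighborhood, assumed as the mean
    gamma_ij = euclid_similarity(x_i, x_j)        # point similarity
    gamma_iq = euclid_similarity(x_i, region)     # region similarity
    return gamma_ij, gamma_iq
```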
(3) Calculating input data similarity weight, hidden layer output weight and prediction error according to the point similarity and the region similarity;
The input-data similarity weight at time i is:
[similarity-weight formula given as an image in the original, not reproduced]
where γ_ij is the point similarity and γ_iq is the region similarity;
The hidden-layer output weight at time i is:
[output-weight formula given as an image in the original, not reproduced]
where H_i is the hidden-layer output matrix of the input data at time i:
[hidden-layer output matrix given as an image in the original, not reproduced]
The prediction error at time i+1 is ε_{i+1} = t_{i+1} − S_{i+1}, where S_{i+1} = H_{i+1} β_i is the predicted value at time i+1;
(4) Calculate the prediction-error weight from the input-data similarity weight and the prediction error at the current time: λ_{i+1} = 1 − exp(−1/(Ω_{i+1} × ε_{i+1}));
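Steps (3)-(4) might be coded as in the sketch below. The exact expression for Ω appears only as a formula image in the original, so an averaged combination of the two similarities stands in for it here, and the norm of the error is used inside λ to keep the weight well-defined; both choices are assumptions.

```python
import numpy as np

def similarity_weight(gamma_ij, gamma_iq):
    """Omega_i: assumed here as the average of the two similarities (the patent's
    exact expression is given only as a formula image)."""
    return 0.5 * (gamma_ij + gamma_iq)

def error_and_weight(H_next, beta, t_next, omega_next):
    """Prediction, prediction error, and error weight lambda for the next time step."""
    S_next = H_next @ beta                   # S_{i+1} = H_{i+1} beta_i
    eps = t_next - S_next                    # prediction error
    # the error norm keeps the weight in (0, 1); the patent states the scalar
    # formula lambda = 1 - exp(-1 / (Omega * eps))
    lam = 1.0 - np.exp(-1.0 / (omega_next * np.linalg.norm(eps) + 1e-12))
    return S_next, eps, lam
```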
(5) Updating an online training set;
The online training set consists of a scatter set and a region set connected in series. The scatter set is updated in real time on a first-in-first-out basis, while the region set is updated only when triggered; the time range of the scatter set precedes that of the region set. The initial scatter set is the offline initial training set and the initial region set is empty;
The capacity of the scatter set for online updating is set as:
[capacity formula for N_d given as an image in the original, not reproduced]
The capacity of the region set for online updating is set as Q = L_k − D, where L_k is the number of hidden-layer nodes of the SLFN network at time k.
Let the scatter set at time k be [(x_k, t_k, λ_k)] {x_k ∈ R^n, t_k ∈ R^m, λ_k ∈ R}. Complete the prediction according to steps (2)-(4), sort the prediction-error weights λ_1 ~ λ_k of the first k times in descending order, and select the data corresponding to the first N_d λ values to update the scatter set:
[scatter-set update expression and its accompanying definition given as formula images in the original, not reproduced]
When the area set is updated, Q continuous time data before the kth time are selected as area set online updating data, and the area set online updating data are expressed as follows:
Figure BDA0002595993150000037
when scattered point set and area set exist NsWhen the same data exists, the repeated data in the scatter point set is deleted, and N is added in the area setsThe data, namely:
Figure BDA0002595993150000038
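The bookkeeping of step (5) could look like the following sketch, which treats the capacities N_d and Q as given parameters and stores samples as hashable tuples; the exact capacity formula and duplicate-handling details that appear only as images in the original are not reproduced.

```python
def update_scatter_set(samples, lambdas, capacity):
    """samples: list of (x, t) tuples; keep the `capacity` entries with the largest
    error weights lambda, preserving their temporal order."""
    order = sorted(range(len(samples)), key=lambda i: lambdas[i], reverse=True)
    keep = sorted(order[:capacity])
    return [samples[i] for i in keep]

def update_region_set(samples, k, Q):
    """Triggered update: the Q consecutive samples immediately before time k."""
    return samples[max(0, k - Q):k]

def merge_training_set(scatter, region):
    """When the two sets share samples, drop the duplicates from the scatter set
    and keep them in the region set, then concatenate."""
    region_keys = set(region)              # assumes samples are hashable tuples
    scatter = [s for s in scatter if s not in region_keys]
    return scatter + list(region)
```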
(6) online prediction of time series;
(6.1) Calculate the parameter Δρ_k at time k: Δρ_k = Δe_k − Δe_{k−1}, where Δe_k = ||e_k||² − ||e_{k−1}||², e_k is the output error at time k, and Δe_k is the difference between the squared output errors at time k and time k−1;
(6.2) If Δρ_k > 0, use the formula in step (3)
[output-weight formula given as an image in the original, not reproduced]
to compute the hidden-layer output weight β_k, and use S_{k+1} = H_{k+1} β_k to obtain the predicted value at time k+1;
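A sketch of the trigger logic of (6.1)-(6.2), with illustrative names: Δe_k and Δρ_k are computed from the output errors, and a positive Δρ_k keeps the current network for the next prediction.

```python
import numpy as np

def delta_rho(e_k, e_km1, prev_delta_e):
    """Return (delta_rho_k, delta_e_k), where delta_e_k = ||e_k||^2 - ||e_{k-1}||^2
    and delta_rho_k = delta_e_k - delta_e_{k-1}."""
    delta_e = np.linalg.norm(e_k) ** 2 - np.linalg.norm(e_km1) ** 2
    return delta_e - prev_delta_e, delta_e

def online_step(drho, H_next, beta_k):
    """If delta_rho_k > 0, keep the network and predict; otherwise signal the
    triggered region-set update of step (6.3)."""
    if drho > 0:
        return H_next @ beta_k, False        # S_{k+1} = H_{k+1} beta_k
    return None, True                        # trigger incremental update
```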
(6.3) If Δρ_k < 0, trigger the online update of the region set. The specific process is as follows:
(6.3.1) Add ΔL hidden-layer nodes to the SLFN network model, satisfying L_k + ΔL ≤ L_max, where L_max is the maximum number of hidden-layer nodes in the network design; randomly generate ΔL new hidden-layer coefficients;
(6.3.2) Calculate the hidden-layer output matrix after adding the hidden-layer nodes:
[hidden-layer output matrix expressions given as images in the original, not reproduced]
where N is the sample size of the data set and N = L_k;
(6.3.3) Calculate the hidden-layer output matrix at this point:
[update formula given as an image in the original, not reproduced]
where W = H((I − H_0 H_0^+)H)^+, U = H_0^+(I − H^T D), and H_0^+ is the generalized inverse matrix of H_0;
(6.3.4) Calculate the hidden-layer output matrix of the new network structure for the sample at time k:
[expressions given as images in the original, not reproduced]
(6.3.5) Substitute into the corresponding equation (given as a formula image in the original, not reproduced) to obtain an estimated value;
(6.3.6) Substitute the estimated value into the iteration formula (given as a formula image in the original, not reproduced) and iterate, where τ is the number of iterations; stop the iteration when the convergence criterion (given as a formula image) is met, with κ a constant; let β_k take the value so obtained, and again use S_{k+1} = H_{k+1} β_k to obtain the predicted value at time k+1.
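Since the incremental-update formulas of (6.3.2)-(6.3.6) are given only as images here, the sketch below illustrates the general idea with a simpler substitute: add ΔL random hidden nodes subject to L_max and refit the output weights by a pseudo-inverse over the current online training set. This is an assumption for illustration, not the patent's exact recursion.

```python
import numpy as np

def grow_hidden_layer(W, b, delta_L, L_max, seed=2):
    """Add up to delta_L random hidden nodes without exceeding L_max."""
    rng = np.random.default_rng(seed)
    add = min(delta_L, L_max - W.shape[1])
    if add <= 0:
        return W, b
    W_new = rng.standard_normal((W.shape[0], add))
    b_new = rng.standard_normal(add)
    return np.hstack([W, W_new]), np.concatenate([b, b_new])

def refit_output_weights(X_train, T_train, W, b):
    """Simplified stand-in for (6.3.2)-(6.3.6): a direct least-squares refit of the
    output weights on the current online training set."""
    H = 1.0 / (1.0 + np.exp(-(X_train @ W + b)))
    return np.linalg.pinv(H) @ T_train
```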
The object of the invention is achieved as follows:
the invention relates to a time sequence online prediction method based on multi-information perception, which comprises the steps of firstly utilizing offline data to initialize an offline training set and a network model, then respectively conducting data similarity perception on input data at the current moment, calculating point similarity and region similarity, further calculating input data similarity weight, hidden layer output weight and prediction error, then calculating prediction error weight according to the input data similarity weight and the prediction error at the current moment, finally realizing time sequence online prediction after completing online training set updating, and having the advantages of quickly and accurately completing online prediction of time sequences.
Meanwhile, the time sequence online prediction method based on multi-information perception also has the following beneficial effects:
(1) The invention comprehensively considers the similarity between the data at the current time and both the adjacent-point data and the region data, together with the influence of the hidden-layer output matrix and the prediction error, and proposes a data similarity weight, thereby remedying the insufficient global perception of time-series data characteristics and trend changes;
(2) On the basis of the limited memory of the extreme learning machine algorithm, the invention designs the prediction-error weight and the online training data set and improves the online learning ability of the extreme learning machine through an incremental learning algorithm, so that the online prediction is not degraded by changes in the streaming data; in particular, when the streaming data undergoes a genuine abrupt change, the method provides fast model reinforcement learning and fast response;
(3) To verify the effect of the proposed algorithm, the online extreme learning machine and several of its derived algorithms are selected for prediction simulations on several common time-series data sets, and their performance is compared experimentally, showing that the proposed algorithm has good prediction performance.
Drawings
FIG. 1 is a flow chart of the multi-information perception-based time series online prediction method of the invention;
FIG. 2 is a diagram of single prediction error change of Sinc data of each algorithm;
FIG. 3 is a graph showing the single prediction error variation of Rossler data of each algorithm;
FIG. 4 shows the variation of single prediction error of Chen data of each algorithm;
FIG. 5 shows the variation of single prediction error of Lorentz data of each algorithm;
FIG. 6 shows the variation of single prediction error of Mackey-Glass data of each algorithm;
FIG. 7 is F1# IGBT accelerated life experiment prediction result data;
FIG. 8 is F3# IGBT accelerated life experiment prediction result data;
FIG. 9 is F4# IGBT accelerated life experiment prediction result data.
Detailed Description
The following description of specific embodiments of the present invention is provided with reference to the accompanying drawings so that those skilled in the art can better understand the present invention. It should be expressly noted that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the present invention.
Examples
FIG. 1 is a flow chart of the multi-information perception-based time series online prediction method.
In this embodiment, as shown in fig. 1, the method for online predicting a time series based on multi-information sensing of the present invention specifically includes the following steps:
s1, initializing offline training set and network model by using offline data
Select offline data to form an offline training set
[training-set definition given as a formula image in the original, not reproduced]
where N_0 is the offline training-set sample size, x_i is the input data at time i, and t_i is the output data at time i; R^n and R^m denote real-number domains of dimension n and m, respectively.
Randomly generate the hidden-layer coefficients of the SLFN network
[hidden-layer coefficient set given as a formula image in the original, not reproduced]
where w_i denotes the coefficients between the input layer and the hidden layer and b_i the hidden-layer coefficient for index i; R denotes the real-number field. Obtain the initial hidden-layer output matrix H_0 and the initial hidden-layer output weight β_0 through pseudo-inverse calculation, completing the initial offline prediction. Finally, use the offline training set to calculate the optimal number of hidden-layer nodes L_0 under the condition of minimum training error.
S2. Perform data-similarity perception on the input data at the current time, covering both point similarity and region similarity. Let the input data at the current time i be [(x_i)] {x_i ∈ R^n}, let the adjacent-point input data at time j be [(x_j)] {x_j ∈ R^n, j = i−1}, and let the neighborhood input data be [(x_{q−ℓ+1}, x_{q−ℓ+2}, ..., x_{q−1}, x_q)] {x_q ∈ R^n, q = i−1, ℓ > 1}, where ℓ denotes the neighborhood time span;
using the Euclidean-distance similarity formula, calculate the point similarity γ_ij = γ(x_i, x_j) and the region similarity γ_iq = γ(x_i, E(x_{q−ℓ+1}, ..., x_q));
S3, calculating the similarity weight of the input data, the output weight of the hidden layer and the prediction error according to the point similarity and the region similarity;
The input-data similarity weight at time i is:
[similarity-weight formula given as an image in the original, not reproduced]
where γ_ij is the point similarity and γ_iq is the region similarity;
The hidden-layer output weight at time i is:
[output-weight formula given as an image in the original, not reproduced]
where H_i is the hidden-layer output matrix of the input data at time i:
[hidden-layer output matrix given as an image in the original, not reproduced]
The prediction error at time i+1 is ε_{i+1} = t_{i+1} − S_{i+1}, where S_{i+1} = H_{i+1} β_i is the predicted value at time i+1;
S4. Calculate the prediction-error weight from the input-data similarity weight and the prediction error at the current time: λ_{i+1} = 1 − exp(−1/(Ω_{i+1} × ε_{i+1}));
S5, updating an online training set;
The online training set is formed by connecting a scatter set and a region set in series. The scatter set is updated in real time on a first-in-first-out basis, the region set is updated only when triggered, and the time range of the scatter set precedes that of the region set. The initial scatter set is the offline initial training set and the initial region set is empty;
The capacity of the scatter set for online updating is set as:
[capacity formula for N_d given as an image in the original, not reproduced]
The capacity of the region set for online updating is set as Q = L_k − D, where L_k is the number of hidden-layer nodes of the SLFN network at time k.
Let the scatter set at time k be [(x_k, t_k, λ_k)] {x_k ∈ R^n, t_k ∈ R^m, λ_k ∈ R}. Complete the prediction according to steps S2-S4, sort the prediction-error weights λ_1 ~ λ_k of the first k times in descending order, and select the data corresponding to the first N_d λ values to update the scatter set:
[scatter-set update expression and its accompanying definition given as formula images in the original, not reproduced]
When the area set is updated, the k-th time is selectedThe first Q consecutive time data are taken as area set online update data and are expressed as:
Figure BDA0002595993150000075
when scattered point set and area set exist NsWhen the same data exists, the repeated data in the scatter point set is deleted, and N is added in the area setsThe data, namely:
Figure BDA0002595993150000076
s6, online prediction of time series;
S6.1. Calculate the parameter Δρ_k at time k: Δρ_k = Δe_k − Δe_{k−1}, where Δe_k = ||e_k||² − ||e_{k−1}||², e_k is the output error at time k, and Δe_k is the difference between the squared output errors at time k and time k−1;
S6.2. If Δρ_k > 0, use the formula in step S3
[output-weight formula given as an image in the original, not reproduced]
to compute the hidden-layer output weight β_k, and use S_{k+1} = H_{k+1} β_k to obtain the predicted value at time k+1;
S6.3. If Δρ_k < 0, trigger the online update of the region set. The specific process is as follows:
S6.3.1. Add ΔL hidden-layer nodes to the SLFN network model, satisfying L_k + ΔL ≤ L_max, where L_max is the maximum number of hidden-layer nodes in the network design; randomly generate ΔL new hidden-layer coefficients;
S6.3.2. Calculate the hidden-layer output matrix after adding the hidden-layer nodes:
[hidden-layer output matrix expressions given as images in the original, not reproduced]
where N is the sample size of the data set and N = L_k;
S6.3.3. Calculate the hidden-layer output matrix at this point:
[update formula given as an image in the original, not reproduced]
where W = H((I − H_0 H_0^+)H)^+, U = H_0^+(I − H^T D), and H_0^+ is the generalized inverse matrix of H_0;
S6.3.4. Calculate the hidden-layer output matrix of the new network structure for the sample at time k:
[expressions given as images in the original, not reproduced]
S6.3.5. Substitute into the corresponding equation (given as a formula image in the original, not reproduced) to obtain an estimated value;
S6.3.6. Substitute the estimated value into the iteration formula (given as a formula image in the original, not reproduced) and iterate, where τ is the number of iterations; stop the iteration when the convergence criterion (given as a formula image) is met, where κ is a constant with value 0.1 or 0.01; let β_k take the value so obtained, and again use S_{k+1} = H_{k+1} β_k to obtain the predicted value at time k+1.
To illustrate the technical effects of the present invention, 5 sets of typical time series data and 3 sets of IGBT accelerated life experiments were completed in total for verification.
In real practice, since the researcher cannot predict the future data of the time series in advance, the initial training sample is the first 100 continuous data of each time series. The detailed information of the data is shown in table 1.
TABLE 1 (detailed information of the data sets; rendered as an image in the original and not reproduced here)
The experimental error evaluation index is the root mean square error (RMSE): RMSE = sqrt( (1/N) Σ_{i=1}^{N} (S_i − T_i)² ), where S_i is the i-th predicted value and T_i is the i-th true value. Prediction speed is evaluated by the single maximum prediction time of the algorithm. All experiments were performed 20 times and the results averaged.
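For reference, the RMSE above corresponds to the following snippet (illustrative only):

```python
import numpy as np

def rmse(pred, truth):
    """Root mean square error between predicted and true series."""
    pred, truth = np.asarray(pred, dtype=float), np.asarray(truth, dtype=float)
    return np.sqrt(np.mean((pred - truth) ** 2))
```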
The PC used to run the algorithms was configured with Windows 10 (64-bit), an i7 processor at 1.8 GHz, and 4 GB RAM. The software was Matlab 2018b.
First, for each data set in Table 1, performance comparison tests were carried out between the ML-OSELM algorithm, the online extreme learning machine, and related algorithms. The activation function is the sigmoid function, the steady-state threshold of ML-OSELM is 0.1, the number of hidden-layer nodes newly added at a time is 1, and the maximum number of hidden-layer nodes is 50. The initial optimal number of hidden-layer nodes of the SLFN network is determined by the minimum training error on the initial training data. The experimental error results and the single maximum prediction time are shown in Table 2.
TABLE 2 (experimental error results and single maximum prediction times; rendered as an image in the original and not reproduced here)
As can be seen from Table 2:
(1) In terms of training error, the online extreme learning machine and its derived algorithms all achieve extremely low training errors, owing to the theoretical properties of the extreme learning machine.
(2) In terms of prediction error, the prediction error of the M-OSELM algorithm is in most cases lower than that of the other algorithms, the exception being the Lorentz data set (5.182e-1); the prediction error of the ML-OSELM algorithm is clearly lower than that of the other algorithms.
(3) In terms of single prediction time, OS-ELM has the shortest single prediction time. The single maximum prediction time of TOS-ELM on the Rossler, Chen and Lorentz data sets is clearly longer than that of the other algorithms, because its design weight is kept at a constant value, so the output weight oscillates severely and needs a long time to converge, which increases the single prediction time. The single maximum prediction time of ML-OSELM does not exceed that of the TOS-ELM algorithm.
The above analysis shows that ML-OSELM has data generalization ability similar to that of OS-ELM and TOS-ELM, together with faster data training and prediction.
FIGS. 2-6 show the single-prediction-error variation of each algorithm on the 5 data sets, where the single error is computed as E_i = |S_i − T_i|. From the results in the figures, the ML-OSELM algorithm is clearly better than the other algorithms in terms of the fluctuation of the single error and the convergence of the algorithm. As can be seen from FIG. 6, during prediction of the Lorentz data set the TOS-ELM algorithm produces a large prediction error in the period range [1690,1700], whereas the prediction error of the ML-OSELM algorithm in that region is smaller, and the ML-OSELM algorithm produces no large prediction error in the period range [2300,2600].
Second, to further verify the effect of the ML-OSELM algorithm in practical applications, 3 sets of IGBT accelerated life experiments were performed. In these experiments, failure of the IGBT module is accelerated under thermo-mechanical stress, and aging is characterized by a rise in the collector-emitter saturation voltage drop V_CE(on). When V_CE(on) rises by more than 15% of its initial value, the IGBT is judged to be severely aged, and the actual remaining service life of the IGBT is determined accordingly. The accelerated life test conditions and training sample sizes are shown in Table 3.
TABLE 3
Experiment | Switching frequency | Preset temperature difference | Actual temperature difference | Training sample size | Total sample size
F1 | 5K | 100 | 118 | 500 | 2232
F3 | 5K | 50 | 61 | 500 | 2232
F4 | 1K | 50 | 65 | 500 | 4000
The results of experimental prediction using the ML-OSELM algorithm are shown in FIGS. 7 to 9, and the experimental error comparison is shown in Table 4.
TABLE 4
Experiment | Prediction algorithm | Actual failure period | Predicted failure period | Prediction error
F1 | ML-OSELM | 1530 | 1520 | 10 (RMSE = 1.034e-2)
F3 | ML-OSELM | 1386 | 1378 | 8 (RMSE = 1.022e-2)
F4 | ML-OSELM | 2675 | 2669 | 6 (RMSE = 1.009e-2)
The above experimental prediction results show that the ML-OSELM algorithm achieves efficient and accurate prediction with a small training sample size, and thus has strong applicability to non-stationary data prediction.
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the present invention, it should be understood that the present invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are permissible as long as they remain within the spirit and scope of the present invention as defined by the appended claims, and all inventions making use of the inventive concept are protected.

Claims (1)

1. A time series online prediction method based on multi-information perception is characterized by comprising the following steps:
(1) initialization of offline training set and network model by offline data
(1.1) Select offline data to form an offline training set
[training-set definition given as a formula image in the original, not reproduced]
where N_0 is the offline training-set sample size, x_i is the input data at time i, and t_i is the output data at time i; R^n and R^m denote real-number domains of dimension n and m, respectively;
(1.2) Randomly generate the hidden-layer coefficients of the SLFN network
[hidden-layer coefficient set given as a formula image in the original, not reproduced]
where w_i denotes the coefficients between the input layer and the hidden layer and b_i the hidden-layer coefficient for index i; R denotes the real-number field; obtain the initial hidden-layer output matrix H_0 and the initial hidden-layer output weight β_0 through pseudo-inverse calculation, completing the initial offline prediction; finally, use the offline training set to calculate the optimal number of hidden-layer nodes L_0 under the condition of minimum training error;
(2) Calculating the point similarity and the region similarity
Let the input data at the current time i be [(x_i)] {x_i ∈ R^n}, let the adjacent-point input data at time j be [(x_j)] {x_j ∈ R^n, j = i−1}, and let the neighborhood input data be [(x_{q−ℓ+1}, x_{q−ℓ+2}, ..., x_{q−1}, x_q)] {x_q ∈ R^n, q = i−1, ℓ > 1}, where ℓ denotes the neighborhood time span;
using the Euclidean-distance similarity formula, calculate the point similarity γ_ij = γ(x_i, x_j) and the region similarity γ_iq = γ(x_i, E(x_{q−ℓ+1}, ..., x_q));
(3) Calculating input data similarity weight, hidden layer output weight and prediction error according to the point similarity and the region similarity;
The input-data similarity weight at time i is:
[similarity-weight formula given as an image in the original, not reproduced]
where γ_ij is the point similarity and γ_iq is the region similarity;
The hidden-layer output weight at time i is:
[output-weight formula given as an image in the original, not reproduced]
where H_i is the hidden-layer output matrix of the input data at time i:
[hidden-layer output matrix given as an image in the original, not reproduced]
The prediction error at time i+1 is ε_{i+1} = t_{i+1} − S_{i+1}, where S_{i+1} = H_{i+1} β_i is the predicted value at time i+1;
(4) Calculate the prediction-error weight from the input-data similarity weight and the prediction error at the current time: λ_{i+1} = 1 − exp(−1/(Ω_{i+1} × ε_{i+1}));
(5) Updating an online training set;
The online training set consists of a scatter set and a region set connected in series. The scatter set is updated in real time on a first-in-first-out basis, while the region set is updated only when triggered; the time range of the scatter set precedes that of the region set. The initial scatter set is the offline initial training set and the initial region set is empty;
The capacity of the scatter set for online updating is set as:
[capacity formula for N_d given as an image in the original, not reproduced]
The capacity of the region set for online updating is set as Q = L_k − D, where L_k is the number of hidden-layer nodes of the SLFN network at time k.
Let the scatter set at time k be [(x_k, t_k, λ_k)] {x_k ∈ R^n, t_k ∈ R^m, λ_k ∈ R}. Complete the prediction according to steps (2)-(4), sort the prediction-error weights λ_1 ~ λ_k of the first k times in descending order, and select the data corresponding to the first N_d λ values to update the scatter set:
[scatter-set update expression and its accompanying definition given as formula images in the original, not reproduced]
When the area set is updated, Q continuous time data before the kth time are selected as area set online updating data, and the area set online updating data are expressed as follows:
Figure FDA0002595993140000024
when scattered point set and area set exist NsWhen the same data exists, the repeated data in the scatter point set is deleted, and N is added in the area setsThe data, namely:
Figure FDA0002595993140000025
(6) online prediction of time series;
(6.1) Calculate the parameter Δρ_k at time k: Δρ_k = Δe_k − Δe_{k−1}, where Δe_k = ||e_k||² − ||e_{k−1}||², e_k is the output error at time k, and Δe_k is the difference between the squared output errors at time k and time k−1;
(6.2) If Δρ_k > 0, use the formula in step (4)
[output-weight formula given as an image in the original, not reproduced]
to compute the hidden-layer output weight β_k, and use S_{k+1} = H_{k+1} β_k to obtain the predicted value at time k+1;
(6.3) If Δρ_k < 0, trigger the online update of the region set. The specific process is as follows:
(6.3.1) Add ΔL hidden-layer nodes to the SLFN network model, satisfying L_k + ΔL ≤ L_max, where L_max is the maximum number of hidden-layer nodes in the network design; randomly generate ΔL new hidden-layer coefficients;
(6.3.2) Calculate the hidden-layer output matrix after adding the hidden-layer nodes:
[hidden-layer output matrix expressions given as images in the original, not reproduced]
where N is the sample size of the data set and N = L_k;
(6.3.3) Calculate the hidden-layer output matrix at this point:
[update formula given as an image in the original, not reproduced]
where W = H((I − H_0 H_0^+)H)^+, U = H_0^+(I − H^T D), and H_0^+ is the generalized inverse matrix of H_0;
(6.3.4) Calculate the hidden-layer output matrix of the new network structure for the sample at time k:
[expressions given as images in the original, not reproduced]
(6.3.5) Substitute into the corresponding equation (given as a formula image in the original, not reproduced) to obtain an estimated value;
(6.3.6) Substitute the estimated value into the iteration formula (given as a formula image in the original, not reproduced) and iterate, where τ is the number of iterations; stop the iteration when the convergence criterion (given as a formula image) is met, with κ a constant; let β_k take the value so obtained, and again use S_{k+1} = H_{k+1} β_k to obtain the predicted value at time k+1.
CN202010709391.XA 2020-07-22 2020-07-22 Time series online prediction method based on multi-information perception Pending CN111950772A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010709391.XA CN111950772A (en) 2020-07-22 2020-07-22 Time series online prediction method based on multi-information perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010709391.XA CN111950772A (en) 2020-07-22 2020-07-22 Time series online prediction method based on multi-information perception

Publications (1)

Publication Number Publication Date
CN111950772A true CN111950772A (en) 2020-11-17

Family

ID=73340807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010709391.XA Pending CN111950772A (en) 2020-07-22 2020-07-22 Time series online prediction method based on multi-information perception

Country Status (1)

Country Link
CN (1) CN111950772A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113570070A (en) * 2021-09-23 2021-10-29 深圳市信润富联数字科技有限公司 Streaming data sampling and model updating method, device, system and storage medium



Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication

Application publication date: 20201117