CN111428201A - Prediction method for time series data based on empirical mode decomposition and feedforward neural network - Google Patents


Info

Publication number
CN111428201A
CN111428201A
Authority
CN
China
Prior art keywords
data
data set
variable
training
value
Prior art date
Legal status
Granted
Application number
CN202010230486.3A
Other languages
Chinese (zh)
Other versions
CN111428201B (en)
Inventor
姚若侠
刘云鹤
Current Assignee
Shaanxi Normal University
Original Assignee
Shaanxi Normal University
Priority date
Filing date
Publication date
Application filed by Shaanxi Normal University
Priority to CN202010230486.3A
Publication of CN111428201A
Application granted
Publication of CN111428201B
Legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 — Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 — Complex mathematical operations
    • G06F 17/18 — Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06F 17/16 — Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/044 — Recurrent networks, e.g. Hopfield networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Algebra (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A prediction method for time series data based on empirical mode decomposition and a feedforward neural network comprises the steps of data set missing-value processing, one-hot encoding, dimensionality reduction by principal component analysis, empirical mode decomposition, data standardization, feedforward neural network training, and testing on a test set. The invention adopts principal component analysis for dimensionality reduction together with empirical mode decomposition. Dimensionality reduction decreases the number of prediction variables while the resulting data retain most of the information of the original data, and it ensures that no variable obtained after dimensionality reduction carries duplicated original information. When the feedforward neural network is trained, the eigenmode functions are used in place of the original time series data and the dimension-reduced data set is used as input, which reduces the number of variables, yields accurate results, and greatly shortens training time. The method can be used for predicting time series data.

Description

Prediction method for time series data based on empirical mode decomposition and feedforward neural network
Technical Field
The invention belongs to the field of time series data prediction, and particularly relates to EMD decomposition, PCA dimensionality reduction, BP neural network training, and related methods.
Background
Many methods are available for predicting time series data, such as vector autoregressive models, autoregressive moving average models, autoregressive integrated moving average models, and regression-based methods such as linear support vector regression. These models often assume a deterministic distribution or functional form of the time series and fail to capture complex underlying nonlinear relationships. Other models, such as Gaussian processes, incur a high computational cost when processing large-scale data.
At present, most time series data recorded in real life has no clear functional characteristics. Feeding such data directly into a neural network for training cannot yield an optimal model and consumes a large amount of time, and existing time series prediction methods can predict only a point or two ahead without achieving a satisfactory prediction effect.
Disclosure of Invention
The invention aims to overcome the defects of existing prediction methods and provides a prediction method for time series data based on empirical mode decomposition and a feedforward neural network that gives accurate prediction results with fast data processing and high precision.
The technical scheme adopted for solving the technical problems comprises the following steps:
(1) processing missing values of a data set
For a time series data set Ab, when missing values occur at more than 3 consecutive positions, the rows containing the missing values are deleted; when missing values occupy 1 to 3 consecutive positions, they are filled by the mean interpolation method among the missing-value interpolation methods, giving a time series data set A{A1, A2, A3}, where A1{x1, x2, …, xm | xi = (xi1, xi2, …, xiq)^T, i is a finite positive integer; q is the number of samples of data set A} is the class variables in the data set, A2{y1, y2, …, yn | yj = (yj1, yj2, …, yjq)^T, j is a finite positive integer} is the other variables in the data set excluding the class variables and the time series data to be predicted, and A3{z1, z2, …, zp | zl = (zl1, zl2, …, zlq)^T, l is a finite positive integer} is the time series data to be predicted in the data set.
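The missing-value rule above (delete rows when more than 3 consecutive values are missing, otherwise fill by mean interpolation) can be sketched as follows; the function name and the use of NumPy are illustrative assumptions, not part of the patent:

```python
import numpy as np

def handle_missing(values, max_fill_run=3):
    """Fill NaN runs of length <= max_fill_run with the column mean;
    drop positions belonging to longer runs, per step (1)."""
    values = np.asarray(values, dtype=float)
    isnan = np.isnan(values)
    mean = values[~isnan].mean()           # mean of the observed values
    keep = np.ones(len(values), dtype=bool)
    i = 0
    while i < len(values):
        if isnan[i]:
            j = i
            while j < len(values) and isnan[j]:
                j += 1                     # extend to the end of the NaN run
            if j - i <= max_fill_run:
                values[i:j] = mean         # mean interpolation for short gaps
            else:
                keep[i:j] = False          # delete rows with long gaps
            i = j
        else:
            i += 1
    return values[keep]
```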
(2) One-hot encoding process
The class variables A1 in the data set are processed by the one-hot encoding method: for each class variable xi in A1, the number of distinct class values is counted and consecutive natural numbers are assigned to the class values of xi, the number of natural numbers equaling the number of class values; one-hot encoding then converts A1 into a binary coding matrix B{B1, B2, …, Bm | Bi is the data obtained by one-hot encoding the class variable xi}.
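A minimal sketch of the one-hot encoding step above; the function name and the sorted class ordering are illustrative assumptions:

```python
import numpy as np

def one_hot(column):
    """Map each distinct class value to a consecutive natural number,
    then expand to a binary (one-hot) coding matrix, as in step (2)."""
    categories = sorted(set(column))                   # distinct class values
    index = {c: i for i, c in enumerate(categories)}   # class value -> natural number
    matrix = np.zeros((len(column), len(categories)), dtype=int)
    for row, value in enumerate(column):
        matrix[row, index[value]] = 1                  # single valid bit per row
    return matrix
```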
(3) Principal component analysis method for reducing dimension
The timestamp variable is removed from the other variables A2 of the data set to obtain the remaining variables A4{s1, s2, …, st | s1, s2, …, st are the variables of A2 remaining after removing the timestamp variable, t is a finite positive integer, t ≤ n}. The remaining variables A4 are plotted and their correlations observed, and A4 is reduced in dimension by the principal component analysis method to obtain a matrix P{p1, p2, …, pk | p1, p2, …, pk are the data obtained from A4 after dimensionality reduction by principal component analysis, k is a finite positive integer, k ≤ t}.
(4) Empirical mode decomposition
Empirical mode decomposition is performed on the time series data A3 in the data set to obtain a matrix I{IMF1, IMF2, …, IMFs, r | IMFe is an eigenmode function, e = 1, 2, …, s; s is the number of eigenmode functions obtained by empirical mode decomposition; r is the residue} containing the eigenmode functions and the residue.
(5) Data normalization process
Each eigenmode function IMFe and the residue r are spliced with the binary coding matrix B and the matrix P to form s new data sets Ce{IMFe, B, P} and one data set Cr{r, B, P}. All data sets Ce and the data set Cr are processed by the data standardization method to obtain the corresponding data sets De{d1, d2, …, dg | d1, d2, …, dg are the data obtained after Ce is processed by the data standardization method, g is a finite positive integer} and Dr{d1, d2, …, dg | d1, d2, …, dg are the data obtained after Cr is processed by the data standardization method}, and all data in Ce and Cr are projected onto the interval [−1, 1] according to the following formula:

x* = (x − xmean) / (xmax − xmin)

where x is the original value of a variable, x* is the standardized data value of each variable in data sets Ce and Cr, a value in the interval [−1, 1]; xmean is the mean of the values of each variable in Ce and Cr; xmax is the maximum of each variable's values in Ce and Cr; and xmin is the minimum of each variable's values in Ce and Cr.
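The projection formula above is plain mean normalization, which bounds the result in [−1, 1]; a one-line sketch with an illustrative function name:

```python
import numpy as np

def normalize(column):
    """Mean normalization of step (5): x* = (x - mean) / (max - min),
    which projects every value into the interval [-1, 1]."""
    x = np.asarray(column, dtype=float)
    return (x - x.mean()) / (x.max() - x.min())
```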
(6) Feed-forward neural network training
The data sets De and Dr serve as inputs to the feedforward neural network, and the samples of De and Dr are divided into a training set and a test set with a sample ratio of 450:1. The eigenmode functions IMFe and the residue r corresponding to De and Dr serve as outputs, and IMFe and r are likewise divided into a training set and a test set with a sample ratio of 450:1. The training sets are input into the feedforward neural network in turn to train the prediction models; training stops when the training-target minimum error falls below 0.001, yielding the prediction models.
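The training step can be sketched with a one-hidden-layer feedforward network trained by plain gradient descent and stopped at the patent's 0.001 error threshold; the toy data, layer sizes, and learning rate are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one (D_e, IMF_e) input/output pair; the real inputs would
# be the standardized data sets of step (5), split 450:1 into train and test.
X = rng.uniform(-1.0, 1.0, (200, 3))
y = (0.5 * X[:, 0] - 0.3 * X[:, 2])[:, None]      # illustrative target

W1 = rng.normal(0.0, 0.5, (3, 8)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(0.0, 0.5, (8, 1)); b2 = np.zeros(1)   # output layer
lr, mse = 0.1, np.inf
for epoch in range(20000):
    H = np.tanh(X @ W1 + b1)                      # forward pass
    out = H @ W2 + b2
    err = out - y
    mse = float((err ** 2).mean())
    if mse < 0.001:                               # stop at the error threshold
        break
    dW2 = H.T @ err / len(X); db2 = err.mean(axis=0)   # backpropagation
    dH = (err @ W2.T) * (1.0 - H ** 2)
    dW1 = X.T @ dH / len(X); db1 = dH.mean(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```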
(7) Testing the test set
The test sets are input into the corresponding prediction models in turn to obtain prediction results; all prediction results are added to obtain the sum of predicted values, and the standard deviation between the sum of the prediction results and the true values is determined.
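Step (7) can be sketched as below; the patent does not spell out the exact deviation formula, so a root-mean-square deviation between the recombined prediction and the true series is assumed here, with illustrative names:

```python
import numpy as np

def combine_and_score(per_model_preds, true_values):
    """Sum the per-IMF/residue model predictions (step (7)) and report the
    root-mean-square deviation of the sum against the true series."""
    total = np.sum(per_model_preds, axis=0)        # sum of predicted IMFs + residue
    error = total - np.asarray(true_values, dtype=float)
    return total, float(np.sqrt((error ** 2).mean()))
```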
In the principal component analysis method dimensionality reduction step (3), the principal component analysis method is as follows:
1) Constructing the sample matrix
The standardized p-dimensional random vector x of the original data variables is obtained, with n samples of each variable:
x = (x1, x2, …, xp)^T
xi = (xi1, xi2, …, xin)^T, i = 1, 2, …, p
where n and p are finite positive integers with n > p. The matrix of samples is standardized as follows:
Zij = (xij − x̄j) / Sj, i = 1, 2, …, n; j = 1, 2, …, p
x̄j = (1/n) Σ(i=1..n) xij
Sj² = (1/(n − 1)) Σ(i=1..n) (xij − x̄j)²
where xij is the i-th sample of the j-th variable, Zij is the standardized value of xij, x̄j is the mean of the j-th variable, and Sj² is the sample variance of the j-th variable; the transformation yields the standardized matrix Z.
2) Determining the correlation coefficient matrix
The correlation coefficient matrix R = (rij)p×p is determined as follows:
rij = (1/(n − 1)) Σ(k=1..n) Zki Zkj, i, j = 1, 2, …, p
i.e. R = Z^T Z / (n − 1), where i and j are finite positive integers.
3) Determining the unit eigenvectors
The p characteristic roots λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 of R are obtained from the characteristic equation:
|R − λIp| = 0
The value of m is obtained from the following criterion, which determines the principal components retained:
(λ1 + λ2 + … + λm) / (λ1 + λ2 + … + λp) ≥ t
where t represents the utilization rate of the information. For each λj, j = 1, 2, …, m, the system
Rbj = λj bj, bj^T bj = 1
is solved to obtain the unit eigenvector bj.
4) Converting the standardized variables into principal components
The principal components are determined as follows:
Uij = zi^T bj, i = 1, 2, …, n; j = 1, 2, …, m
where zi is the i-th row of the standardized matrix Z and Uj = (U1j, U2j, …, Unj)^T is the j-th principal component.
5) The m principal components obtained are combined by weighted summation, with the variance contribution rate of each principal component as its weight, to give the final evaluation value.
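Steps 1) to 5) above amount to an eigendecomposition of the correlation matrix; a compact sketch follows, where the function name and the default utilization rate t = 0.85 are illustrative assumptions:

```python
import numpy as np

def pca_correlation(X, t=0.85):
    """PCA as in steps 1)-5): standardize, form the correlation matrix,
    and keep the first m components whose eigenvalues reach utilization t."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized matrix Z
    R = Z.T @ Z / (len(X) - 1)                         # correlation coefficient matrix
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]                  # descending characteristic roots
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()         # cumulative utilization
    m = int(np.searchsorted(ratio, t) + 1)             # smallest m reaching t
    U = Z @ eigvecs[:, :m]                             # principal components
    return U, eigvals[:m]
```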
In the empirical mode decomposition step (4) of the present invention, the empirical mode decomposition method comprises the following steps:
1) Find all the local maximum and minimum points of the original time series x(t), and fit them with a cubic spline interpolation function to form the upper and lower envelopes of the data.
2) The mean m1(t) of the upper and lower envelopes is determined as follows:
m1(t) = (up(t) + low(t)) / 2
where up(t) is the upper envelope formed by the maxima and low(t) is the lower envelope formed by the minima.
3) Determining eigenmode functions
x(t)-m1(t)=h1(t)
Treating h1(t) as the new signal x(t), repeat steps 1) and 2) until h1(t) satisfies the conditions of an eigenmode function described below.
a. Over the whole time range, the number of local extreme points and the number of zero-crossings of the function are equal or differ by at most 1.
b. At any time, the mean of the upper envelope defined by the local maxima and the lower envelope defined by the local minima is zero.
4) The residual component r1(t) is determined as follows
r1(t)=x(t)-h1(t)
Where h1(t) is the first eigenmode function.
5) Taking the residual component r1(t) as new original data, and repeating the steps 1) to 4) until all eigenmode functions and 1 trend term are obtained.
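One sifting pass of steps 1) to 3) can be sketched as follows; to keep the sketch dependency-free, linear interpolation stands in for the cubic splines named in the patent, and the function name is illustrative:

```python
import numpy as np

def sift_once(x, t):
    """One EMD sifting step: locate extrema, build upper/lower envelopes,
    and subtract their mean m1(t) to get the candidate IMF h1(t)."""
    maxima = [i for i in range(1, len(x) - 1) if x[i] > x[i - 1] and x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i] < x[i - 1] and x[i] < x[i + 1]]
    up = np.interp(t, t[maxima], x[maxima])    # upper envelope up(t)
    low = np.interp(t, t[minima], x[minima])   # lower envelope low(t)
    m1 = (up + low) / 2.0                      # envelope mean m1(t)
    return x - m1                              # candidate eigenmode function h1(t)
```

Repeating this pass until the two IMF conditions hold, then subtracting the IMF and sifting the residue again, reproduces the full decomposition of steps 1) to 5).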
The invention processes the missing values of a time series data set, applies the one-hot encoding method to the class variables among the external variables of the processed data set, reduces the dimension of the remaining variables (excluding the class variables, the time series data to be predicted, and the timestamp variable) by the principal component analysis method, decomposes the time series data to be predicted into eigenmode functions and a residue by the empirical mode decomposition method, splices the results into new data sets, processes the resulting data sets by the data standardization method, and inputs them into a feedforward neural network for training to obtain prediction models with which the test set is predicted. Because the data set is processed in this way, the training process saves substantial time, the number of iterations needed to reach convergence is reduced, and the training precision is improved. The method has a good prediction effect on time series data with complex signal characteristics.
Drawings
FIG. 1 is a flow chart of embodiment 1 of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the drawings and examples, but the present invention is not limited to the embodiments described below.
Example 1
In FIG. 1, the steps of the prediction method for time series data based on empirical mode decomposition and a feedforward neural network of this embodiment are as follows:
(1) processing missing values of a data set
For a time series data set Ab, when missing values occur at more than 3 consecutive positions, the rows containing the missing values are deleted; when missing values occupy 1 to 3 consecutive positions, they are filled by the mean interpolation method among the missing-value interpolation methods, giving a time series data set A{A1, A2, A3}, where A1{x1, x2, …, xm | xi = (xi1, xi2, …, xiq)^T, i is a finite positive integer; q is the number of samples of data set A} is the class variables in the data set, A2{y1, y2, …, yn | yj = (yj1, yj2, …, yjq)^T, j is a finite positive integer} is the other variables in the data set excluding the class variables and the time series data to be predicted, and A3{z1, z2, …, zp | zl = (zl1, zl2, …, zlq)^T, l is a finite positive integer} is the time series data to be predicted in the data set.
In this embodiment, the missing values of the time series data set Ab are analyzed and processed, which improves the precision of the feedforward neural network training and yields a more accurate training result.
(2) One-hot encoding process
The class variables A1 in the data set are processed by the one-hot encoding method: for each class variable xi in A1, the number of distinct class values is counted and consecutive natural numbers are assigned to the class values of xi, the number of natural numbers equaling the number of class values; one-hot encoding then converts A1 into a binary coding matrix B{B1, B2, …, Bm | Bi is the data obtained by one-hot encoding the class variable xi}.
Because this embodiment applies the one-hot encoding method to the class variables, each encoded value has only one valid bit, and the positions of the valid bits differ, ensuring that the encoded data are mutually distinct. The distances between the class values are recalculated for data processed by the one-hot encoding method, and the processed data can be used for feedforward neural network training.
(3) Principal component analysis method for reducing dimension
The timestamp variable is removed from the other variables A2 of the data set to obtain the remaining variables A4{s1, s2, …, st | s1, s2, …, st are the variables of A2 remaining after removing the timestamp variable, t is a finite positive integer, t ≤ n}. The remaining variables A4 are plotted and their correlations observed, and A4 is reduced in dimension by the principal component analysis method to obtain a matrix P{p1, p2, …, pk | p1, p2, …, pk are the data obtained from A4 after dimensionality reduction by principal component analysis, k is a finite positive integer, k ≤ t}.
The principal component analysis method of this example is as follows:
1) Constructing the sample matrix
The standardized p-dimensional random vector x of the original data variables is obtained, with n samples of each variable:
x = (x1, x2, …, xp)^T
xi = (xi1, xi2, …, xin)^T, i = 1, 2, …, p
where n and p are finite positive integers with n > p. The matrix of samples is standardized as follows:
Zij = (xij − x̄j) / Sj, i = 1, 2, …, n; j = 1, 2, …, p
x̄j = (1/n) Σ(i=1..n) xij
Sj² = (1/(n − 1)) Σ(i=1..n) (xij − x̄j)²
where xij is the i-th sample of the j-th variable, Zij is the standardized value of xij, x̄j is the mean of the j-th variable, and Sj² is the sample variance of the j-th variable; the transformation yields the standardized matrix Z.
2) Determining the correlation coefficient matrix
The correlation coefficient matrix R = (rij)p×p is determined as follows:
rij = (1/(n − 1)) Σ(k=1..n) Zki Zkj, i, j = 1, 2, …, p
i.e. R = Z^T Z / (n − 1), where i and j are finite positive integers.
3) Determining the unit eigenvectors
The p characteristic roots λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 of R are obtained from the characteristic equation:
|R − λIp| = 0
The value of m is obtained from the following criterion, which determines the principal components retained:
(λ1 + λ2 + … + λm) / (λ1 + λ2 + … + λp) ≥ t
where t represents the utilization rate of the information. For each λj, j = 1, 2, …, m, the system
Rbj = λj bj, bj^T bj = 1
is solved to obtain the unit eigenvector bj.
4) Converting the standardized variables into principal components
The principal components are determined as follows:
Uij = zi^T bj, i = 1, 2, …, n; j = 1, 2, …, m
where zi is the i-th row of the standardized matrix Z and Uj = (U1j, U2j, …, Unj)^T is the j-th principal component.
5) The m principal components obtained are combined by weighted summation, with the variance contribution rate of each principal component as its weight, to give the final evaluation value.
The principal component analysis method finds the variables that are correlated with one another and reduces the number of prediction variables through dimensionality reduction, while the resulting data contain most of the information of the original data, and it is guaranteed that no variable obtained after dimensionality reduction carries duplicated original information. When the feedforward neural network is trained, the dimension-reduced data set is used as input, which gives accurate results, reduces the number of variables, greatly shortens the training time, and improves the training precision.
(4) Empirical mode decomposition
Empirical mode decomposition is performed on the time series data A3 in the data set to obtain a matrix I{IMF1, IMF2, …, IMFs, r | IMFe is an eigenmode function, e = 1, 2, …, s; s is the number of eigenmode functions obtained by empirical mode decomposition; r is the residue} containing the eigenmode functions and the residue.
The empirical mode decomposition method of the embodiment comprises the following steps:
1) Find all the local maximum and minimum points of the original time series x(t), and fit them with a cubic spline interpolation function to form the upper and lower envelopes of the data.
2) The mean m1(t) of the upper and lower envelopes is determined as follows:
m1(t) = (up(t) + low(t)) / 2
where up(t) is the upper envelope formed by the maxima and low(t) is the lower envelope formed by the minima.
3) Determining eigenmode functions
x(t)-m1(t)=h1(t)
Treating h1(t) as the new signal x(t), repeat steps 1) and 2) until h1(t) satisfies the conditions of an eigenmode function described below.
a. Over the whole time range, the number of local extreme points and the number of zero-crossings of the function are equal or differ by at most 1.
b. At any time, the mean of the upper envelope defined by the local maxima and the lower envelope defined by the local minima is zero.
4) The residual component r1(t) is determined as follows
r1(t)=x(t)-h1(t)
Where h1(t) is the first eigenmode function.
5) Taking the residual component r1(t) as new original data, and repeating the steps 1) to 4) until all eigenmode functions and 1 trend term are obtained.
Empirical mode decomposition of the time series data A3 in the data set yields components that each have the characteristics of a single eigenmode function; in the neural network training, the eigenmode functions are used for training instead of the original time series data, which saves training time and improves training precision.
(5) Data normalization process
Each eigenmode function IMFe and the residue r are spliced with the binary coding matrix B and the matrix P to form s new data sets Ce{IMFe, B, P} and one data set Cr{r, B, P}. All data sets Ce and the data set Cr are processed by the data standardization method to obtain the corresponding data sets De{d1, d2, …, dg | d1, d2, …, dg are the data obtained after Ce is processed by the data standardization method, g is a finite positive integer} and Dr{d1, d2, …, dg | d1, d2, …, dg are the data obtained after Cr is processed by the data standardization method}, and all data in Ce and Cr are projected onto the interval [−1, 1] according to the following formula:

x* = (x − xmean) / (xmax − xmin)

where x is the original value of a variable, x* is the standardized data value of each variable in data sets Ce and Cr, a value in the interval [−1, 1]; xmean is the mean of the values of each variable in Ce and Cr; xmax is the maximum of each variable's values in Ce and Cr; and xmin is the minimum of each variable's values in Ce and Cr.
Processing the data sets with the standardization method in the above step and using the processed data sets for feedforward neural network training improves the convergence speed and training precision of the model.
(6) Feed-forward neural network training
The data sets De and Dr serve as inputs to the feedforward neural network, and the samples of De and Dr are divided into a training set and a test set with a sample ratio of 450:1. The eigenmode functions IMFe and the residue r corresponding to De and Dr serve as outputs, and IMFe and r are likewise divided into a training set and a test set with a sample ratio of 450:1. The training sets are input into the feedforward neural network in turn to train the prediction models; training stops when the training-target minimum error falls below 0.001, yielding the prediction models.
In this embodiment, the processed data sets are input into the feedforward neural network to obtain all the prediction models, which saves training time, reduces the number of training iterations required for convergence, and improves training precision.
(7) Testing the test set
The test sets are input into the corresponding prediction models in turn to obtain prediction results; all prediction results are added to obtain the sum of predicted values, and the standard deviation between the sum of the prediction results and the true values is determined.

Claims (3)

1. A prediction method for time series data based on empirical mode decomposition and a feedforward neural network is characterized by comprising the following steps:
(1) processing missing values of a data set
For a time series data set Ab, when missing values occur at more than 3 consecutive positions, the rows containing the missing values are deleted; when missing values occupy 1 to 3 consecutive positions, they are filled by the mean interpolation method among the missing-value interpolation methods, giving a time series data set A{A1, A2, A3}, where A1{x1, x2, …, xm | xi = (xi1, xi2, …, xiq)^T, i is a finite positive integer; q is the number of samples of data set A} is the class variables in the data set, A2{y1, y2, …, yn | yj = (yj1, yj2, …, yjq)^T, j is a finite positive integer} is the other variables in the data set excluding the class variables and the time series data to be predicted, and A3{z1, z2, …, zp | zl = (zl1, zl2, …, zlq)^T, l is a finite positive integer} is the time series data to be predicted in the data set;
(2) one-hot encoding process
The class variables A1 in the data set are processed by the one-hot encoding method: for each class variable xi in A1, the number of distinct class values is counted and consecutive natural numbers are assigned to the class values of xi, the number of natural numbers equaling the number of class values; one-hot encoding then converts the class variables A1 into a binary coding matrix B{B1, B2, …, Bm | Bi is the data obtained by one-hot encoding the class variable xi};
(3) principal component analysis method for reducing dimension
The timestamp variable is removed from the other variables A2 of the data set to obtain the remaining variables A4{s1, s2, …, st | s1, s2, …, st are the variables of A2 remaining after removing the timestamp variable, t is a finite positive integer, t ≤ n}; the remaining variables A4 are plotted and their correlations observed, and A4 is reduced in dimension by the principal component analysis method to obtain a matrix P{p1, p2, …, pk | p1, p2, …, pk are the data obtained from A4 after dimensionality reduction by principal component analysis, k is a finite positive integer, k ≤ t};
(4) empirical mode decomposition
Empirical mode decomposition is performed on the time series data A3 in the data set to obtain a matrix I{IMF1, IMF2, …, IMFs, r | IMFe is an eigenmode function, e = 1, 2, …, s; s is the number of eigenmode functions obtained by empirical mode decomposition; r is the residue} containing the eigenmode functions and the residue;
(5) data normalization process
Each eigenmode function IMFe and the residue r are spliced with the binary coding matrix B and the matrix P to form s new data sets Ce{IMFe, B, P} and one data set Cr{r, B, P}; all data sets Ce and the data set Cr are processed by the data standardization method to obtain the corresponding data sets De{d1, d2, …, dg | d1, d2, …, dg are the data obtained after Ce is processed by the data standardization method, g is a finite positive integer} and Dr{d1, d2, …, dg | d1, d2, …, dg are the data obtained after Cr is processed by the data standardization method}, and all data in Ce and Cr are projected onto the interval [−1, 1] according to the following formula:

x* = (x − xmean) / (xmax − xmin)

where x is the original value of a variable, x* is the standardized data value of each variable in data sets Ce and Cr, a value in the interval [−1, 1]; xmean is the mean of the values of each variable in Ce and Cr; xmax is the maximum of each variable's values in Ce and Cr; and xmin is the minimum of each variable's values in Ce and Cr;
(6) feed-forward neural network training
The data sets De and Dr serve as inputs to the feedforward neural network, and the samples of De and Dr are divided into a training set and a test set with a sample ratio of 450:1; the eigenmode functions IMFe and the residue r corresponding to De and Dr serve as outputs, and IMFe and r are likewise divided into a training set and a test set with a sample ratio of 450:1; the training sets are input into the feedforward neural network in turn to train the prediction models, and training stops when the training-target minimum error falls below 0.001, yielding the prediction models;
(7) testing the test set
The test sets are input into the corresponding prediction models in turn to obtain prediction results; all prediction results are added to obtain the sum of predicted values, and the standard deviation between the sum of the prediction results and the true values is determined.
2. The prediction method for time series data based on empirical mode decomposition and a feedforward neural network according to claim 1, characterized in that in the principal component analysis dimensionality-reduction step (3), the principal component analysis method is:
(1) Constructing the sample matrix
The standardized p-dimensional random vector x of the original data variables is obtained, with n samples of each variable:
x = (x1, x2, …, xp)^T
xi = (xi1, xi2, …, xin)^T, i = 1, 2, …, p
where n and p are finite positive integers with n > p, and the matrix of samples is standardized as follows:
Zij = (xij − x̄j) / Sj, i = 1, 2, …, n; j = 1, 2, …, p
x̄j = (1/n) Σ(i=1..n) xij
Sj² = (1/(n − 1)) Σ(i=1..n) (xij − x̄j)²
where xij is the i-th sample of the j-th variable, Zij is the standardized value of xij, x̄j is the mean of the j-th variable, and Sj² is the sample variance of the j-th variable; the transformation yields the standardized matrix Z;
(2) Determining the correlation coefficient matrix
The correlation coefficient matrix R = (rij)p×p is determined as follows:
rij = (1/(n − 1)) Σ(k=1..n) Zki Zkj, i, j = 1, 2, …, p
i.e. R = Z^T Z / (n − 1), where i and j are finite positive integers;
(3) determining unit eigenvectors
The p characteristic roots λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 are obtained according to the following formula:
|R − λIp| = 0
The value of m is obtained according to the following formula, determining the principal components:
(λ1 + λ2 + … + λm)/(λ1 + λ2 + … + λp) ≥ t
wherein t represents the utilization rate of the information; for each λj, solving
Rbj = λjbj
and normalizing
bj° = bj/‖bj‖
yields the unit eigenvectors bj°, j = 1, 2, …, m;
(4) Converting the normalized variables into principal components
The principal components are determined as follows:
Uij = ziᵀ bj°, j = 1, 2, …, m
wherein zi is the i-th row of the normalized matrix Z;
(5) performing weighted summation on the m principal components obtained, the weight of each being its variance contribution rate, to obtain the final evaluation value.
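For illustration, steps (1)–(5) of the principal component analysis above can be sketched in NumPy; the information utilization threshold t = 0.85 and the random test data are assumptions, not values from the patent:

```python
import numpy as np

def pca_evaluate(X, t=0.85):
    """PCA dimension reduction following steps (1)-(5):
    standardize, correlation matrix, eigen-decomposition,
    keep the m components reaching information utilization t,
    and return the variance-weighted evaluation value."""
    n, p = X.shape
    # (1) standardize each column: Z_ij = (x_ij - mean_j) / S_j
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # (2) correlation coefficient matrix R = Z^T Z / (n - 1)
    R = Z.T @ Z / (n - 1)
    # (3) characteristic roots of |R - lambda I| = 0, sorted descending
    lam, B = np.linalg.eigh(R)
    order = np.argsort(lam)[::-1]
    lam, B = lam[order], B[:, order]
    # choose m so that the cumulative variance contribution reaches t
    ratio = lam / lam.sum()
    m = int(np.searchsorted(np.cumsum(ratio), t) + 1)
    # (4) project standardized variables onto the unit eigenvectors
    U = Z @ B[:, :m]          # principal components, shape (n, m)
    # (5) weighted sum, weights = variance contribution rates
    score = U @ ratio[:m]
    return U, score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # assumed toy data
U, score = pca_evaluate(X)
```

Note that `numpy.linalg.eigh` already returns unit-norm eigenvectors, so the normalization bj° = bj/‖bj‖ of step (3) is implicit.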
3. The prediction method for time series data based on empirical mode decomposition and feedforward neural network according to claim 1, wherein in the empirical mode decomposition step (4), the steps of empirical mode decomposition are as follows:
(1) finding all maximum points and minimum points of the original time series data sequence x(t), and fitting cubic spline interpolation functions to form the upper and lower envelopes of the data;
(2) the mean m1(t) of the upper and lower envelopes is determined as follows:
m1(t) = (up(t) + low(t))/2
wherein up(t) is the upper envelope formed by the maxima, and low(t) is the lower envelope formed by the minima;
(3) determining eigenmode functions
x(t)-m1(t)=h1(t)
Treating h1(t) as a new signal x(t), steps (1) and (2) are repeated until h1(t) satisfies the following conditions of an eigenmode function:
1) over the entire time range, the number of local extreme points and the number of zero-crossing points are equal or differ by at most 1;
2) at any moment, the mean of the upper envelope defined by the local maxima and the lower envelope defined by the local minima is zero;
(4) the residual component r1(t) is determined as follows:
r1(t) = x(t) − h1(t)
wherein h1(t) is the first eigenmode function;
(5) taking the residual component r1(t) as new original data, steps (1) to (4) are repeated until all eigenmode functions and one trend term are obtained.
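The sifting procedure of steps (1)–(5) can be sketched as follows. This is a simplified illustration: a fixed number of sifting iterations stands in for checking conditions 1)–2) exactly, and SciPy's `CubicSpline` provides the cubic spline envelopes named in step (1):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def envelope_mean(x):
    """Mean m1(t) of cubic-spline envelopes through maxima and minima."""
    t = np.arange(len(x))
    maxima = np.where((x[1:-1] > x[:-2]) & (x[1:-1] > x[2:]))[0] + 1
    minima = np.where((x[1:-1] < x[:-2]) & (x[1:-1] < x[2:]))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None          # too few extrema: x is a trend/residual term
    up = CubicSpline(maxima, x[maxima])(t)
    low = CubicSpline(minima, x[minima])(t)
    return (up + low) / 2

def emd(x, max_imfs=10, sift_iters=20):
    """Decompose x into IMFs plus one trend term by repeated sifting."""
    imfs, r = [], x.astype(float)
    for _ in range(max_imfs):
        if envelope_mean(r) is None:
            break                        # step (5) stop: r is the trend term
        h = r.copy()
        for _ in range(sift_iters):      # steps (1)-(3): sift h
            m = envelope_mean(h)
            if m is None:
                break
            h = h - m
        imfs.append(h)
        r = r - h                        # step (4): remove the extracted IMF
    return imfs, r

t = np.linspace(0, 1, 500)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 25 * t)  # assumed test signal
imfs, trend = emd(x)
```

By construction the decomposition is exact: the sum of the extracted IMFs plus the trend term reconstructs the original series, which is what allows the per-component forecasts of step (6) to be summed in step (7).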
CN202010230486.3A 2020-03-27 2020-03-27 Prediction method for time series data based on empirical mode decomposition and feedforward neural network Active CN111428201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010230486.3A CN111428201B (en) 2020-03-27 2020-03-27 Prediction method for time series data based on empirical mode decomposition and feedforward neural network

Publications (2)

Publication Number Publication Date
CN111428201A true CN111428201A (en) 2020-07-17
CN111428201B CN111428201B (en) 2023-04-11

Family

ID=71555519

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633333A (en) * 2020-12-11 2021-04-09 广州致新电力科技有限公司 Method for identifying partial discharge defects
CN113326472A (en) * 2021-05-28 2021-08-31 东北师范大学 Pattern extraction and evolution visual analysis method based on time sequence multivariable data
CN115061196A (en) * 2022-08-17 2022-09-16 成都川油瑞飞科技有限责任公司 Micro-seismic signal identification method based on empirical mode decomposition (IMF) guidance
CN116523388A (en) * 2023-04-17 2023-08-01 无锡雪浪数制科技有限公司 Data-driven quality modeling method based on industrial Internet platform
CN117131369A (en) * 2023-10-27 2023-11-28 福建福昇消防服务集团有限公司 Data processing method and system of intelligent safety management and emergency rescue integrated station
CN117668531A (en) * 2023-12-07 2024-03-08 无锡中科光电技术有限公司 EMMD-BP neural network atmospheric pollutant forecasting method based on principal component analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004053659A2 (en) * 2002-12-10 2004-06-24 Stone Investments, Inc Method and system for analyzing data and creating predictive models
WO2016091017A1 (en) * 2014-12-09 2016-06-16 山东大学 Extraction method for spectral feature cross-correlation vector in hyperspectral image classification
CN106126896A (en) * 2016-06-20 2016-11-16 中国地质大学(武汉) The mixed model wind speed forecasting method learnt based on empirical mode decomposition and the degree of depth and system
CN107292453A (en) * 2017-07-24 2017-10-24 国网江苏省电力公司电力科学研究院 A kind of short-term wind power prediction method based on integrated empirical mode decomposition Yu depth belief network
CN110619384A (en) * 2019-08-13 2019-12-27 浙江工业大学 PM2.5 concentration value prediction method based on neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LU GUOBIN et al.: "Research on prediction of time-varying gas emission sequences based on EMD-MFOA-ELM", China Safety Science and Technology *
ZHANG LONG et al.: "Gear fault severity evaluation based on time series models and auto-associative neural networks", Journal of Vibration and Shock *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant