CN102663493A - Delaying nerve network used for time sequence prediction - Google Patents
- Publication number
- CN102663493A (application CN201210079042XA)
- Authority
- CN
- China
- Prior art keywords
- network
- hysteresis
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention belongs to the field of neural networks and time-series prediction analysis, and in particular relates to a hysteresis neural network for time-series prediction. Based on a feedforward neural network structure, a feedforward hysteresis neural network is constructed by replacing the excitation function of each neuron with a hysteresis excitation function. A hybrid method combining gradient descent and a genetic algorithm is used to train the network parameters. The network of the invention is mainly intended for nonlinear time-series prediction analysis.
Description
Technical Field
The invention belongs to the field of neural networks and time-series prediction analysis, relates to a hysteresis neural network for time-series prediction, and in particular relates to a method for realizing time-series predictive analysis by constructing a hysteresis neural network together with its training method.
Background
Time-series predictive analysis has important application value in many fields such as economics, meteorology, geology, hydrology, military affairs, and medicine. Scientifically and correctly predicting and analyzing actual time series can generate great economic and social benefits. For example, prediction of wind-speed time series has wide application prospects in wind-power grid integration, meteorological monitoring, and other fields. Since most systems exhibit complex nonlinear characteristics, the early linear and nonlinear models for time-series analysis have certain limitations in both theoretical analysis and practical application. Neural networks have unique advantages in black-box modeling and are particularly useful for predictive analysis of complex time series whose generating mechanism is unclear. However, when a traditional neural network performs predictive analysis on a nonlinear time series, the known data are used to train the weights and thresholds, and future data are then predicted via the generalization capability of the network. The memory capacity and generalization performance of such a network are determined mainly by its connection weights: the network produces future outputs from the current input alone, so the information contained in the training data, such as historical trends, is not fully exploited. As a result, the prediction performance of traditional neural networks is often unsatisfactory when the time series fluctuates strongly.
Therefore, structurally modifying the network to improve its intrinsic memory capability, and thereby designing a novel prediction network, has important application value.
Drawings
FIG. 1 is a graph comparing the prediction results.
Disclosure of Invention
The invention aims to solve the technical problem of designing a hysteresis neural network and a training algorithm thereof, and realizing the predictive analysis of a time sequence.
The technical scheme adopted by the invention is as follows: a hysteresis neural network for time-series prediction is based on a feedforward neural network, in which a hysteresis excitation function replaces the traditional excitation function to construct a neural network model with hysteresis characteristics. Each neuron selects a different response branch according to the current input and the historical inputs. The connection weights of the network are trained by a gradient-descent learning algorithm, and the hysteresis parameters of the network are optimized by a genetic algorithm. The more flexible structure of the hysteresis neural network gives it better adaptability and helps improve its prediction performance on time series.
The invention aims to provide a hysteresis neural network for time sequence prediction, which is constructed by introducing a hysteresis excitation function into the neural network and utilizes a gradient descent method and a genetic algorithm to construct a hybrid learning algorithm so as to improve the performance of the hysteresis neural network on the time sequence prediction.
Detailed Description
The present invention will be described in further detail with reference to examples.
A traditional neural network usually selects the Sigmoid function as the excitation function of its neurons, as shown in formula (1):
f(s) = (1 + exp(-c·s))^(-1)    (1)
where s is the input to the neuron and f(s) is its output response. The function is monotonic and smooth over (-∞, +∞) with output values in (0, 1). The output of the neuron is related only to the current input value and not to historical inputs; therefore the influence of historical inputs on the current output cannot be reflected, and part of the useful information is lost.
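As a minimal illustration, formula (1) can be written directly in Python (the steepness constant c is the same one that appears in the formula):

```python
import numpy as np

def sigmoid(s, c=1.0):
    """Sigmoid excitation function of formula (1): f(s) = (1 + exp(-c*s))**-1."""
    return 1.0 / (1.0 + np.exp(-c * s))

# The response lies in (0, 1) and depends only on the current input s,
# not on any earlier inputs -- the limitation discussed above.
```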
To improve the excitation function's utilization of the information contained in the data, it is replaced by the hysteresis excitation function of formulas (2)-(3).
The excitation function described above is based on the Sigmoid excitation function of a traditional neural network, with two hysteresis parameters a and b introduced so that the function forms a hysteresis loop over (-∞, +∞) and the response of the neuron exhibits hysteresis characteristics.
The excitation function selects different response branches according to the trend of the current data relative to the historical data: when the input of the neuron is rising, the rising branch is selected for the excitation response, and when the input is falling, the falling branch is selected. When the input of the neuron is constant, the response value of the excitation function also remains constant. In this way, the output response of the neuron is related not only to the current input but also to the historical inputs and their trend. Depending on the input history, the same current input can yield different outputs. The hysteresis characteristic increases the neuron's retention of its original state, which reduces erroneous changes of the neuron state and improves the generalization performance of the neural network.
In addition, when the hysteresis parameter a is 0, the excitation function degenerates into the Sigmoid function, so the hysteresis neural network can be regarded as an extension of a conventional neural network and has more flexible control characteristics.
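Since the explicit formulas (2)-(3) are not reproduced in this text, the following Python sketch only illustrates the mechanism: the two branches are assumed here to be Sigmoids shifted by ±a, which satisfies the degeneration property above (a = 0 recovers the plain Sigmoid) but is not necessarily the patent's exact branch form:

```python
import numpy as np

def sigmoid(s, c=1.0):
    return 1.0 / (1.0 + np.exp(-c * s))

def hysteretic_excitation(s, s_prev, branch, a=0.005, c=1.0):
    """Hysteresis excitation sketch (assumed branch form, not formulas (2)-(3)).

    The response branch is selected from the input trend: a rising input uses
    the rising branch, a falling input uses the falling branch, and a constant
    input keeps the previous branch so the response stays constant.
    """
    if s > s_prev:
        branch = "up"
    elif s < s_prev:
        branch = "down"
    shift = -a if branch == "up" else a   # the two branches differ by 2a
    return sigmoid(s + shift, c), branch
```

With a = 0 both branches coincide with formula (1); with a > 0 the same current input produces different outputs depending on the input history, which is the hysteresis loop described above.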
Using this excitation function together with a feedforward network structure, a feedforward hysteresis neural network can be constructed.
Here X = [x1, x2, ..., xn] is the input vector of the neural network, θj is the threshold of the jth hidden-layer neuron, yk is the output of the kth output-layer neuron, w(1)ij is the connection weight from the ith input-layer neuron to the jth hidden-layer neuron, and w(2)jk is the connection weight from the jth hidden-layer neuron to the kth output-layer neuron.
In contrast to a conventional neural network, the excitation function of each neuron of the hysteresis network carries hysteresis parameters. To obtain good generalization performance, the hysteresis neural network must train the hysteresis parameters in addition to the weights, which increases the complexity of network training. Therefore a hybrid training method for the network parameters is proposed: the connection weights are trained by gradient descent, while the hysteresis parameters of the excitation functions are adjusted by a genetic algorithm. The specific training process is as follows:
step 1: network parameters are initialized. Randomly initializing a network weight in a range of (0, 1), randomly initializing a hysteresis parameter of an excitation function in a range of (0, 0.01), setting the maximum training time of the network as T, setting the total number of training samples as L, setting the serial number of the training samples as 1, and setting the current training time as 1.
Step 2: the ith training sample is input into the hysteresis network and the sum of the squares of the errors of the output values of the network and the actual values is calculated.
Step 3: and adjusting the network weight by adopting a gradient descent method according to the error of the ith sample. Wherein,
the learning rule for the weights w(2)jk between the hidden layer and the output layer is as follows:
Similarly, the learning rule for the weights w(1)ij between the input layer and the hidden layer is as follows:
Step 4: Set l = l + 1. If l ≤ L, return to Step 2; otherwise go to Step 5.
Step 5: and optimizing the hysteresis parameters of the network by adopting a genetic algorithm.
(1) Define the maximum number of evolutionary iterations, generate an initial population of lag parameters by real-number coding within the optimization range of the lag parameters, and define the reciprocal of the sum of squared errors over the training samples as the fitness function.
(2) Use fitness-proportionate selection, single-point crossover, and perturbation mutation as the selection, crossover, and mutation operations, and generate new individuals by these genetic operations.
(3) Evaluate the fitness of each individual and retain the individuals with high fitness to form a new population.
(4) Repeat the genetic operations until the given number of evolutionary iterations is reached, and take the lag parameters of the best individual as the current lag parameters of the network.
Step 6: Set l = 1 and t = t + 1. If t ≤ T, return to Step 2; otherwise, end the training.
Compared with a traditional neural network, the hysteresis neural network adds the training of hysteresis parameters, which enhances the adaptability of the neuron responses to the inputs and helps improve the generalization performance of the network.
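The six training steps above can be sketched as follows. This is a simplified, hypothetical implementation, not the patent's exact procedure: the hysteresis branch selection is stood in for by a single Sigmoid shifted by one scalar hysteresis parameter a, and single-point crossover is omitted because only one scalar is evolved (perturbation mutation plays its role); all function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(X, W1, W2, theta, a):
    # Stand-in for the hysteresis network: one shifted-Sigmoid hidden layer.
    H = 1.0 / (1.0 + np.exp(-(X @ W1 - theta - a)))
    return H @ W2, H                                   # linear output layer

def sse(X, Y, W1, W2, theta, a):
    out, _ = forward(X, W1, W2, theta, a)
    return float(((out - Y) ** 2).sum())

def train_hybrid(X, Y, n_hidden=5, T=10, lr=0.05, pop_size=20, gens=10):
    # Step 1: weights in (0, 1), hysteresis parameter in (0, 0.01)
    W1 = rng.uniform(0, 1, (X.shape[1], n_hidden))
    W2 = rng.uniform(0, 1, (n_hidden, 1))
    theta = rng.uniform(0, 1, n_hidden)
    a = rng.uniform(0, 0.01)
    for _ in range(T):                                 # Step 6: epoch loop
        for x, y in zip(X, Y):                         # Steps 2-4: per-sample gradient descent
            out, H = forward(x[None, :], W1, W2, theta, a)
            err = out - y
            gH = (err @ W2.T) * H * (1.0 - H)
            W2 -= lr * H.T @ err
            W1 -= lr * x[None, :].T @ gH
            theta += lr * gH.ravel()                   # net = x @ W1 - theta - a
        # Step 5: real-coded GA over the hysteresis parameter,
        # fitness = 1 / (sum of squared errors)
        population = rng.uniform(0, 0.01, pop_size)
        for _ in range(gens):
            fit = np.array([1.0 / (1e-12 + sse(X, Y, W1, W2, theta, v))
                            for v in population])
            parents = rng.choice(population, size=pop_size, p=fit / fit.sum())
            children = np.clip(parents + rng.normal(0.0, 1e-3, pop_size), 0.0, 0.01)
            merged = np.concatenate([population, children])
            fit2 = np.array([1.0 / (1e-12 + sse(X, Y, W1, W2, theta, v))
                             for v in merged])
            population = merged[np.argsort(fit2)[-pop_size:]]  # keep the fittest
        a = population[-1]                             # best individual (sorted ascending)
    return W1, W2, theta, a
```

The alternation mirrors the hybrid scheme: each epoch runs gradient descent over all L samples for the weights, then a short GA run re-optimizes the hysteresis parameter with the weights frozen.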
Examples
A BP neural network and the hysteresis neural network were used to perform predictive analysis on a wind-speed time series; the prediction results are shown in FIG. 1, where "Δ" marks the actual wind-speed data points and the two curves mark the prediction points of the BP neural network and of the hysteresis neural network of the invention. The average prediction error of the hysteresis neural network is 1.27 m/s, versus 1.50 m/s for the BP neural network, which shows that the method can effectively realize predictive analysis of the wind-speed time series.
Claims (1)
1. A time-series prediction method based on a hysteresis neural network, characterized in that, on the basis of a BP neural network model, hysteresis characteristics are introduced into the neural network by changing the traditional excitation function into a hysteresis excitation function, a feedforward neural network with hysteresis characteristics is constructed, and the connection weights and hysteresis parameters of the network are trained as follows:
Step 1: Initialize the network parameters: randomly initialize the network weights in the range (0, 1) and the hysteresis parameters of the excitation functions in the range (0, 0.01); set the maximum number of training epochs to T, the total number of training samples to L, the training-sample index l to 1, and the current epoch t to 1;
Step 2: Input the lth training sample into the hysteresis network, and calculate the sum of squared errors between the network output values and the actual values;
Step 3: Adjust the network weights by gradient descent according to the error of the lth sample, where
the learning rule for the weights w(2)jk between the hidden layer and the output layer is as follows:
Similarly, the learning rule for the weights w(1)ij between the input layer and the hidden layer is as follows:
Step 4: Set l = l + 1; if l ≤ L, return to Step 2; otherwise go to Step 5;
Step 5: Optimize the hysteresis parameters of the network by a genetic algorithm:
(1) define the maximum number of evolutionary iterations, generate an initial population of lag parameters by real-number coding within the optimization range of the lag parameters, and define the reciprocal of the sum of squared errors over the training samples as the fitness function;
(2) use fitness-proportionate selection, single-point crossover, and perturbation mutation as the selection, crossover, and mutation operations, and generate new individuals by these genetic operations;
(3) evaluate the fitness of each individual, and retain the individuals with high fitness to form a new population;
(4) repeat the genetic operations until the given number of evolutionary iterations is reached, and take the lag parameters of the best individual as the current lag parameters of the network;
Step 6: Set l = 1 and t = t + 1; if t ≤ T, return to Step 2; otherwise, end the training.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210079042XA CN102663493A (en) | 2012-03-23 | 2012-03-23 | Delaying nerve network used for time sequence prediction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102663493A true CN102663493A (en) | 2012-09-12 |
Family
ID=46772974
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210079042XA Pending CN102663493A (en) | 2012-03-23 | 2012-03-23 | Delaying nerve network used for time sequence prediction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102663493A (en) |
- 2012-03-23: application CN201210079042XA filed; status: Pending
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104424507A (en) * | 2013-08-28 | 2015-03-18 | 杨凤琴 | Prediction method and prediction device of echo state network |
CN104424507B (en) * | 2013-08-28 | 2020-03-03 | 杨凤琴 | Prediction method and prediction device of echo state network |
CN103605908A (en) * | 2013-12-03 | 2014-02-26 | 天津工业大学 | Wind speed sequence forecasting method based on Kalman filtering |
CN103853894A (en) * | 2014-03-26 | 2014-06-11 | 上海航天电子通讯设备研究所 | RBF (radial basis function) algorithm-based bait bullet delay casting time precision calculating method |
CN103853894B (en) * | 2014-03-26 | 2017-04-12 | 上海航天电子通讯设备研究所 | RBF (radial basis function) algorithm-based bait bullet delay casting time precision calculating method |
CN110352436A (en) * | 2017-03-01 | 2019-10-18 | 国际商业机器公司 | There is the sluggish resistance processing unit updated for neural metwork training |
CN110352436B (en) * | 2017-03-01 | 2023-05-16 | 国际商业机器公司 | Resistance processing unit with hysteresis update for neural network training |
CN109408896A (en) * | 2018-09-27 | 2019-03-01 | 华南师范大学 | A kind of anerobic sowage processing gas production multi-element intelligent method for real-time monitoring |
CN109408896B (en) * | 2018-09-27 | 2024-01-05 | 华南师范大学 | Multi-element intelligent real-time monitoring method for anaerobic sewage treatment gas production |
CN110070228A (en) * | 2019-04-25 | 2019-07-30 | 中国人民解放军国防科技大学 | BP neural network wind speed prediction method for neuron branch evolution |
CN110070228B (en) * | 2019-04-25 | 2021-06-15 | 中国人民解放军国防科技大学 | BP neural network wind speed prediction method for neuron branch evolution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 2012-09-12