Summary of the invention
The technical problem to be solved by the present invention is to provide a multidimensional time-series prediction method for total grain output based on an LSTM neural network. The method establishes a multidimensional time-series prediction model based on an LSTM neural network and uses the yield data of several agricultural products as the multiple dimension variables of the input data, achieving good prediction performance and high accuracy.
To solve the above technical problem, the technical solution adopted by the present invention is a multidimensional time-series prediction method for total grain output based on an LSTM neural network, comprising the following steps:
S01) Preprocess the input data. The yield data of several agricultural products over a specific time period are taken as the training set of the LSTM neural network. Before the LSTM neural network is trained, the training-set data are first normalized. Let the training set be the matrix X = (x_ij); after normalization it becomes the matrix X' with entries
x'_ij = (x_ij - min_j) / (max_j - min_j), where i = 1, 2, ..., n and j = 1, 2, ..., 24,
in which min_j is the minimum of the j-th column of X, max_j is the maximum of the j-th column of X, and max_j - min_j is the range of the j-th column. After this transformation, every agricultural yield value lies between 0 and 1, and normalization is complete.
S02) Design the LSTM neural network architecture. The architecture comprises, in order, an input layer, an LSTM layer, a Dropout layer, a second LSTM layer, a second Dropout layer, a Dense layer, and an output layer. The preprocessed data enter through the input layer, pass in turn through the LSTM, Dropout, LSTM, Dropout, and Dense layers, and the prediction result is finally output. The LSTM layers compute on the input data with the LSTM function. The Dropout layers prevent the trained model from overfitting by temporarily and randomly disconnecting a specified number of neurons whenever parameters are updated during training. The Dense layer is a fully connected layer whose operation is output = activation(dot(input, weight) + bias), where activation is the activation function, dot is matrix multiplication, input is the Dense layer's input (the value obtained after the input-layer data have passed in turn through the LSTM, Dropout, LSTM, and Dropout layers), weight is the weight matrix, bias is the bias, and output is the final prediction result.
Further, the LSTM layers process sequence data in a cumulative, linear form. The internal computation of an LSTM layer is:
f_t = σ(W_f · [h_{t-1}, X_t] + b_f)   (1),
i_t = σ(W_i · [h_{t-1}, X_t] + b_i)   (2),
c̃_t = tanh(W_c · [h_{t-1}, X_t] + b_c)   (3),
c_t = f_t · c_{t-1} + i_t · c̃_t   (4),
o_t = σ(W_o · [h_{t-1}, X_t] + b_o)   (5),
h_t = o_t · tanh(c_t)   (6),
where f_t denotes the forget gate, σ the sigmoid function, W_f the weight matrix of the forget gate, h_{t-1} the output of the previous time step, X_t the input at the current time step, and [h_{t-1}, X_t] the concatenation of h_{t-1} and X_t into one long horizontal vector; b_f is the bias of the forget gate; i_t is the input gate, W_i the weight matrix of the input gate, and b_i the bias of the input gate; c̃_t is the candidate cell state of the current input, W_c its weight matrix, and b_c its bias; c_t is the cell state at the current time step, obtained by multiplying the forget gate f_t by the previous cell state c_{t-1} and adding the input gate i_t multiplied by the candidate state c̃_t; o_t is the output gate, W_o the weight matrix of the output gate, and b_o the bias of the output gate; h_t is the final output, obtained by multiplying the output gate o_t by the tanh of the current cell state c_t.
Further, the activation function is the ReLU function.
Further, the input data of an LSTM layer is a 3D tensor of shape (samples, timesteps, input_dim), where samples is the number of samples, timesteps is the length of the time window, and input_dim is the dimensionality of the input data.
Further, when the training set is taken, the yield data of several agricultural products over the past M years, with the year as the time unit, are used as the training set of the LSTM neural network, M being a positive integer.
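As a minimal sketch (not part of the claimed method itself), the (samples, timesteps, input_dim) tensor can be built from M years of multi-crop yield data with a sliding window; the window length and the toy array below are illustrative assumptions:

```python
import numpy as np

def make_windows(data, timesteps):
    """Slice a (years, crops) array into overlapping windows of shape
    (samples, timesteps, input_dim); each window is paired with the next
    year's total-grain value, assumed here to sit in the last column."""
    X, y = [], []
    for i in range(len(data) - timesteps):
        X.append(data[i:i + timesteps])    # timesteps consecutive years
        y.append(data[i + timesteps, -1])  # total grain of the following year
    return np.array(X), np.array(y)

# toy data: 10 years x 5 crops (soybean, wheat, corn, paddy, total grain)
series = np.arange(50, dtype=float).reshape(10, 5)
X, y = make_windows(series, timesteps=3)
print(X.shape, y.shape)  # (7, 3, 5) (7,)
```

With 10 years and a 3-year window, 7 training samples result, matching the (samples, timesteps, input_dim) convention described above.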
Beneficial effects of the present invention: based on an LSTM neural network, the present invention uses the yield data of several agricultural products as the multiple dimension variables of the LSTM network's input data and establishes a multidimensional time-series prediction model based on the LSTM neural network. Because the yields of several agricultural products are taken into account, and stacked LSTM layers process the sequence data while avoiding the vanishing-gradient problem, the prediction performance is good, the accuracy is high, and the applicability is wide.
Specific embodiment
The present invention is further illustrated below with reference to the drawings and a specific embodiment.
This embodiment discloses a multidimensional time-series prediction method for total grain output based on an LSTM neural network, comprising the following steps:
S01) Preprocessing of the input data
In engineering practice, the data we obtain are often incomplete, noisy, inconsistent, or duplicated, while in deep-learning algorithms the quality of the input data largely determines the quality of the trained model. Data must therefore be preprocessed before being used to train a model. The preprocessing procedure is usually not fixed; it varies with the task and with the properties of the dataset.
In deep learning, before a model is trained on a training set, the data usually need a normalization preprocessing step, mainly to confine the data to the range [0, 1]; this speeds up convergence during training and makes the data easier to handle. In this embodiment the training set is likewise normalized according to its features. There are many ways to normalize; the formula used in this embodiment is as follows.
Let the dataset be the matrix X = (x_ij); after normalization it becomes the matrix X' with entries
x'_ij = (x_ij - min_j) / (max_j - min_j), where i = 1, 2, ..., n and j = 1, 2, ..., 24,
in which min_j is the minimum of the j-th column of X, max_j is the maximum of the j-th column of X, and max_j - min_j is the range of the j-th column. After this transformation, every agricultural yield value lies between 0 and 1, and normalization is complete.
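A minimal NumPy sketch of this column-wise min-max normalization (the toy two-column matrix is illustrative, not data from the embodiment):

```python
import numpy as np

def min_max_normalize(X):
    """Column-wise min-max scaling: x' = (x - col_min) / (col_max - col_min).
    Maps every yield column into [0, 1], as in step S01. The column minima
    and ranges are returned so predictions can later be denormalized."""
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    return (X - col_min) / col_range, col_min, col_range

X = np.array([[100.0, 3000.0],
              [150.0, 3500.0],
              [200.0, 4000.0]])
X_norm, col_min, col_range = min_max_normalize(X)
print(X_norm)
# [[0.  0. ]
#  [0.5 0.5]
#  [1.  1. ]]
```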
S02) Design of the LSTM neural network architecture
In training a deep-learning model, the design of the network architecture plays a crucial role: it determines the quality of the model and directly affects the prediction results. This embodiment uses the Keras framework for rapid prototyping of the deep-learning model, because Keras offers a variety of network components for the user to choose from; a user can design a new network from existing building blocks according to their own needs, and building a network requires little code, so design is fast.
A stacked LSTM network architecture is chosen in this embodiment to predict total grain output. As shown in Fig. 2, the architecture comprises an input layer, an LSTM layer, a Dropout layer, a second LSTM layer, a second Dropout layer, a Dense layer, and an output layer. The preprocessed data enter through the input layer, pass in turn through the LSTM, Dropout, LSTM, Dropout, and Dense layers, and the prediction result is finally output. The LSTM layers compute on the input data with the LSTM function. The Dropout layers prevent the trained model from overfitting by temporarily and randomly disconnecting a specified number of neurons whenever parameters are updated during training. The Dense layer is a fully connected layer whose operation is output = activation(dot(input, weight) + bias), where activation is the activation function, dot is matrix multiplication, input is the Dense layer's input (the value obtained after the input data have passed in turn through the LSTM, Dropout, LSTM, and Dropout layers), weight is the weight matrix, bias is the bias, and output is the final prediction result; dot(input, weight) denotes multiplying the input data by the weight matrix.
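The Dense-layer operation output = activation(dot(input, weight) + bias) can be sketched directly in NumPy; the small weight matrix and input vector below are illustrative assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def dense(inputs, weight, bias, activation=relu):
    """Fully connected layer: activation(dot(input, weight) + bias),
    as described for the Dense layer above."""
    return activation(np.dot(inputs, weight) + bias)

h = np.array([[0.2, -0.4, 0.6]])       # e.g. output of the last Dropout layer
W = np.array([[1.0], [0.5], [0.25]])   # weight matrix: 3 units -> 1 output
b = np.array([0.1])                    # bias
print(dense(h, W, b))                  # [[0.25]]
```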
In this embodiment, the input data of an LSTM layer is a 3D tensor of shape (samples, timesteps, input_dim), where samples is the number of samples, timesteps is the length of the time window, and input_dim is the dimensionality of the input data. The activation function is the ReLU function.
In this embodiment, the LSTM network architecture stacks two LSTMs: it contains two LSTM layers, each followed by a Dropout layer, and each Dropout layer temporarily and randomly disconnects a specified number of neurons of the preceding LSTM layer whenever parameters are updated during training, so as to prevent the trained model from overfitting.
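Using the Keras framework named above, the stacked architecture (LSTM, Dropout, LSTM, Dropout, Dense) might be expressed as follows; the unit counts, dropout rate, and optimizer are illustrative assumptions, not values fixed by the embodiment:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

def build_model(timesteps, input_dim):
    """Stacked LSTM sketch: input -> LSTM -> Dropout -> LSTM -> Dropout
    -> Dense -> one predicted total-grain value."""
    model = Sequential([
        LSTM(64, return_sequences=True,        # first LSTM feeds sequences
             input_shape=(timesteps, input_dim)),
        Dropout(0.2),                          # randomly drops units per update
        LSTM(64),                              # second stacked LSTM layer
        Dropout(0.2),
        Dense(1),                              # fully connected output layer
    ])
    model.compile(loss="mse", optimizer="adam")
    return model

model = build_model(timesteps=3, input_dim=5)
model.summary()
```

The first LSTM layer must return full sequences (return_sequences=True) so the second LSTM layer receives a 3D tensor.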
Fig. 1 is a schematic diagram of the LSTM gates: X denotes the input data, h the output data, and C the value of the cell state. The content of the cell state C is controlled mainly by the forget gate and the input gate. The cell state at the current time step is composed of part of the previous cell state C_{t-1} and part of the current input X_t: how much of the previous cell state is retained is controlled by the forget gate, and how much of the current input is admitted is controlled by the input gate. h_t denotes the current output value; it is a portion of the cell-state content, and its size is controlled by the output gate. The computation of an LSTM layer is as follows:
f_t = σ(W_f · [h_{t-1}, X_t] + b_f)   (1),
i_t = σ(W_i · [h_{t-1}, X_t] + b_i)   (2),
c̃_t = tanh(W_c · [h_{t-1}, X_t] + b_c)   (3),
c_t = f_t · c_{t-1} + i_t · c̃_t   (4),
o_t = σ(W_o · [h_{t-1}, X_t] + b_o)   (5),
h_t = o_t · tanh(c_t)   (6),
where f_t denotes the forget gate, σ the sigmoid function, W_f the weight matrix of the forget gate, h_{t-1} the output of the previous time step, X_t the input at the current time step, and [h_{t-1}, X_t] the concatenation of h_{t-1} and X_t into one long horizontal vector; b_f is the bias of the forget gate; i_t is the input gate, W_i the weight matrix of the input gate, and b_i the bias of the input gate; c̃_t is the candidate cell state of the current input, W_c its weight matrix, and b_c its bias; c_t is the cell state at the current time step, obtained by multiplying the forget gate f_t by the previous cell state c_{t-1} and adding the input gate i_t multiplied by the candidate state c̃_t; o_t is the output gate, W_o the weight matrix of the output gate, and b_o the bias of the output gate; h_t is the final output, obtained by multiplying the output gate o_t by the tanh of the current cell state c_t.
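A single time step of equations (1)-(6) can be sketched directly in NumPy; the dimensions and random weights below are purely illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    """One LSTM time step following equations (1)-(6); every gate sees the
    concatenated vector [h_{t-1}, X_t]."""
    z = np.concatenate([h_prev, x_t])   # [h_{t-1}, X_t]
    f_t = sigmoid(W_f @ z + b_f)        # forget gate, eq. (1)
    i_t = sigmoid(W_i @ z + b_i)        # input gate, eq. (2)
    c_tilde = np.tanh(W_c @ z + b_c)    # candidate cell state, eq. (3)
    c_t = f_t * c_prev + i_t * c_tilde  # new cell state, eq. (4)
    o_t = sigmoid(W_o @ z + b_o)        # output gate, eq. (5)
    h_t = o_t * np.tanh(c_t)            # hidden output, eq. (6)
    return h_t, c_t

# toy dimensions: hidden size 2, input size 3
rng = np.random.default_rng(0)
hidden, inp = 2, 3
Ws = [rng.standard_normal((hidden, hidden + inp)) for _ in range(4)]
bs = [np.zeros(hidden) for _ in range(4)]
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_step(rng.standard_normal(inp), h, c,
                 Ws[0], bs[0], Ws[1], bs[1], Ws[2], bs[2], Ws[3], bs[3])
print(h.shape, c.shape)  # (2,) (2,)
```

Since h_t = o_t · tanh(c_t) with o_t in (0, 1), every component of the output stays strictly inside (-1, 1).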
As the training computation of the LSTM shows, the LSTM layers process sequence data in a cumulative, linear form, which avoids the vanishing-gradient problem and also lets the network capture longer-range contextual information.
In this embodiment, when the training set is taken, the yield data of several grain crops over the past M years, with the year as the time unit, are used as the training set of the LSTM neural network, M being a positive integer.
The prediction performance and accuracy of this prediction method are analyzed below through specific experiments.
Experiment 1: multidimensional time-series prediction based on an LSTM neural network
In this experiment, total grain output is predicted from multidimensional variables using an LSTM neural network: the yield data of soybean, wheat, corn, paddy, and total grain serve as the multiple dimension variables of the input data, and future total grain output is predicted from past historical data. The yield data of these agricultural products (soybean, wheat, corn, paddy, total grain) from 1949 to 1999 are chosen as the training dataset, and the agricultural yield data from 2000 to 2005 as the test dataset. The format of part of the data is shown in Table 1.
Table 1: partial yield data for soybean, wheat, corn, paddy, and total grain
In this experiment, the mean squared error (MSE) is used as the loss function to measure model performance. Fig. 3 shows the test- and training-loss curves during training of the multidimensional time-series LSTM model, i.e., the loss of the model on the test data and the training data at the end of each iteration. After training, the final MSE of the model on the test dataset is 0.00914, a high accuracy.
The trained model is then used to predict the test data for 2000 to 2005, the predictions are denormalized, and the restored predictions are compared with the true data in Fig. 4. As Fig. 4 shows, the multidimensional time-series prediction model based on the LSTM neural network fits the test data well; the predicted data, the true data, and the prediction errors are listed in Table 2.
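Denormalization simply inverts the min-max transformation of step S01; a minimal sketch, where the column minimum and range are illustrative values:

```python
import numpy as np

def denormalize(x_norm, col_min, col_range):
    """Invert the min-max step: x = x' * (max - min) + min, used to map
    normalized predictions back to real yield units before comparison."""
    return x_norm * col_range + col_min

# example: a normalized prediction of 0.42 for a column spanning 3000..4000
print(denormalize(np.array([0.42]), 3000.0, 1000.0))  # [3420.]
```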
Table 2: comparison of true and predicted total grain output, 2000-2005

| Time | True value | Predicted value | Error rate |
| ---- | ---------- | --------------- | ---------- |
| 2000 | 3837.7 | 3908.106 | 1.835% |
| 2001 | 3720.6 | 3657.555 | 1.694% |
| 2002 | 3292.7 | 3378.649 | 2.610% |
| 2003 | 3435.5 | 3402.470 | 0.961% |
| 2004 | 3516.7 | 3463.012 | 1.527% |
| 2005 | 3917.4 | 3985.137 | 1.729% |
As Table 2 shows, the error of the multidimensional time-series prediction model based on the LSTM neural network is very small: for most of the data the error is below 2%, so the prediction accuracy is high.
Experiment 2: one-dimensional time-series prediction based on an LSTM neural network
One-dimensional time-series prediction based on an LSTM neural network means predicting the next year's grain output from historical grain-output data alone. Its biggest difference from multidimensional time-series prediction is that only the total-grain-output data (one-dimensional data) are used for training and testing. In this experiment, the total grain output from 1949 to 1999 is chosen as the training dataset and that from 2000 to 2005 as the test dataset. The mean squared error (MSE) is used as the loss function to measure model performance. The test- and training-loss curves during training of the one-dimensional LSTM time-series model are shown in Fig. 5, which plots the loss on the training and test data over 100 iterations. After training, the trained model is used to predict the test data for 2000 to 2005, the predictions are denormalized, and the restored predictions are compared with the true data in Fig. 6. The predicted data, the true data, and the prediction errors are shown in Table 3.
Table 3: comparison of true and predicted total grain output, 2000-2005
Experiment 3: grain-yield prediction based on a BP neural network
A BP neural network is a network whose model training computes the error according to the back-propagation algorithm. It mainly simulates the human brain: during training, only the input and output data need be given, without describing the mapping between the variables, and the network automatically stores and learns the input-output data, giving it a strong ability to fit nonlinear systems. Here a BP neural network with one input layer, three hidden layers, and one output layer is used to predict total grain output. In the experiment, the grain-yield data from 1949 to 1999 are chosen as the training dataset and those from 2000 to 2005 as the test dataset. The mean squared error (MSE) is used as the loss function to measure model performance. Fig. 7 shows the test- and training-loss curves during training of the BP neural network model.
The trained model is used to predict the test data for 2000 to 2005; the comparison of true and predicted data is shown in Fig. 8, and the predicted data, the true data, and the prediction errors are listed in Table 4.
Table 4: comparison of true and predicted total grain output, 2000-2005

| Time | True value | Predicted value | Error rate |
| ---- | ---------- | --------------- | ---------- |
| 2000 | 3837.7 | 4060.180 | 5.797% |
| 2001 | 3720.6 | 3905.564 | 4.971% |
| 2002 | 3292.7 | 3445.498 | 4.641% |
| 2003 | 3435.5 | 3556.124 | 3.511% |
| 2004 | 3516.7 | 3476.882 | 1.132% |
| 2005 | 3917.4 | 4112.364 | 4.977% |
Above, the multidimensional time-series prediction model based on an LSTM neural network, the one-dimensional time-series prediction model based on an LSTM neural network, and the BP neural network model have each been used to predict total grain output for 2000 to 2005. The experiments show that the prediction error of the multidimensional LSTM model is the smallest and that of the BP neural network model is the largest. For a more intuitive comparison of the three methods, Table 5 gives the prediction error of each method computed as the mean absolute percentage error (MAPE).
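The MAPE figure for the multidimensional model can be checked directly from the values reported in Table 2:

```python
# Recompute the per-year error rates and the MAPE of the multidimensional
# model from the true/predicted values reported in Table 2.
true = [3837.7, 3720.6, 3292.7, 3435.5, 3516.7, 3917.4]
pred = [3908.106, 3657.555, 3378.649, 3402.470, 3463.012, 3985.137]
errors = [abs(p - t) / t for t, p in zip(true, pred)]
mape = 100 * sum(errors) / len(errors)
print(f"{mape:.3f}%")  # 1.726%
```

The recomputed value agrees with the 1.726% reported for the multidimensional method in Table 5.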
Table 5: MAPE comparison of the three prediction methods

| Method | MAPE |
| ------ | ---- |
| Multidimensional time-series prediction based on LSTM neural network | 1.726% |
| One-dimensional time-series prediction based on LSTM neural network | 3.973% |
| Grain-yield prediction based on BP neural network | 4.172% |
These results show that the mean absolute percentage error of the multidimensional time-series prediction method based on the LSTM neural network is far smaller than those of the one-dimensional time-series prediction method and the BP neural network prediction method, so a good prediction performance is achieved.
The above describes only the basic principle and a preferred embodiment of the present invention; improvements and substitutions made by those skilled in the art according to the present invention fall within the protection scope of the present invention.