CN115359310B

CN115359310B - SIC prediction method and system based on ConvLSTM and conditional random field

Info

Publication number: CN115359310B
Application number: CN202210798362.4A
Authority: CN
Inventors: 张辉; 任开军; 李小勇; 任小丽; 谭家明; 蓝云杰; 柴星宇; 徐青
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2022-07-08
Filing date: 2022-07-08
Publication date: 2023-09-01
Anticipated expiration: 2042-07-08
Also published as: CN115359310A

Abstract

The invention discloses a SIC prediction method and system based on ConvLSTM and a conditional random field. The method comprises: obtaining historical SIC data at a plurality of consecutive moments; normalizing the historical SIC data; splicing the normalized historical SIC data into time steps through a sliding time window to generate a sequence data sample set; and inputting the sequence data sample set into a preset prediction model to obtain the SIC prediction result for the next moment output by the prediction model, wherein the prediction model is a ConvLSTM network model combined with a fully connected conditional random field. The prediction result provided by the ConvLSTM is used as the prior probability of the fully connected conditional random field to obtain the SIC prediction result for the next moment, thereby improving both the overall SIC prediction accuracy and the prediction accuracy in boundary regions.

Description

SIC prediction method and system based on ConvLSTM and conditional random field

Technical Field

The invention relates to the technical field of SIC prediction, in particular to a SIC prediction method and system based on ConvLSTM and a conditional random field.

Background

In recent years, with global warming, Arctic sea ice coverage has gradually decreased, and the Arctic region has received increasing attention. Changes in sea ice concentration directly reflect changes in sea ice coverage. Predicting sea ice concentration (SIC) is important for understanding the effects of climate change, for resource exploitation, and for new Arctic planning.

Traditional sea ice prediction methods mainly rely on statistical models or numerical models. A statistical model is data-driven: it fits the pattern of sea ice change from historical observation data. The observation data mainly include meteorological elements (e.g., temperature, sea level pressure), marine elements (e.g., sea surface temperature, salinity), and sea ice elements (e.g., concentration, extent, ice type, and thickness). However, statistical methods cannot take into account the interactions between sea ice and the atmosphere, and can only build point-by-point models that ignore interactions between neighboring points.

In recent years, machine learning methods have begun to be applied to sea ice prediction. For example, two deep learning methods, the multilayer perceptron (MLP) and the long short-term memory network (LSTM), have been used for medium-term prediction of Arctic SIC, with better results than the traditional autoregressive model. However, LSTM considers only time-series information and cannot process spatial information. Because LSTM must flatten two-dimensional spatial data into one-dimensional data for computation, the original spatial structure is difficult to preserve, and prediction is poor when the sea ice coverage changes sharply in the melting and freezing seasons. Convolution, by contrast, captures spatial correlation well, and a convolutional neural network (CNN) has been used to predict the next month's SIC from the current month's SIC. However, CNN has an obvious defect: the SIC at time t depends entirely on the SIC at time t-1, so information from earlier moments cannot be extracted.

Disclosure of Invention

The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the invention provides a SIC prediction method and system based on ConvLSTM and conditional random fields, which improve both the overall SIC prediction accuracy and the prediction accuracy in boundary regions by introducing a conditional random field into ConvLSTM.

In a first aspect of the present invention, there is provided a SIC prediction method based on ConvLSTM and conditional random fields, comprising the steps of:

acquiring historical SIC data at a plurality of continuous moments, carrying out normalization processing on the historical SIC data, and carrying out time step splicing on the processed historical SIC data through a time sliding window to generate a sequence data sample set;

and inputting the sequence data sample set into a preset prediction model to obtain a SIC prediction result of the next moment output by the prediction model, wherein the prediction model is a ConvLSTM network model combined with a full-connection conditional random field.

According to the embodiment of the invention, at least the following technical effects are achieved:

According to the method, historical SIC data at a plurality of consecutive moments are obtained and normalized, the normalized historical SIC data are spliced into time steps through a sliding time window to generate a sequence data sample set, and the sequence data sample set is input into a preset prediction model to obtain the SIC prediction result for the next moment output by the prediction model. The prediction model is a ConvLSTM network model combined with a fully connected conditional random field: the prediction result provided by the ConvLSTM is used as the prior probability of the fully connected conditional random field to obtain the SIC prediction result for the next moment, which improves both the overall SIC prediction accuracy and the prediction accuracy in boundary regions.

According to some embodiments of the invention, the internal structural formula of the ConvLSTM network model includes:

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$$

$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o)$$

$$H_t = o_t \odot \tanh(C_t)$$

wherein $\sigma$ is the gate activation function, $*$ is the convolution operator, $\odot$ is the element-wise (Hadamard) product, and $\tanh$ is the hyperbolic tangent function; $i_t$ is the input gate at time t, $f_t$ is the forget gate at time t, $C_t$ is the memory cell at time t, $o_t$ is the output gate at time t, $X_t$ is the two-dimensional image at time t, and $H_t$ is the network predicted value (hidden state) at time t; $W_{xi}$, $W_{hi}$ and $W_{ci}$ are the weight matrices of the input gate applied to the two-dimensional image $X_t$, the hidden state $H_{t-1}$ and the memory cell $C_{t-1}$, respectively, and $b_i$ is the bias of the input gate; $W_{xf}$, $W_{hf}$ and $W_{cf}$ are the weight matrices of the forget gate applied to $X_t$, $H_{t-1}$ and $C_{t-1}$, respectively, and $b_f$ is the bias of the forget gate; $W_{xc}$ and $W_{hc}$ are the weight matrices of the memory cell applied to $X_t$ and $H_{t-1}$, respectively, and $b_c$ is the bias of the memory cell; $W_{xo}$, $W_{ho}$ and $W_{co}$ are the weight matrices of the output gate applied to $X_t$, $H_{t-1}$ and $C_t$, respectively, and $b_o$ is the bias of the output gate.
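The gate equations above can be illustrated with a minimal single-channel NumPy sketch. It assumes a sigmoid for the gate activation, odd-sized kernels with zero padding, and randomly initialized weights; the helper names `conv2d_same`, `convlstm_step` and `init_params` are hypothetical and are not taken from the patent.

```python
import numpy as np

def conv2d_same(x, k):
    """Zero-padded 'same' 2-D cross-correlation (the deep-learning 'convolution')."""
    kh, kw = k.shape                      # odd kernel sizes are assumed
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # constant zero padding
    out = np.zeros(x.shape, dtype=float)
    for di in range(kh):
        for dj in range(kw):
            out += k[di, dj] * xp[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(X_t, H_prev, C_prev, p):
    """One ConvLSTM step; p maps names such as 'W_xi' to arrays (assumed layout)."""
    conv = conv2d_same
    i_t = sigmoid(conv(X_t, p["W_xi"]) + conv(H_prev, p["W_hi"]) + p["W_ci"] * C_prev + p["b_i"])
    f_t = sigmoid(conv(X_t, p["W_xf"]) + conv(H_prev, p["W_hf"]) + p["W_cf"] * C_prev + p["b_f"])
    C_t = f_t * C_prev + i_t * np.tanh(conv(X_t, p["W_xc"]) + conv(H_prev, p["W_hc"]) + p["b_c"])
    o_t = sigmoid(conv(X_t, p["W_xo"]) + conv(H_prev, p["W_ho"]) + p["W_co"] * C_t + p["b_o"])
    H_t = o_t * np.tanh(C_t)
    return H_t, C_t

def init_params(shape, ksize=3, seed=0):
    """Random illustrative weights: kernels for X and H, peephole maps for C."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifco":
        p[f"W_x{g}"] = rng.normal(0.0, 0.1, (ksize, ksize))
        p[f"W_h{g}"] = rng.normal(0.0, 0.1, (ksize, ksize))
        p[f"b_{g}"] = 0.0
    for g in "ifo":
        p[f"W_c{g}"] = rng.normal(0.0, 0.1, shape)
    return p

# Feed a 12-step sequence of random 8x8 frames through the cell.
frames = np.random.default_rng(1).random((12, 8, 8))
H, C = np.zeros((8, 8)), np.zeros((8, 8))
params = init_params((8, 8))
for X_t in frames:
    H, C = convlstm_step(X_t, H, C, params)
```

Because the inputs and states stay two-dimensional throughout, the spatial layout of the SIC field is preserved from step to step, which is the property the description contrasts with a plain LSTM.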

According to some embodiments of the present invention, the inputting the sequence data sample set into a preset prediction model to obtain a SIC prediction result at a next moment output by the prediction model includes:

inputting the sequence data sample set into a ConvLSTM network model in the preset prediction model to obtain a first SIC prediction result, wherein the first SIC prediction result is the SIC prediction result of the next moment output by the ConvLSTM network model in the prediction model;

obtaining, according to the first SIC prediction result, a pixel set $X = \{x_1, x_2, \dots, x_n\}$, a label set $L = \{L_1, L_2, \dots, L_k\}$ of label categories, feature vectors $F = \{f_1, f_2, \dots, f_n\}$, color feature vectors $I = \{I_1, I_2, \dots, I_n\}$, position feature vectors $P = \{p_1, p_2, \dots, p_n\}$, and the probability $P(x_i)$ that pixel i is classified as label $x_i$, wherein $x_n$ is the classification label of pixel n, $L_k$ is the classification label of the k-th category in the label set, $p_n$ is the position feature vector of pixel n, $I_n$ is the color feature vector of pixel n, and $f_n$ is the feature vector of pixel n, which comprises the position feature vector and the color feature vector of pixel n;

the fully connected conditional random field satisfies the Gibbs distribution, with the probability function:

$$P(X \mid I) = \frac{1}{Z(I)} \exp\bigl(-E(X \mid I)\bigr)$$

wherein $Z(I)$ is a normalization factor and $E(X \mid I)$ is an energy function;

defining an energy function on a full graph basis as:

$$E(X \mid I) = \sum_i \psi_u(x_i) + \sum_{i,j} \psi_p(x_i, x_j)$$

$$\psi_u(x_i) = -\log P(x_i)$$

wherein $\psi_u(x_i)$ is the unary potential function, $\psi_p(x_i, x_j)$ is the binary (pairwise) potential function, and $\mu(x_i, x_j)$ is the label comparison function, with $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $\mu(x_i, x_j) = 0$ if $x_i = x_j$; $w^{m}$ is the weight corresponding to the Gaussian kernel function of the m-th category in the label set, $m \in [1, k]$; $k^{m}(f_i, f_j)$ is the Gaussian kernel function of the m-th category in the label set; $f_i$ and $f_j$ are the feature vectors of pixels i and j; $p_i$ and $p_j$ are the position feature vectors of pixels i and j; $I_i$ and $I_j$ are the color feature vectors of pixels i and j; $\theta_\alpha$ is the scale controlling the position information, $\theta_\beta$ is the scale controlling the color information, and $\theta_\gamma$ is the scale controlling the position information;

iteratively updating $Q(x)$ using a mean field algorithm so that the K-L divergence between the probability function $P(x)$ and $Q(x)$ reaches a predetermined value, and then using $Q_i(x_i)$ in place of $P(x_i)$ to obtain the SIC prediction result of the next moment output by the prediction model, wherein the calculation steps are as follows:

$$Q(x) = \prod_i Q_i(x_i)$$

wherein $Q_i(x_i)$ is the probability that pixel i takes classification label $x_i$, $l$ is the classification label of pixel i, $l'$ is the predicted classification label of pixel i, $Z_i$ is a normalization factor, and $\mu(l, l')$ is the label comparison function.
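The per-pixel update equation is not reproduced in this text. The following is a minimal sketch under stated assumptions: it uses the update commonly used for fully connected conditional random fields, in which each $Q_i(l)$ is proportional to $\exp(-\psi_u(l) - \sum_{l'} \mu(l, l') \sum_{j \neq i} k(f_i, f_j) Q_j(l'))$, with a two-term Gaussian kernel (position plus color, and position only). The function name `dense_crf_mean_field`, the weights `w_app` and `w_smooth`, the default scales, the fixed iteration count and the use of discrete labels are illustrative assumptions. The dense N × N kernel is practical only for small grids.

```python
import numpy as np

def dense_crf_mean_field(prob, color, w_app=1.0, w_smooth=1.0,
                         theta_alpha=3.0, theta_beta=0.1, theta_gamma=1.0,
                         n_iter=5):
    """prob: (H, W, K) per-pixel label probabilities P(x_i) from the first stage.
    color: (H, W) or (H, W, C) color features I_i. Returns Q with shape (H, W, K)."""
    H, W, K = prob.shape
    N = H * W
    ys, xs = np.mgrid[0:H, 0:W]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # p_i
    col = np.asarray(color, dtype=float).reshape(N, -1)             # I_i
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)      # |p_i - p_j|^2
    d_col = ((col[:, None, :] - col[None, :, :]) ** 2).sum(-1)      # |I_i - I_j|^2
    kernel = (w_app * np.exp(-d_pos / (2 * theta_alpha ** 2)
                             - d_col / (2 * theta_beta ** 2))
              + w_smooth * np.exp(-d_pos / (2 * theta_gamma ** 2)))
    np.fill_diagonal(kernel, 0.0)             # the pairwise sum excludes j == i
    p_flat = np.clip(prob.reshape(N, K), 1e-8, 1.0)
    unary = -np.log(p_flat)                   # psi_u(x_i) = -log P(x_i)
    mu = 1.0 - np.eye(K)                      # mu(l, l') = 1 if l != l', else 0
    Q = p_flat / p_flat.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        msg = kernel @ Q                      # sum_j k(f_i, f_j) Q_j(l')
        energy = unary + msg @ mu.T           # add sum_l' mu(l, l') * msg(l')
        energy -= energy.min(axis=1, keepdims=True)  # numerical stability
        Q = np.exp(-energy)
        Q /= Q.sum(axis=1, keepdims=True)     # division by Z_i
    return Q.reshape(H, W, K)

# Toy example: two regions (left = label 0, right = label 1) with one
# pixel whose first-stage probabilities disagree with its neighbours.
color = np.zeros((6, 6))
color[:, 3:] = 1.0
prob = np.zeros((6, 6, 2))
prob[..., 0] = np.where(color == 0, 0.9, 0.1)
prob[..., 1] = 1.0 - prob[..., 0]
prob[2, 1] = [0.4, 0.6]                       # inconsistent pixel inside the left region
Q = dense_crf_mean_field(prob, color)
labels = Q.argmax(axis=-1)
```

In this toy case the inconsistent pixel is pulled back to the label of its surrounding region, while the boundary between the two regions stays where the color features change.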

According to some embodiments of the invention, the ConvLSTM network model comprises 4 ConvLSTM cells, and the parameter return_sequences of the 4th ConvLSTM cell is set to False.

In a second aspect of the present invention, there is provided a SIC prediction system based on ConvLSTM and conditional random fields, the SIC prediction system based on ConvLSTM and conditional random fields comprising:

the data acquisition module is used for acquiring historical SIC data at a plurality of continuous moments, carrying out normalization processing on the historical SIC data, and carrying out time-step splicing on the processed historical SIC data through a time sliding window to generate a sequence data sample set;

the data output module is used for inputting the sequence data sample set into a preset prediction model to obtain a SIC prediction result of the next moment output by the prediction model, wherein the prediction model is a ConvLSTM network model combined with a full-connection conditional random field.

The system obtains historical SIC data at a plurality of consecutive moments, normalizes the historical SIC data, splices the normalized historical SIC data into time steps through a sliding time window to generate a sequence data sample set, and inputs the sequence data sample set into a preset prediction model to obtain the SIC prediction result for the next moment output by the prediction model. The prediction model is a ConvLSTM network model combined with a fully connected conditional random field: the prediction result provided by the ConvLSTM is used as the prior probability of the fully connected conditional random field to obtain the SIC prediction result for the next moment, which improves both the overall SIC prediction accuracy and the prediction accuracy in boundary regions.

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$$

$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o)$$

$$H_t = o_t \odot \tanh(C_t)$$

According to some embodiments of the invention, the data output module comprises:

the first prediction module is used for inputting the sequence data sample set into a ConvLSTM network model in the preset prediction model to obtain a first SIC prediction result, wherein the first SIC prediction result is a SIC prediction result of the next moment output by the ConvLSTM network model in the prediction model;

the data generation module is used for generating, according to the first SIC prediction result, a pixel set $X = \{x_1, x_2, \dots, x_n\}$, a label set $L = \{L_1, L_2, \dots, L_k\}$ of label categories, feature vectors $F = \{f_1, f_2, \dots, f_n\}$, color feature vectors $I = \{I_1, I_2, \dots, I_n\}$, position feature vectors $P = \{p_1, p_2, \dots, p_n\}$, and the probability $P(x_i)$ that pixel i is classified as label $x_i$, wherein $x_n$ is the classification label of pixel n, $L_k$ is the classification label of the k-th category in the label set, $p_n$ is the position feature vector of pixel n, $I_n$ is the color feature vector of pixel n, and $f_n$ is the feature vector of pixel n, which comprises the position feature vector and the color feature vector of pixel n;

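the fully connected conditional random field satisfies the Gibbs distribution, with the probability function:

$$P(X \mid I) = \frac{1}{Z(I)} \exp\bigl(-E(X \mid I)\bigr)$$

wherein $Z(I)$ is a normalization factor and $E(X \mid I)$ is an energy function;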

defining an energy function on a full graph basis as:

$$E(X \mid I) = \sum_i \psi_u(x_i) + \sum_{i,j} \psi_p(x_i, x_j)$$

$$\psi_u(x_i) = -\log P(x_i)$$

wherein $\psi_u(x_i)$ is the unary potential function, $\psi_p(x_i, x_j)$ is the binary (pairwise) potential function, and $\mu(x_i, x_j)$ is the label comparison function, with $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $\mu(x_i, x_j) = 0$ if $x_i = x_j$; $w^{m}$ is the weight corresponding to the Gaussian kernel function of the m-th category in the label set, $m \in [1, k]$; $k^{m}(f_i, f_j)$ is the Gaussian kernel function of the m-th category in the label set; $f_i$ and $f_j$ are the feature vectors of pixels i and j; $p_i$ and $p_j$ are the position feature vectors of pixels i and j; $I_i$ and $I_j$ are the color feature vectors of pixels i and j; $\theta_\alpha$ is the scale controlling the position information, $\theta_\beta$ is the scale controlling the color information, and $\theta_\gamma$ is the scale controlling the position information;

a result output module for iteratively updating $Q(x)$ using a mean field algorithm so that the K-L divergence between the probability function $P(x)$ and $Q(x)$ reaches a predetermined value, and then using $Q_i(x_i)$ in place of $P(x_i)$ to obtain the SIC prediction result of the next moment output by the prediction model, wherein the calculation steps are as follows:

$$Q(x) = \prod_i Q_i(x_i)$$

In a third aspect of the invention, there is provided a SIC prediction electronic device based on ConvLSTM and conditional random fields, comprising at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the ConvLSTM and conditional random field based SIC prediction method described above.

In a fourth aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the above-described ConvLSTM and conditional random field-based SIC prediction method.

It should be noted that the advantages of the second to fourth aspects of the present invention over the prior art are the same as those of the above-described SIC prediction method based on ConvLSTM and conditional random fields, and are not described in detail here.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:

FIG. 1 is a flow chart of a method for SIC prediction based on ConvLSTM and conditional random fields in accordance with an embodiment of the present invention;

FIG. 2 is a schematic diagram of a set of sequence data samples constructed in accordance with one embodiment of the present invention;

FIG. 3 is a diagram of a ConvLSTM network architecture provided by one embodiment of the present invention;

FIG. 4 is a comparison of ConvLSTM and ConvLSTM-CRF iterative predictions of MAE for the next 12 months provided by one embodiment of the present invention;

FIG. 5 is a comparison graph of ConvLSTM and ConvLSTM-CRF iterative predictions for a future 12 month RMSE, provided by an embodiment of the invention;

FIG. 6 is a flow chart of a ConvLSTM and conditional random field based SIC prediction system in accordance with an embodiment of the present invention.

Detailed Description

Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.

In the description of the present invention, the description of first, second, etc. is for the purpose of distinguishing between technical features only and should not be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated or implicitly indicating the precedence of the technical features indicated.

In the description of the present invention, it should be understood that the direction or positional relationship indicated with respect to the description of the orientation, such as up, down, etc., is based on the direction or positional relationship shown in the drawings, is merely for convenience of describing the present invention and simplifying the description, and does not indicate or imply that the apparatus or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.

In the description of the present invention, unless explicitly defined otherwise, terms such as arrangement, installation, connection, etc. should be construed broadly and the specific meaning of the terms in the present invention can be determined reasonably by a person skilled in the art in combination with the specific content of the technical solution.

Machine learning methods have begun to be applied to sea ice prediction. For example, two deep learning methods, the multilayer perceptron (MLP) and the long short-term memory network (LSTM), have been used for medium-term prediction of Arctic SIC, with better results than the traditional autoregressive model. However, LSTM considers only time-series information and cannot process spatial information. Because LSTM must flatten two-dimensional spatial data into one-dimensional data for computation, the original spatial structure is difficult to preserve, and prediction is poor when the sea ice coverage changes sharply in the melting and freezing seasons. Convolution, by contrast, captures spatial correlation well, and a convolutional neural network (CNN) has been used to predict the next month's SIC from the current month's SIC. However, CNN has an obvious defect: the SIC at time t depends entirely on the SIC at time t-1, so information from earlier moments cannot be extracted.

To solve the above technical drawbacks, referring to fig. 1 and 2, an embodiment of the present invention provides a SIC prediction method based on ConvLSTM and conditional random fields, including:

S101, acquiring historical SIC data at a plurality of consecutive moments, normalizing the historical SIC data, and splicing the normalized historical SIC data into time steps through a sliding time window to generate a sequence data sample set.

S102, inputting the sequence data sample set into a preset prediction model to obtain SIC prediction results of the next moment output by the prediction model, wherein the prediction model is a ConvLSTM network model combined with a fully connected conditional random field.

Referring to FIG. 3, in some embodiments, the internal structural formula of the ConvLSTM network model includes:

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$$

$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o)$$

$$H_t = o_t \odot \tanh(C_t)$$

In some embodiments, inputting the sequence data sample set into a preset prediction model to obtain a SIC prediction result at the next moment output by the prediction model, including:

and inputting the sequence data sample set into a ConvLSTM network model in a preset prediction model to obtain a first SIC prediction result, wherein the first SIC prediction result is the SIC prediction result of the next moment output by the ConvLSTM network model in the prediction model.

Obtaining, according to the first SIC prediction result, a pixel set $X = \{x_1, x_2, \dots, x_n\}$, a label set $L = \{L_1, L_2, \dots, L_k\}$ of label categories, feature vectors $F = \{f_1, f_2, \dots, f_n\}$, color feature vectors $I = \{I_1, I_2, \dots, I_n\}$, position feature vectors $P = \{p_1, p_2, \dots, p_n\}$, and the probability $P(x_i)$ that pixel i is classified as label $x_i$, wherein $x_n$ is the classification label of pixel n, $L_k$ is the classification label of the k-th category in the label set, $p_n$ is the position feature vector of pixel n, $I_n$ is the color feature vector of pixel n, and $f_n$ is the feature vector of pixel n, which comprises the position feature vector and the color feature vector of pixel n.

The fully connected conditional random field satisfies the Gibbs distribution, with the probability function:

$$P(X \mid I) = \frac{1}{Z(I)} \exp\bigl(-E(X \mid I)\bigr)$$

wherein $Z(I)$ is a normalization factor and $E(X \mid I)$ is an energy function.

Defining an energy function on a full graph basis as:

$$E(X \mid I) = \sum_i \psi_u(x_i) + \sum_{i,j} \psi_p(x_i, x_j)$$

$$\psi_u(x_i) = -\log P(x_i)$$

wherein $\psi_u(x_i)$ is the unary potential function, $\psi_p(x_i, x_j)$ is the binary (pairwise) potential function, and $\mu(x_i, x_j)$ is the label comparison function, with $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $\mu(x_i, x_j) = 0$ if $x_i = x_j$; $w^{m}$ is the weight corresponding to the Gaussian kernel function of the m-th category in the label set, $m \in [1, k]$; $k^{m}(f_i, f_j)$ is the Gaussian kernel function of the m-th category in the label set; $f_i$ and $f_j$ are the feature vectors of pixels i and j; $p_i$ and $p_j$ are the position feature vectors of pixels i and j; $I_i$ and $I_j$ are the color feature vectors of pixels i and j; $\theta_\alpha$ is the scale controlling the position information, $\theta_\beta$ is the scale controlling the color information, and $\theta_\gamma$ is the scale controlling the position information.
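The explicit expressions for the binary potential function and the Gaussian kernel are not reproduced in this text. The symbols defined above (label comparison function, kernel weights, position and color feature vectors, and the three scales $\theta_\alpha$, $\theta_\beta$, $\theta_\gamma$) match the form commonly used for fully connected conditional random fields. Assuming that form, and with the kernel weights written here as $w^{1}$ and $w^{2}$ for illustration, the expressions would read:

```latex
\psi_p(x_i, x_j) = \mu(x_i, x_j) \sum_{m} w^{m} \, k^{m}(f_i, f_j)

k(f_i, f_j) =
  w^{1} \exp\!\left( -\frac{\lVert p_i - p_j \rVert^{2}}{2\theta_\alpha^{2}}
                     -\frac{\lVert I_i - I_j \rVert^{2}}{2\theta_\beta^{2}} \right)
  + w^{2} \exp\!\left( -\frac{\lVert p_i - p_j \rVert^{2}}{2\theta_\gamma^{2}} \right)
```

Under this reading, the first kernel term penalizes different labels for pixels that are both close together and similar in color, and the second term smooths isolated labels using position alone, which is consistent with $\theta_\alpha$ and $\theta_\gamma$ both being described as position scales.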

Iteratively updating $Q(x)$ using a mean field algorithm so that the K-L divergence between the probability functions $P(x)$ and $Q(x)$ reaches a predetermined value, and then using $Q_i(x_i)$ in place of $P(x_i)$ to obtain the SIC prediction result of the next moment output by the prediction model, wherein the calculation steps are as follows:

$$Q(x) = \prod_i Q_i(x_i)$$

In some embodiments, the ConvLSTM network model comprises 4 ConvLSTM cells, and the parameter return_sequences of the 4th ConvLSTM cell is set to False.

For ease of understanding by those skilled in the art, a set of preferred embodiments are provided below:

1. Generating a sequence data sample set:

acquiring historical SIC data at a plurality of continuous moments, carrying out normalization processing on the historical SIC data, and carrying out time step splicing on the processed historical SIC data through a time sliding window to generate a sequence data sample set.

The historical SIC data are the monthly SIC data of NSIDC Version 4. The study area is the Arctic, the horizontal resolution is 25 km × 25 km, the SIC coverage is 31.1°N-89.8°N, 180°E-180°W, and the time span is November 1978 to December 2020, 506 months in total. All missing values in the SIC data, such as land, are assigned 0, indicating no sea ice coverage.

Referring to FIG. 2, a sequence data sample set is constructed from the 506 months of data from November 1978 to December 2020. Starting from the first month, a window of 13 consecutive months is slid forward one month at a time, and each window forms one group. In each group, the first 12 months of data are used as the model input and the 13th month of data is used as the label value. The 506 months of data are thus divided into 506 - 13 + 1 = 494 groups: the first 422 groups are used as the training set, groups 423 to 446 (24 groups) as the validation set, and the remaining 48 groups as the test set. Model training is performed using the training set. In FIG. 2, Train Set denotes the training set, Validation Set the validation set, and Test Set the test set.
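The sample construction described above can be sketched as follows. This is a minimal illustration on synthetic data: the min-max normalization, the 8 × 8 grid and the function name `make_samples` are assumptions, since the description does not specify the normalization formula or a loader for the NSIDC files.

```python
import numpy as np

def make_samples(sic, window=12):
    """Normalize a (T, H, W) SIC series and cut it into sliding-window samples.

    Each sample uses `window` consecutive months as input and the month
    that follows as the label, so T months give T - window samples."""
    sic = np.asarray(sic, dtype=np.float32)
    lo, hi = sic.min(), sic.max()
    # Min-max normalization to [0, 1] (an assumed choice of normalization).
    norm = (sic - lo) / (hi - lo) if hi > lo else np.zeros_like(sic)
    n = sic.shape[0] - window
    X = np.stack([norm[i:i + window] for i in range(n)])   # (n, window, H, W)
    y = np.stack([norm[i + window] for i in range(n)])     # (n, H, W)
    return X, y

# Synthetic stand-in for 506 monthly SIC grids (values in percent).
months = 506
sic = np.random.default_rng(0).uniform(0.0, 100.0, size=(months, 8, 8))
X, y = make_samples(sic)

# Chronological split used in the embodiment: 422 / 24 / 48 groups.
X_train, y_train = X[:422], y[:422]
X_val, y_val = X[422:446], y[422:446]
X_test, y_test = X[446:], y[446:]
```

Keeping the split chronological, rather than shuffling, means the validation and test groups always lie after the training period, which matches how the model is used for forecasting.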

2. Obtaining a prediction result:

Obtaining, according to the first SIC prediction result, a pixel set $X = \{x_1, x_2, \dots, x_n\}$, a label set $L = \{L_1, L_2, \dots, L_k\}$ of label categories, feature vectors $F = \{f_1, f_2, \dots, f_n\}$, color feature vectors $I = \{I_1, I_2, \dots, I_n\}$, position feature vectors $P = \{p_1, p_2, \dots, p_n\}$, and the probability $P(x_i)$ that pixel i is classified as label $x_i$, wherein $x_n$ is the classification label of pixel n, $L_k$ is the classification label of the k-th category in the label set, $p_n$ is the position feature vector of pixel n, $I_n$ is the color feature vector of pixel n, and $f_n$ is the feature vector of pixel n, which comprises the position feature vector and the color feature vector of pixel n.

The fully connected conditional random field satisfies the Gibbs distribution, with the probability function:

$$P(X \mid I) = \frac{1}{Z(I)} \exp\bigl(-E(X \mid I)\bigr)$$

wherein $Z(I)$ is a normalization factor and $E(X \mid I)$ is an energy function.

Defining an energy function on a full graph basis as:

$$E(X \mid I) = \sum_i \psi_u(x_i) + \sum_{i,j} \psi_p(x_i, x_j)$$

$$\psi_u(x_i) = -\log P(x_i)$$

wherein $\psi_u(x_i)$ is the unary potential function, $\psi_p(x_i, x_j)$ is the binary (pairwise) potential function, and $\mu(x_i, x_j)$ is the label comparison function, with $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $\mu(x_i, x_j) = 0$ if $x_i = x_j$; $w^{m}$ is the weight corresponding to the Gaussian kernel function of the m-th category in the label set, $m \in [1, k]$; $k^{m}(f_i, f_j)$ is the Gaussian kernel function of the m-th category in the label set; $f_i$ and $f_j$ are the feature vectors of pixels i and j; $p_i$ and $p_j$ are the position feature vectors of pixels i and j; $I_i$ and $I_j$ are the color feature vectors of pixels i and j; $\theta_\alpha$ is the scale controlling the position information, $\theta_\beta$ is the scale controlling the color information, and $\theta_\gamma$ is the scale controlling the position information.

$$Q(x) = \prod_i Q_i(x_i)$$

The internal structural formula of the ConvLSTM network model comprises:

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$$

$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o)$$

$$H_t = o_t \odot \tanh(C_t)$$

The ConvLSTM network model comprises 4 ConvLSTM cells, and the parameter return_sequences of the 4th ConvLSTM cell is set to False.

3. Experiment and results:

ConvLSTM and ConvLSTM-CRF performance differences were compared experimentally using a sample set of sequence data.

Both the single-month prediction ability and the iterative prediction ability of the model are tested. Single-month prediction means predicting the next month T13 from the 12 months of data T1-T12. Iterative prediction means using a predicted value as input to predict the value at the next moment: for example, the predicted T13 is used as input, and the next month T14 is predicted from the 12 months of data T2-T13; further future months are predicted iteratively in the same way. The 48 groups of data of the test set are used for single-month prediction, and the data from January 2019 to December 2019 are used to iteratively predict the following 12 months.
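The iterative scheme can be sketched as a rolling loop. The sketch assumes a hypothetical `model` callable that maps a (12, H, W) window to the next (H, W) field; a simple persistence model stands in for the trained network here, and the name `iterative_forecast` is illustrative.

```python
import numpy as np

def iterative_forecast(model, history, steps=12):
    """Roll a one-step-ahead model forward `steps` times.

    history: (window, H, W) array holding the most recent observed months.
    After each step the oldest month is dropped and the prediction is
    appended, so later predictions depend on earlier predictions."""
    window = np.array(history, dtype=float)   # copy, the input is not modified
    preds = []
    for _ in range(steps):
        nxt = model(window)                   # predict T13 from T1..T12
        preds.append(nxt)
        window = np.concatenate([window[1:], nxt[None]], axis=0)  # now T2..T13
    return np.stack(preds)                    # (steps, H, W)

# Stand-in model (an assumption): persistence, i.e. repeat the last month.
persistence = lambda w: w[-1]

history = np.random.default_rng(0).random((12, 4, 4))
forecast = iterative_forecast(persistence, history, steps=12)
```

Because each prediction is fed back as input, errors can accumulate with the number of iterations, which is why the experiment reports the error month by month.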

The prediction results of ConvLSTM and ConvLSTM-CRF are compared using the mean absolute error (MAE) and the root mean squared error (RMSE), with ConvLSTM-CRF serving as the prediction model. The two evaluation indexes are calculated as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - p_i \rvert$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - p_i)^2}$$

To eliminate the influence of open water and land, which do not change with time, the 0 values in the prediction result are removed. Here n is the number of non-zero values in the predicted SIC, $y_i$ is the true SIC value, and $p_i$ is the predicted value.
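The two indexes with the zero-value removal can be sketched as follows, assuming the standard definitions of MAE and RMSE and a mask taken over the non-zero entries of the prediction; the function name `masked_mae_rmse` is illustrative.

```python
import numpy as np

def masked_mae_rmse(y_true, y_pred):
    """MAE and RMSE over the grid cells where the predicted SIC is non-zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = y_pred != 0                 # drop open water and land (predicted 0)
    n = int(mask.sum())
    if n == 0:
        raise ValueError("prediction contains no non-zero values")
    err = y_true[mask] - y_pred[mask]
    mae = float(np.abs(err).mean())
    rmse = float(np.sqrt((err ** 2).mean()))
    return mae, rmse

# Small worked example: one of the four cells is predicted as 0 and is ignored.
y_true = np.array([[0.0, 0.5], [0.8, 1.0]])
y_pred = np.array([[0.0, 0.4], [0.6, 1.0]])
mae, rmse = masked_mae_rmse(y_true, y_pred)
```

For this example the three remaining errors are 0.1, 0.2 and 0, giving an MAE of 0.1; multiplying by 100 yields the percentage values used in the tables.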

In the single month prediction experiment, convLSTM and ConvLSTM-CRF are used for respectively predicting 48 groups of data of the test set, and the season and annual average MAE and RMSE percentage values of the prediction results are shown in tables 1 and 2:

Table 1 shows a comparison of seasonal average MAE and annual average MAE for the ConvLSTM model and the ConvLSTM-CRF model;

TABLE 1

Table 2 shows a comparison of seasonal average RMSE and annual average RMSE for the ConvLSTM model and the ConvLSTM-CRF model;

TABLE 2

As can be seen from Tables 1 and 2, the prediction of ConvLSTM-CRF is superior to that of ConvLSTM. Because melting sea ice produces melt ponds in the melting season, both MAE and RMSE in the melting season are higher than the averages in the freezing season; nevertheless, both the MAE and the RMSE of ConvLSTM-CRF are lower than those of ConvLSTM.

Referring to FIG. 4 and FIG. 5, in the iterative prediction experiment, the data from January 2019 to December 2019 are used to iteratively predict the following 12 months. First, the data from January 2019 to December 2019 are input into the two models to obtain the prediction for January 2020, which is then used as input for the next prediction. Finally, predictions for January 2020 to December 2020 are obtained. The monthly MAE and RMSE percentage values of the prediction results are shown in FIG. 4 and FIG. 5. As can be seen from the figures, the MAE and RMSE of both models grow as the number of iterations increases. However, those of ConvLSTM-CRF grow more slowly and remain smaller than those of ConvLSTM.

The invention combines ConvLSTM with a fully connected conditional random field and proposes a ConvLSTM-CRF model to predict future monthly SIC in the Arctic. Experiments show that the prediction accuracy of the ConvLSTM-CRF model is superior to that of the ConvLSTM method and that the SIC boundary range can be predicted more accurately. In particular, when SIC changes drastically in the thawing and freezing seasons, the prediction result of the ConvLSTM-CRF model is closer to the real situation. The ConvLSTM-CRF model also shows a better prediction effect in iterative sea ice prediction. In addition, the ConvLSTM-CRF model can be generalized to other application scenarios: SIC-like time series prediction problems, such as precipitation prediction and snowfall prediction, can adopt the model.

In addition, referring to fig. 6, the present invention further provides a SIC prediction system based on ConvLSTM and conditional random fields, including a data acquisition module 1100 and a data output module 1200, where:

the data acquisition module 1100 is configured to acquire historical SIC data at a plurality of consecutive moments, normalize the historical SIC data, and perform time-step splicing on the processed historical SIC data through a time sliding window to generate a sequence data sample set;

the data output module 1200 is configured to input the sequence data sample set into a preset prediction model, and obtain a SIC prediction result at a next moment output by the prediction model, where the prediction model is a ConvLSTM network model combined with a fully connected conditional random field.
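The work of the data acquisition module (normalization followed by time-sliding-window splicing) can be sketched as below. Min-max scaling is an assumption, since this passage does not state which normalization is used, and `make_samples` and the toy series are hypothetical.

```python
import numpy as np

def make_samples(sic, window):
    """Normalize a (T, H, W) SIC series and cut it into (input, target) pairs."""
    sic = np.asarray(sic, dtype=float)
    lo, hi = sic.min(), sic.max()
    # Assumed min-max normalization to [0, 1]; a constant series maps to zeros.
    scaled = (sic - lo) / (hi - lo) if hi > lo else np.zeros_like(sic)
    xs, ys = [], []
    for start in range(len(scaled) - window):
        xs.append(scaled[start:start + window])   # `window` consecutive fields
        ys.append(scaled[start + window])         # the field at the next moment
    return np.stack(xs), np.stack(ys)

# Toy series: 5 time steps on a 2x2 grid, window of 3 steps.
series = np.arange(5 * 2 * 2, dtype=float).reshape(5, 2, 2)
X, y = make_samples(series, window=3)
```

Each row of `X` is one sequence sample and the matching row of `y` is the field the prediction model is asked to produce for the next moment.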

In some embodiments, the internal structural formula of the ConvLSTM network model includes:

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$$

$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o)$$

$$H_t = o_t \odot \tanh(C_t)$$
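For illustration, one time step of the five equations above can be written in plain NumPy for a single-channel field. This is a simplified sketch: it assumes that σ is the logistic sigmoid, that the convolution is the zero-padded cross-correlation used by deep learning frameworks, and that kernels have odd size. Practical ConvLSTM layers use multi-channel kernels, and all names here (`conv2d_same`, `convlstm_step`, the weight dictionary keys) are hypothetical.

```python
import numpy as np

def conv2d_same(x, k):
    """Zero-padded 2-D cross-correlation with an odd-sized kernel."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(X, H_prev, C_prev, W, b):
    """One single-channel ConvLSTM step; returns (H_t, C_t)."""
    conv = conv2d_same
    i = sigmoid(conv(X, W["xi"]) + conv(H_prev, W["hi"]) + W["ci"] * C_prev + b["i"])
    f = sigmoid(conv(X, W["xf"]) + conv(H_prev, W["hf"]) + W["cf"] * C_prev + b["f"])
    C = f * C_prev + i * np.tanh(conv(X, W["xc"]) + conv(H_prev, W["hc"]) + b["c"])
    o = sigmoid(conv(X, W["xo"]) + conv(H_prev, W["ho"]) + W["co"] * C + b["o"])
    H = o * np.tanh(C)
    return H, C

# Toy setup: 4x4 field, 3x3 kernels, element-wise weights for the cell terms.
rng = np.random.default_rng(0)
shape = (4, 4)
W = {n: rng.normal(scale=0.1, size=(3, 3))
     for n in ("xi", "hi", "xf", "hf", "xc", "hc", "xo", "ho")}
W.update({n: rng.normal(scale=0.1, size=shape) for n in ("ci", "cf", "co")})
b = {n: 0.0 for n in "ifco"}
X = rng.random(shape)
H, C = convlstm_step(X, np.zeros(shape), np.zeros(shape), W, b)
```

The terms `W["ci"] * C_prev`, `W["cf"] * C_prev` and `W["co"] * C` are the element-wise cell connections, which is why those weights have the same shape as the state rather than the shape of a kernel.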

In some embodiments, the data output module comprises:

the first prediction module is used for inputting the sequence data sample set into a ConvLSTM network model in a preset prediction model to obtain a first SIC prediction result, wherein the first SIC prediction result is a SIC prediction result of the next moment output by the ConvLSTM network model in the prediction model;

the data generation module is configured to generate, according to the first SIC prediction result, a pixel set $X = \{x_1, x_2, \ldots, x_n\}$, a label set $L = \{L_1, L_2, \ldots, L_k\}$ of label categories, feature vectors $f = \{f_1, f_2, \ldots, f_n\}$, color feature vectors $I = \{I_1, I_2, \ldots, I_n\}$, position feature vectors $p = \{p_1, p_2, \ldots, p_n\}$, and the probability $P(x_i)$ that pixel $i$ is classified as label $x_i$, wherein $x_n$ is the classification label of pixel $n$, $L_k$ is the classification label of the $k$-th category in the label set, $p_n$ is the position feature vector of pixel $n$, $I_n$ is the color feature vector of pixel $n$, $f_n$ is the feature vector of pixel $n$, and $f_n$ comprises the position feature vector of pixel $n$ and the color feature vector of pixel $n$;

wherein $Z(I)$ is a normalization factor and $E(X \mid I)$ is an energy function;

The energy function is defined over the fully connected graph as:

$$E(X \mid I) = \sum_i \psi_u(x_i) + \sum_{i,j} \psi_p(x_i, x_j)$$

$$\psi_u(x_i) = -\log P(x_i)$$

wherein $\psi_u(x_i)$ is the unary potential function, $\psi_p(x_i, x_j)$ is the binary potential function, and $\mu(x_i, x_j)$ is the label comparison function, with $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $\mu(x_i, x_j) = 0$ if $x_i = x_j$; $w^{(m)}$ is the weight corresponding to the Gaussian kernel function of the $m$-th category in the label set, $m \in [1, k]$, which is evaluated on the feature vectors $f_i$ and $f_j$; $f_i$ is the feature vector of pixel $i$, $f_j$ is the feature vector of pixel $j$, $p_i$ is the position feature vector of pixel $i$, $p_j$ is the position feature vector of pixel $j$, $I_i$ is the color feature vector of pixel $i$, $I_j$ is the color feature vector of pixel $j$, $\theta_\alpha$ controls the scale of the position information, $\theta_\beta$ controls the scale of the color information, and $\theta_\gamma$ controls the scale of the position information;

a result output module, configured to iteratively update $Q(x)$ using the mean field algorithm until the K-L divergence between the probability functions $P(x)$ and $Q(x)$ reaches a predetermined value, and then to use $Q_i(x_i)$ in place of $P(x_i)$ to obtain the SIC prediction result at the next moment output by the prediction model, calculated as follows:

$$Q(x) = \prod_i Q_i(x_i)$$
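The refinement performed by the data generation and result output modules can be illustrated with a naive mean-field loop. The sketch rests on several assumptions that this passage does not spell out. It takes the posterior to be the Gibbs distribution exp(-E(X|I)) / Z(I). It takes the binary potential to be the label comparison function times a weighted sum of two Gaussian kernels: an appearance kernel scaled by θ_α and θ_β, and a smoothness kernel scaled by θ_γ. It also runs a fixed number of iterations instead of testing the K-L divergence. The function name, the default parameter values and the toy prior are hypothetical.

```python
import numpy as np

def dense_crf_mean_field(prob, intensity, w=(1.0, 1.0),
                         theta_alpha=2.0, theta_beta=0.2, theta_gamma=1.0,
                         n_iter=5):
    """Naive O(n^2) mean-field inference for a fully connected CRF.

    prob      : (K, H, W) prior label probabilities P(x_i), e.g. from ConvLSTM
    intensity : (H, W) scalar "color" feature I_i per pixel
    Returns Q with the same shape as prob.
    """
    n_labels, height, width = prob.shape
    unary = -np.log(np.clip(prob.reshape(n_labels, -1), 1e-8, 1.0))  # psi_u = -log P
    rows, cols = np.mgrid[0:height, 0:width]
    pos = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    col = np.asarray(intensity, dtype=float).reshape(-1, 1)
    d_pos = ((pos[:, None] - pos[None]) ** 2).sum(-1)    # |p_i - p_j|^2
    d_col = ((col[:, None] - col[None]) ** 2).sum(-1)    # |I_i - I_j|^2
    kernel = (w[0] * np.exp(-d_pos / (2 * theta_alpha ** 2)
                            - d_col / (2 * theta_beta ** 2))
              + w[1] * np.exp(-d_pos / (2 * theta_gamma ** 2)))
    np.fill_diagonal(kernel, 0.0)                        # exclude the term j == i
    mu = 1.0 - np.eye(n_labels)                          # 1 if labels differ, else 0
    Q = prob.reshape(n_labels, -1).astype(float)         # initialise Q with the prior
    for _ in range(n_iter):
        message = Q @ kernel                             # sum_j k(f_i, f_j) Q_j(l')
        pairwise = mu @ message                          # sum_l' mu(l, l') * message
        logits = -unary - pairwise
        logits -= logits.max(axis=0, keepdims=True)      # numerical stability
        Q = np.exp(logits)
        Q /= Q.sum(axis=0, keepdims=True)                # per-pixel normalization Z_i
    return Q.reshape(n_labels, height, width)

# Toy prior: every pixel strongly favours label 0, except the centre pixel,
# which weakly favours label 1.
prior = np.full((2, 3, 3), 0.1)
prior[0] = 0.9
prior[:, 1, 1] = (0.45, 0.55)
Q = dense_crf_mean_field(prior, intensity=np.zeros((3, 3)))
```

In this toy case the centre pixel is pulled towards the label of its neighbours, which mirrors how the conditional random field is intended to sharpen boundary regions. The dense kernel matrix grows quadratically with the number of pixels, so a full-resolution SIC grid would need an efficient approximation in place of this explicit form.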

It should be noted that the system embodiment and the method embodiment described above are based on the same inventive concept, so the relevant content of the method embodiment also applies to the system embodiment and is not repeated here.

The processor and the memory may be connected by a bus or other means.

The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The non-transitory software programs and instructions required to implement the ConvLSTM and conditional random field based SIC prediction method of the above embodiments are stored in the memory and, when executed by the processor, perform the ConvLSTM and conditional random field based SIC prediction method of the above embodiments, for example, method steps S101 to S102 in fig. 1 described above.

The present application also provides a computer-readable storage medium storing computer-executable instructions for performing the ConvLSTM and conditional random field based SIC prediction method described above.

The computer-readable storage medium stores computer-executable instructions that, when executed by a processor or controller, for example by a processor in the above-described electronic device embodiment, cause the processor to perform the ConvLSTM and conditional random field based SIC prediction method in the above-described embodiment, for example, method steps S101 to S102 in fig. 1 described above.

Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program elements or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program elements or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.

The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present invention.

Claims

1. A method for SIC prediction based on ConvLSTM and conditional random fields, the method comprising:

inputting the sequence data sample set into a preset prediction model to obtain a SIC prediction result of the next moment output by the prediction model, wherein the prediction model refers to a ConvLSTM network model combined with a full-connection conditional random field, and specifically comprises the following steps:

obtaining, according to the first SIC prediction result, a pixel set $X = \{x_1, x_2, \ldots, x_n\}$, a label set $L = \{L_1, L_2, \ldots, L_k\}$ of label categories, feature vectors $f = \{f_1, f_2, \ldots, f_n\}$, color feature vectors $I = \{I_1, I_2, \ldots, I_n\}$, position feature vectors $p = \{p_1, p_2, \ldots, p_n\}$, and the probability $P(x_i)$ that pixel $i$ is classified as label $x_i$, wherein $x_n$ is the classification label of pixel $n$, $L_k$ is the classification label of the $k$-th category in the label set, $p_n$ is the position feature vector of the pixel $n$, $I_n$ is the color feature vector of the pixel $n$, $f_n$ is the feature vector of the pixel $n$, and the $f_n$ comprises the position feature vector of the pixel $n$ and the color feature vector of the pixel $n$;

wherein $Z(I)$ is a normalization factor and $E(X \mid I)$ is an energy function;

the energy function is defined over the fully connected graph as:

$$E(X \mid I) = \sum_i \psi_u(x_i) + \sum_{i,j} \psi_p(x_i, x_j)$$

$$\psi_u(x_i) = -\log P(x_i)$$

wherein $\psi_u(x_i)$ is the unary potential function, $\psi_p(x_i, x_j)$ is the binary potential function, and $\mu(x_i, x_j)$ is the label comparison function, with $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $\mu(x_i, x_j) = 0$ if $x_i = x_j$; $w^{(m)}$ is the weight corresponding to the Gaussian kernel function of the $m$-th category in the label set, $m \in [1, k]$, which is evaluated on the feature vectors $f_i$ and $f_j$; $f_i$ is the feature vector of pixel $i$, $f_j$ is the feature vector of pixel $j$, $p_i$ is the position feature vector of pixel $i$, $p_j$ is the position feature vector of pixel $j$, $I_i$ is the color feature vector of pixel $i$, $I_j$ is the color feature vector of pixel $j$, $\theta_\alpha$ controls the scale of the position information, $\theta_\beta$ controls the scale of the color information, $\theta_\gamma$ controls the scale of the position information, $w^{(1)}$ is the weight corresponding to the Gaussian kernel function of the first category in the label set, and $w^{(2)}$ is the weight corresponding to the Gaussian kernel function of the second category in the label set;

$$Q(x) = \prod_i Q_i(x_i)$$

wherein $Q_i(x_i)$ is the probability that pixel $i$ takes the classification label $x_i$, $l$ is the classification label of pixel $i$, $l'$ is the predicted classification label of pixel $i$, $Z_i$ is a normalization factor, $\mu(l, l')$ is the label comparison function, $\psi_u(x_i)$ is the unary potential function, and $w^{(m)}$ is the weight corresponding to the Gaussian kernel function of the $m$-th category in the label set.

2. The method for SIC prediction based on ConvLSTM and conditional random fields according to claim 1, wherein the internal structural formula of the ConvLSTM network model comprises:

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$$

$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o)$$

$$H_t = o_t \odot \tanh(C_t)$$

wherein $\sigma$ is the gate function, $*$ is the convolution operator, $\odot$ is the element-wise (Hadamard) product, $\tanh$ is the hyperbolic tangent function, $i_t$ is the input gate at time $t$, $f_t$ is the forget gate at time $t$, $C_t$ is the memory cell at time $t$, $o_t$ is the output gate at time $t$, $X_t$ is the two-dimensional image at time $t$, $H_t$ is the network predicted value at time $t$, $W_{xi}$ is the weight matrix of the two-dimensional image $X_t$ associated with the input gate, $W_{hi}$ is the weight matrix of the hidden state $H_{t-1}$ associated with the input gate, $W_{ci}$ is the weight matrix of the memory cell $C_{t-1}$ associated with the input gate, $b_i$ is the bias value of the input gate, $W_{xf}$ is the weight matrix of the two-dimensional image $X_t$ associated with the forget gate, $W_{hf}$ is the weight matrix of the hidden state $H_{t-1}$ associated with the forget gate, $W_{cf}$ is the weight matrix of the memory cell $C_{t-1}$ associated with the forget gate, $b_f$ is the bias value of the forget gate, $W_{xc}$ is the weight matrix of the two-dimensional image $X_t$ associated with the memory cell, $W_{hc}$ is the weight matrix of the hidden state $H_{t-1}$ associated with the memory cell, $b_c$ is the bias value of the memory cell, $W_{xo}$ is the weight matrix of the two-dimensional image $X_t$ associated with the output gate, $W_{ho}$ is the weight matrix of the hidden state $H_{t-1}$ associated with the output gate, $W_{co}$ is the weight matrix of the memory cell $C_t$ associated with the output gate, and $b_o$ is the bias value of the output gate.

3. The method for SIC prediction based on ConvLSTM and conditional random fields according to claim 2, wherein the ConvLSTM network model is provided with 4 ConvLSTM cells, and the parameter return_sequences of the 4th ConvLSTM cell is set to false.

4. A ConvLSTM and conditional random field based SIC prediction system, the system comprising:

the data output module is used for inputting the sequence data sample set into a preset prediction model to obtain a SIC prediction result of the next moment output by the prediction model, wherein the prediction model refers to a ConvLSTM network model combined with a full-connection conditional random field, and specifically comprises the following steps:

the data generation module is configured to generate, according to the first SIC prediction result, a pixel set $X = \{x_1, x_2, \ldots, x_n\}$, a label set $L = \{L_1, L_2, \ldots, L_k\}$ of label categories, feature vectors $f = \{f_1, f_2, \ldots, f_n\}$, color feature vectors $I = \{I_1, I_2, \ldots, I_n\}$, position feature vectors $p = \{p_1, p_2, \ldots, p_n\}$, and the probability $P(x_i)$ that pixel $i$ is classified as label $x_i$, wherein $x_n$ is the classification label of pixel $n$, $L_k$ is the classification label of the $k$-th category in the label set, $p_n$ is the position feature vector of the pixel $n$, $I_n$ is the color feature vector of the pixel $n$, $f_n$ is the feature vector of the pixel $n$, and the $f_n$ comprises the position feature vector of the pixel $n$ and the color feature vector of the pixel $n$;

wherein $Z(I)$ is a normalization factor and $E(X \mid I)$ is an energy function;

the energy function is defined over the fully connected graph as:

$$E(X \mid I) = \sum_i \psi_u(x_i) + \sum_{i,j} \psi_p(x_i, x_j)$$

$$\psi_u(x_i) = -\log P(x_i)$$

a result output module, configured to iteratively update $Q(x)$ using the mean field algorithm until the K-L divergence between the probability function $P(x)$ and the $Q(x)$ reaches a predetermined value, and then to use $Q_i(x_i)$ in place of $P(x_i)$ to obtain the SIC prediction result at the next moment output by the prediction model, calculated as follows:

$$Q(x) = \prod_i Q_i(x_i)$$

5. The SIC prediction system of claim 4, wherein the internal structural formula of the ConvLSTM network model includes:

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$$

$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o)$$

$$H_t = o_t \odot \tanh(C_t)$$

6. The SIC prediction system of claim 5, wherein the ConvLSTM network model is provided with 4 ConvLSTM cells, and the parameter return_sequences of the 4th ConvLSTM cell is set to false.

7. A ConvLSTM and conditional random field based SIC prediction apparatus comprising at least one control processor and a memory for communication with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform a ConvLSTM and conditional random field based SIC prediction method according to any one of claims 1 to 3.

8. A computer-readable storage medium, characterized by: the computer-readable storage medium stores computer-executable instructions for causing a computer to perform a ConvLSTM and conditional random field based SIC prediction method according to any one of claims 1 to 3.