CN112464567B - Intelligent data assimilation method based on variational assimilation framework - Google Patents

Intelligent data assimilation method based on variational assimilation framework

Info

Publication number
CN112464567B
CN112464567B (application number CN202011420985.5A)
Authority
CN
China
Prior art keywords
assimilation
time
mlp
data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011420985.5A
Other languages
Chinese (zh)
Other versions
CN112464567A (en)
Inventor
黄丽蓝
宋君强
任开军
李小勇
冷洪泽
王东紫
卢竞择
陈睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202011420985.5A priority Critical patent/CN112464567B/en
Publication of CN112464567A publication Critical patent/CN112464567A/en
Application granted granted Critical
Publication of CN112464567B publication Critical patent/CN112464567B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 - Computer-aided design [CAD]
    • G06F30/20 - Design optimisation, verification or simulation
    • G06F30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 - Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/12 - Timing analysis or timing optimisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 - Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/14 - Force analysis or force optimisation, e.g. static or dynamic forces
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 - Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an intelligent data assimilation method based on a variational assimilation framework. S1: analyze with the data in the first half of the assimilation time window $[t_1, t_n]$ and correct the analysis field at the middle of the window with the data in the second half, determining the analysis field $x_a^{ct}$ at the current moment. S2: replace the assimilation process with a first multilayer perceptron (MLP) and obtain $x_a^{MLP_1,ct}$ from historical and future data. S3: superpose the previous training data set with $x_a^{MLP_1,ct}$ and train the next multilayer perceptron to obtain $x_b^{MLP_2,ct+1}$, which is equivalent to simulating the prediction model for short-term prediction within a data assimilation period. S4: repeat S2 and S3; in each link before ct, the newly generated $x_b^{MLP}$ replaces the previous $x_b^{3D\text{-}Var}$ to form a new training data set, until the $x_b^{3D\text{-}Var}$ before ct is completely replaced and the background information in the subsequent training data sets is provided only by $x_b^{MLP}$.

Description

Intelligent data assimilation method based on a variational assimilation framework
Technical Field
The invention relates to the technical field of data assimilation, in particular to an intelligent data assimilation method based on a variational assimilation framework.
Background
Numerical Weather Prediction (NWP) is the mainstream weather forecasting approach, defined by Bjerknes as an initial/boundary value problem. The future state of the atmosphere depends sensitively on its detailed Initial Conditions (ICs), certain Boundary Conditions (BCs) and a perfect forecasting model. The prediction model is usually a mathematical-physical model that approximately describes the laws of motion of the Earth system and is typically expressed by differential equations. Generally, given the ICs and BCs of a forecasting model, the numerical solution of the model is the corresponding numerical weather forecast. With the improvement of theory and computing power, the precision of forecasting models and BCs keeps increasing, so the uncertainty of the ICs becomes a major source of NWP prediction error. Data Assimilation (DA) integrates all available data resources, including observations and short-term forecasts (i.e., background fields or initial guesses), to obtain an analysis field, which is expected to be the best estimate of the probability density of the actual atmospheric state. That is, the goal of DA is to produce the most accurate ICs for NWP. Establishing an optimal DA method is therefore of great interest to researchers in the field of meteorology.
Based on Bayes' theorem, the variational assimilation method (Var) has been the predominant operational method for NWP since the 1990s. Basically, to handle uncertainties from different sources of information, the guiding strategy of the variational assimilation approach is to use new observations to correct the initial guess at the specified grid points, i.e., the background field and the observations are combined by minimizing a cost function consisting of two parts: one is a penalty term for the distance to the background mean, and the other is a penalty term for the distance to the observed values. The specific implementation directly assimilates a large amount of observation data by providing a static background error covariance matrix (B). The matrix B is computed once from climate data and cannot represent the real spatio-temporal error statistics well, which means that B may introduce a large background error. Var is divided into 3D-Var, which uses data at a single time, and 4D-Var, which uses data over a time window of given length to estimate the initial state. Although in 4D-Var B implicitly evolves in a flow-dependent manner and a better initial state than in 3D-Var can be obtained, 4D-Var incurs a high computational cost and requires tangent-linear and adjoint versions of the numerical model, which are difficult to build.
As a data-driven approach, Machine Learning (ML) is the study of computer algorithms that improve automatically through experience. In a computer system, experience mainly exists in the form of data, and ML builds a mathematical model from sample data in order to predict and make decisions. ML has enjoyed significant success in computer vision, speech recognition and control systems, as well as in related scientific fields such as physics, chemistry and biology. The development of ML has brought new opportunities for DA, and some researchers have tried to explore suitable ML algorithms to solve local problems of traditional, single DA approaches. As one of the geoscience system problems, DA mainly comprises three stages; errors occur at each stage, and there is no general DA technique that can ideally eliminate them. The first stage of DA is the pre-processing and quality control of the observed data: to eliminate observation errors, researchers usually use an idealized function to invert the indirect observations and use an empirical observation operator (matrix) to interpolate the observations onto the numerical grid, which introduces non-negligible errors. ML can learn the spatio-temporal characteristics and correlations of data, which is of great significance for reducing the errors introduced by the observations. The second stage is objective analysis, integrating the observed data with the background information provided by the numerical model. Generally, the functions of mathematical models that simulate the evolution of the Earth system based on physical theory use empirical parameters and are therefore semi-empirical, so these numerical models can be replaced by ML. ML methods have been shown to be more powerful and flexible than previous mechanistic or semi-empirical modeling methods. Furthermore, compared with the current sequential assimilation of DA, ML is parallelizable, which is very helpful for speeding up the assimilation process. The third stage is initialization: the result of the objective analysis needs to be adjusted appropriately to avoid the subsequent integration failing because of noise produced by an inconsistent prediction model. The output of DA therefore depends on the uncertainty of the numerical model. Correcting model errors with ML methods based on a large amount of Earth-system data, and thereby improving the quality of DA results, has also attracted the attention of some researchers.
The final goal of data assimilation is to obtain the optimal analysis field quickly and accurately and to provide the optimal initial field for the numerical weather prediction model. The existing variational assimilation method is based on Bayesian theory and has the advantages of directly assimilating a large amount of observation data with a simple basic framework. However, it involves the storage and direct computation of the large matrix B, which places high demands on the storage and computing capacity of the computing device. Furthermore, because it rests on a variety of idealized assumptions, the result obtained by the variational assimilation method, i.e., the analysis field, is generally not the best possible analysis field. In other words, existing DA depends directly on the quality of the observation field and the background field: the former is measured by various observation devices and obtained through specific empirical inversion functions, and the latter is provided by the numerical model. Inevitable errors therefore occur and need to be eliminated. Moreover, based on the first-order Markov assumption and Bayes' theorem, the integration and assimilation processes of the numerical model are performed sequentially rather than in parallel, so an assimilation period of the traditional DA method is long and needs to be accelerated.
Disclosure of Invention
To build a general DA model, the historical $y_o$ and $x_b$ and the future $y_o$ are included to optimize the current $x_a$ in the DA system; in addition, different models are used for simulation to accelerate the model prediction process, thereby solving the technical problems in the prior art.
The technical scheme of the invention is as follows: an intelligent data assimilation method based on a variational assimilation framework comprises the following steps:
S1, analyze with the data in the first half of the assimilation window and correct the analysis field at the middle of the assimilation time window $[t_1, t_n]$ with the data in the second half, determining the analysis field $x_a^{ct}$ at the current time by the formula
$$x_a^{ct} = \omega \, x_b^{[1,p]} + v \, y_o^{[0,q]} + \mathrm{bias}$$
where n (n = 1, 2, 3, ...) is the length of the assimilation time window; ct is the middle time of the assimilation time window; [1, p] is the interval of the first half of the data, and [0, q] equals the interval of the time window $[t_1, t_n]$; ω and v are the weight matrices of x and y, respectively; bias is the intercept term; $x_b$ is the background field and $y_o$ is the known observed value;
S2, replace the assimilation process with the first multilayer perceptron and obtain $x_a^{MLP_1,ct}$ from the historical and future data, where $x_a^{MLP_1,ct}$ denotes the analysis field at time ct obtained using the first MLP;
S3, superpose the previous training data set with $x_a^{MLP_1,ct}$ and train the next multilayer perceptron to obtain $x_b^{MLP_2,ct+1}$, which is equivalent to simulating the prediction model for short-term prediction during one data assimilation period, where $x_b^{MLP_2,ct+1}$ denotes the background field at time ct+1 obtained using the second MLP;
S4, carry out iterative training, repeating the processes of S2 and S3; in each link before ct, the newly generated $x_b^{MLP}$ replaces the previous $x_b^{3D\text{-}Var}$ to form a new training data set, until the $x_b^{3D\text{-}Var}$ before ct is completely replaced and the background information in the subsequent training data sets is provided only by $x_b^{MLP}$, where $x_b^{3D\text{-}Var}$ denotes the background field obtained before time ct by the conventional data assimilation method 3D-Var; if ct is 6, then when t ∈ [7, 12] the background field is $x_b^{MLP,t}$; $x_b^{MLP}$ denotes the background field obtained by the MLP method, and $x_b^{MLP_2,t}$ denotes the background field at time t obtained using the second MLP.
Preferably, ct > 3.
Preferably, in S2 and S3 the first input node is a vector obtained by superimposing the background information of the moments before ct and the observed values of the moments before n; the second input is another vector, which contains the previous training data set and one analysis result.
Preferably, one output node of S2 and S3 is $x_a^{MLP_1,t}$ at time t and the other output node is $x_b^{MLP_2,t+1}$ at time t+1, where $x_a^{MLP_1,t}$ denotes the analysis field at time t obtained using the first MLP.
Preferably, each multilayer perceptron in S2 and S3 has 3 hidden layers, and the number of neurons in each hidden layer is determined experimentally.
Preferably, in S2 and S3, ReLU is used as the activation function so that the result is nonlinear.
Preferably, a learning rate with appropriate weight decay is defined in each multilayer perceptron in S2 and S3.
Preferably, the loss function in S2 and S3 is the mean square error (MSE).
Compared with the prior art, the invention has the following beneficial effects:
1. Compared with pure 3D-Var (three-dimensional variational data assimilation), the intelligent data assimilation method provided by the invention effectively improves the quality of the analysis field at a low computational cost.
2. During the iterative training of the models, $x_b^{MLP}$ is continuously updated, which implicitly corrects the background error covariance and provides more accurate background information for the next analysis field.
3. For the Lorenz-63 system, the accuracy is improved by 56.44% relative to 3D-Var; for the Lorenz-96 system, by 22.76%. Regarding computational cost, a major advantage of the invention is that better assimilation results are obtained within an acceptable execution time; compared with the empirical data assimilation method, the speed is improved by roughly 3 times and 80 times, respectively.
Drawings
FIG. 1 is a block diagram of the preparation of an optimized partial data set;
FIG. 2 is a schematic diagram of HDA-MLP optimized 3D-Var;
FIG. 3 is a data processing block diagram;
FIG. 4 shows the RMSE of different models of the first part of HDA-MLP in the Lorenz-63 system at different time steps;
FIG. 5 shows the RMSE of different models of the first part of HDA-MLP in the Lorenz-96 system at different time steps;
FIG. 6 is a RMSE plot of 50 experiments in the Lorenz-63 system for various DA methods;
FIG. 7 shows the RMSE of $x_a$ and $x_b$ in the Lorenz-63 system;
FIG. 8 is a graphical depiction of the trajectory of the state variables in Lorenz-63;
FIG. 9 is a trace schematic of "cold start" in an enlarged view;
FIG. 10 is an enlarged view of a portion of the trace;
FIG. 11 is a RMSE line graph of 50 experiments in a Lorenz-96 system for various DA methods;
FIG. 12 is an RMSE line graph of $x_a$ and $x_b$ in the Lorenz-96 system;
FIG. 13 shows the average $R^2$ and standard deviation of the state variables;
FIG. 14 is a schematic diagram of a sensor structure;
FIG. 15 is a schematic diagram of an example of an MLP;
FIG. 16 is a schematic diagram of an example of the Lorenz-96 model (J = 40);
FIG. 17 is a schematic diagram of a data assimilation analysis cycle.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Experiments have shown that historical data can convey temporal features, which are critical for describing the flow dependence of atmospheric variables.
In addition, 4D-Var introduces an assimilation time window; its core idea is to use data over a time interval $[t_0, t_n]$ within a DA period rather than only data at a single time as in 3D-Var. It uses short-term $y_o$ to correct the initial state at the beginning of the assimilation time window, similar to back propagation. Based on the ideas of Cache-MLP and the assimilation time window, an HDA-MLP model is proposed to optimize the 3D-Var system.
The data preparation structure of the training data set for the optimization part is shown in FIG. 1. The assimilation time window is set to $[t_1, t_n]$: the first half of the data is used for analysis, and the second half is used to correct the analysis field at the middle of the window $[t_1, t_n]$. The current analysis field $x_a^{ct}$ is then optimized by the following formula:
$$x_a^{ct} = \omega \, x_b^{[1,p]} + v \, y_o^{[0,q]} + \mathrm{bias}$$
where n (n = 1, 2, 3, ...) is the length of the assimilation time window; ct is the middle time of the assimilation time window; [1, p] is the interval of the first half of the data, and [0, q] equals the interval of the time window $[t_1, t_n]$; ω and v are the weight matrices of x and y, respectively; bias is the intercept term; $x_b$ is the background field and $y_o$ is the known observed value. In the present invention, $x_b$ always refers to the same quantity even though different expressions are used in different contexts: the prior knowledge is equivalent to the background field and also to the initial guess field; the same holds for $y_o$.
As shown in FIG. 2 (assuming ct > 3), a chained MLP model is constructed to optimize the result of 3D-Var. Its working principle is as follows. In link 1, the first step is to replace the assimilation process with the first MLP, which obtains $x_a^{MLP_1,ct}$ from historical and future data. The next step is to superpose the previous training data set with $x_a^{MLP_1,ct}$ and train the next MLP to obtain $x_b^{MLP_2,ct+1}$; this is equivalent to simulating a prediction model for short-term prediction within a DA period. Next, iterative training is performed with the same steps as in link 1. It should be noted that, in the links before ct, each link generates a new $x_b^{MLP}$ that replaces the previous $x_b^{3D\text{-}Var}$ to form a new training data set, until the $x_b^{3D\text{-}Var}$ before ct is completely replaced and the background information in the subsequent training data sets is provided only by $x_b^{MLP}$. In other words, the MLP model becomes an independent, innovative assimilation system that no longer depends on the data of 3D-Var; 3D-Var only provides the first ct background fields $x_b^{3D\text{-}Var}$ to "cold start" the proposed model. During the iterative training of the models, $x_b^{MLP}$ is continuously updated, which implicitly corrects the background error covariance and provides more accurate background information for the next analysis field.
The optimal configuration of HDA-MLP for 3D-Var is determined by empirical testing; the configurations obtained for the two MLPs are similar because their training data sets are nearly identical in size. Here $x_a^{MLP_1,ct}$ denotes the analysis field at time ct obtained using the first MLP; $x_b^{MLP_2,ct+1}$ denotes the background field at time ct+1 obtained using the second MLP; $x_b^{3D\text{-}Var}$ denotes the background field obtained before time ct by the conventional data assimilation method 3D-Var (if ct is 6, then when t ∈ [7, 12] the background field is $x_b^{MLP,t}$); $x_b^{MLP}$ denotes the background field obtained by the MLP method, written in this way to distinguish it from the background field $x_b^{3D\text{-}Var}$ of conventional 3D-Var; $x_b^{MLP_2,t}$ denotes the background field at time t obtained using the second MLP; and $x_a^{MLP_1,t}$ denotes the analysis field at time t obtained using the first MLP.
One link in the model has the following features; a minimal training sketch in code follows the list:
1) the first input node is a vector obtained by superimposing the background information of the moments before ct and the observed values of the moments before n; the second input is another vector, which contains the previous training data set and one analysis result;
2) one output node is $x_a^{MLP_1,t}$ at time t; the other output node is $x_b^{MLP_2,t+1}$ at time t+1;
3) each MLP has 3 hidden layers, and the number of neurons in each hidden layer is determined experimentally;
4) taking ReLU as an activation function to obtain a nonlinear result;
5) defining a learning rate with an appropriate weight decay in each MLP;
6) the loss function is Mean Square Error (MSE);
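As a concrete illustration of one such link, the following minimal sketch (PyTorch) builds the two multilayer perceptrons with three hidden layers, ReLU activations, an MSE loss and a weight-decayed learning rate as listed above; the tensor sizes, hidden width, Lorenz-63 timestep 6+11 layout and optimizer settings are illustrative assumptions, not values fixed by the invention.

```python
import torch
import torch.nn as nn

def make_mlp(n_in, n_out, hidden=64):
    # Three hidden layers with ReLU, as described above; "hidden" is an assumed width.
    return nn.Sequential(
        nn.Linear(n_in, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, n_out),
    )

n_state, n_xb, n_yo = 3, 6, 11                          # assumed Lorenz-63 layout (timestep 6+11)
mlp1 = make_mlp((n_xb + n_yo) * n_state, n_state)       # assimilation surrogate -> x_a at time ct
mlp2 = make_mlp((n_xb + n_yo + 1) * n_state, n_state)   # forecast surrogate -> x_b at time ct+1

loss_fn = nn.MSELoss()                                  # MSE loss
opt = torch.optim.Adam(list(mlp1.parameters()) + list(mlp2.parameters()),
                       lr=1e-3, weight_decay=1e-4)      # learning rate with weight decay (assumed values)

def train_link(inp1, target_xa, inp2_prev, target_xb):
    """One training step of a single link.

    inp1:      (batch, (n_xb+n_yo)*n_state)  stacked historical x_b and historical/future y_o
    inp2_prev: (batch, (n_xb+n_yo)*n_state)  previous training data for the second MLP
    """
    xa_ct = mlp1(inp1)                             # analysis field at time ct from the first MLP
    inp2 = torch.cat([inp2_prev, xa_ct], dim=1)    # superpose previous data with the new analysis
    xb_next = mlp2(inp2)                           # background field at time ct+1 from the second MLP
    loss = loss_fn(xa_ct, target_xa) + loss_fn(xb_next, target_xb)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In a full run, one such training step would be repeated for every link of the chain, with the newly produced background fields replacing the 3D-Var ones in the training data set, as described above.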
examples
1. Building a data set
In general, two classical nonlinear chaotic power systems are used for carrying out an observation system simulation experiment to check the effectiveness of the DA method. In the present invention, the proposed HDA-MLP method was validated using the Lorenz-63 and Lorenz-96 systems.
Assume that the Lorenz-63 system is perfect and there is no model error, so the model error covariance matrix Q = 0. Given an initial state $x_0 = 1.508870$, $y_0 = -1.537121$, $z_0 = 25.46091$, the Lorenz-63 model is integrated in time with a fourth-order Runge-Kutta scheme, with time step $d_t = 0.01$ and 111000 simulated integration steps. Then, without assimilation, the true field ($x_t$) is obtained by integrating the model equations (6). In a real atmosphere model, $x_t$ generally denotes the ground truth of the state variables. Errors randomly drawn from a Gaussian distribution are added to $x_t$ to obtain an approximate observation field ($y_o$). Furthermore, a linear observation operator h = I and an observation error covariance matrix R = I are used, where I is the third-order identity matrix.
Similarly, under the same ICs, a pure 3D-Var DA system completes 111000 analysis cycles, generating a background field ($x_b$) and an analysis field ($x_a$). To bring the Lorenz-63 system into a steady state and obtain an approximate simulated atmosphere, a cold start over the first 1000 integration time steps is needed; the data of the middle 100000 steps are used to prepare the HDA-MLP training data set, and the data of the last 10000 steps are the test data. FIG. 3 shows the structure of the data preparation.
Furthermore, to avoid chance and randomness in the experiments, the 3D-Var DA system is set to the same initial state, $x'_0 = 1.508870$, $y'_0 = -1.537121$, $z'_0 = 25.46091$, which is the 101001-th chaotic state obtained by 101000 integration steps from the initial state mentioned in the data preparation process. 10000 analysis cycles were then continued and repeated 50 times to prepare the test data set.
In the Lorenz-96 system, the steps and procedures for generating the simulation data are the same as in the Lorenz-63 system, except that the number of state variables increases from 3 in the Lorenz-63 system to 40 and the corresponding initial states are different.
2. Evaluation index
In order to measure the assimilation and prediction performance of the model, the effect of the novel DA method was evaluated using Root Mean Square Error (RMSE) and Goodness of Fit (Goodness of Fit).
RMSE denotes the sample standard deviation of the difference between the analysis field $x_a$ and the true value $x_t$, i.e., the square root of the ratio between the squared deviations of $x_a$ from $x_t$ and the number of observations N. RMSE is always non-negative, and a value of 0 indicates a perfect fit to the target data. When there is a deviation between $x_a$ and the target $x_t$, an analysis error occurs. The role of RMSE is to combine the errors at different times into a single metric with analytical power. In general, a lower RMSE is better than a higher one, so RMSE is widely used in the DA research literature to compare the performance of different DA methods: the lower the RMSE, the better the performance of the DA method; conversely, the larger the analysis error of the DA method. In the present invention, RMSE is defined as:
$$\mathrm{RMSE}_e = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_{a,i}^{\,e} - x_{t,i}^{\,e}\right)^{2}}$$
where i is the ith state variable and e is the e-th set member.
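A minimal NumPy sketch of this RMSE computation (array names are assumptions):

```python
import numpy as np

def rmse(x_a, x_t):
    """Root mean square error between the analysis field x_a and the truth x_t.

    Both arrays have shape (N, n_vars): N analysis times by n_vars state variables.
    """
    return np.sqrt(np.mean((np.asarray(x_a) - np.asarray(x_t)) ** 2))
```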
Goodness of fit refers to how well the regression line fits the observed values. The statistic measuring goodness of fit is the coefficient of determination $R^2$, whose maximum value is 1: the closer $R^2$ is to 1, the better the regression line fits the observed values; conversely, the smaller $R^2$, the worse the fit of the regression line to the observed values.
$$\mathrm{TSS} = \sum_{i}\left(x_{t,i} - \bar{x}_t\right)^{2}, \qquad \mathrm{RSS} = \sum_{i}\left(x_{t,i} - x_{a,i}\right)^{2}, \qquad R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}$$
where the total sum of squares (TSS) is the total sum of squares of the samples, and the residual sum of squares (RSS) is the sum of squared residuals, i.e., the sum of squared errors.
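The same quantities expressed as a small NumPy sketch (array names are assumptions):

```python
import numpy as np

def r_squared(x_a, x_t):
    """Coefficient of determination R^2 between the analysis x_a and the truth x_t."""
    x_a, x_t = np.asarray(x_a), np.asarray(x_t)
    rss = np.sum((x_t - x_a) ** 2)           # residual sum of squares
    tss = np.sum((x_t - x_t.mean()) ** 2)    # total sum of squares
    return 1.0 - rss / tss
```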
3. Selection of time step
From the DA cycle (FIG. 3) it can be seen that $x_a^t$ at time t is computed directly from the coupling of $x_b^t$ and $y_o^t$; in practice, however, $x_b^t$ is the short-term prediction obtained with $x_a^{t-1}$ as the IC of the prediction model, i.e., $x_a^t$ and $x_a^{t-1}$ are correlated, and $x_a^t$ depends on $x_a^{t-1}$ and $y_o^t$. In theory, these correlations also hold over longer periods of the DA cycle. The present invention therefore defines the problem as a time-series problem that relies on historical meteorological data, including observation fields and background fields. It is common to solve time-series problems with ML methods, but the time step (timestep) is an important parameter for the model to learn the temporal information. In general, the longer the time step, the more past and future information can be used for prediction; a key issue, however, is whether the accumulation of errors outweighs the accumulation of useful information. Therefore, preliminary experiments were made to select the optimal time step for the subsequent research. As shown in FIGS. 4 and 5, in both the Lorenz-63 and Lorenz-96 systems, timestep 6+11 is better than the others, where 6 denotes the historical $y_o$ and $x_b$ of the first 6 moments, which provide past information, and the 5 future $y_o$ are then suited to correct the current $x_a$. In total, 6 $x_b$ and 11 $y_o$ together form one training sample, the input data of the neural network used to obtain $x_a$. The construction principle of timestep 4+7 and 8+15 is the same as that of timestep 6+11, but compared with timestep 6+11 the former has less effective information and the latter accumulates more errors, both of which are disadvantageous for the subsequent research. The invention selects timestep 6+11 to construct the training data set of the proposed model.
4. Analysis of experiments
In this section, to prove the validity and effectiveness of the proposed model, the Lorenz-63 system, a simple low-dimensional chaotic system, is taken as the first example. HDA-MLP is used to optimize the analysis field of the traditional DA method 3D-Var, the results are compared with those of the existing pure 3D-Var and Cache-MLP methods, and it is analyzed whether the optimization goal can be achieved; regression curves are drawn and the goodness of fit is calculated to investigate the ability of each method to track the true field of the model and to evaluate the superiority of the proposed model.
4.1 analysis of the results of the experiments in the Lorenz-63 System
For HDA-MLP, first, FIG. 6 compares the 50 analytical fields of the Lorenz-63 system calculated by different DA methods, including pure 3D-Var, Cache-MLP and HDA-MLP. In the figure, the solid line marked with a circle represents pure 3D-Var; the Cache-MLP is represented by a solid line marked by a triangle; solid lines with square marks represent HDA-MLP. Table 1 reports the average RMSE of 50 experiments per method (bold indicates best results). By comparison, the minimum RMSE of 0.2105 can be achieved by HDA-MLP. Although the accuracy of 3D-Var is improved by 39.85% by Cache-MLP, the accuracy can be improved by 55.26% even by using HDA-MLP. This is possible because the HDA-MLP can use a short-term future training data set to extract more temporal features and information.
Table 1: comparison of the results of the experiment of FIG. 6
Method RMSE Improvement
Pure 3D-Var 0.4705 /
Cache-MLP 0.2830 39.85%
HDA-MLP 0.2049 56.44%
Furthermore, it was found that the improvement in the precision of the $x_b$ obtained by HDA-MLP is similar to that of $x_a$ (Table 2). This is because HDA-MLP not only uses an MLP to optimize the assimilation process but also uses another MLP to simulate the model prediction process. The MLP training data set used to simulate model prediction contains historical data for many moments and a more accurate analysis field. In the chained iterative training process, more temporal information can be mined and B can be updated implicitly, which helps to obtain a useful simulator. The experimental results show that the trained model can optimize model prediction quickly and efficiently (FIG. 7).
Table 2: comparison of the results of the experiment of FIG. 7
In addition, the best way to judge whether a DA method can track the true value of the model well is to draw the tracking trajectory. As shown in FIG. 8, a rough qualitative analysis shows that each DA method can track the true values of all three model state variables (x, y, z). The solid line is the model $x_t$, the dotted line is the $x_a$ of pure 3D-Var, the dashed line is the $x_a$ of Cache-MLP, and the dash-dot line is the $x_a$ of HDA-MLP. Table 3 shows that all methods achieve a goodness of fit above 0.99 for each state variable, quantitatively indicating that each method fits the true value of the model well (bold represents the best results). However, for HDA-MLP, $R^2$ is slightly larger for each state variable, which means the model performs better. Some partial sequences of the trajectories are enlarged to reflect the details of FIG. 8. As shown in FIG. 9, the "cold start" period is an important indicator of whether a DA method can bring the assimilation system into a steady state quickly. It can be seen that pure 3D-Var and Cache-MLP perform better than HDA-MLP here, because the former rely on a seemingly better static B, computed once from a large amount of meteorological data, whereas HDA-MLP starts with the static B but updates it implicitly to accommodate future assimilation and therefore needs a longer adaptation phase. Furthermore, in FIG. 10, all DA methods perform well in monotonically increasing or decreasing parts of the trajectory (FIG. 10 (a) and (f)), which is also verified by the $R^2$ values: in Table 4, $R^2$ is much higher than 0.99 (columns a and f). However, at some corners of the trajectory (FIG. 10 (b) and (e)), which represent abrupt changes of the state variables, pure 3D-Var cannot adapt, and Cache-MLP performs slightly better but still has obvious shortcomings. In contrast, HDA-MLP is able to capture the changes and obtains the highest $R^2$ in most cases (bold in Table 4). The results show that the proposed model can learn the changes of the state variables from the time-series training data set and is more robust than pure 3D-Var and numerical models, which are based on idealized assumptions and describe the changes of the state variables by integration.
Table 3: goodness of fit of various method state variables
Table 4: goodness of fit of partial trajectories for different methods
4.2 analysis of the results of the experiments in the Lorenz-96 System
For HDA-MLP, FIG. 11 shows a comparison of the results of 50 experiments among the various DA methods in the Lorenz-96 system (the legend is the same as for the Lorenz-63 system: pure 3D-Var is the solid line marked with circles, Cache-MLP the solid line marked with triangles, and HDA-MLP the solid line marked with squares). The model can improve the assimilation accuracy with a lower RMSE. However, the degree of improvement is relatively small, as shown in Table 5, which may be caused by the complexity of the Lorenz-96 system itself. As shown in FIG. 16, the Lorenz-96 model includes 40 state variables ($x_1, x_2, \ldots, x_{40}$) around the Earth and therefore has more spatial features, and more spatial dependencies between state variables, than the Lorenz-63 system. Here only training data collected over a longer time step are considered in order to obtain more temporal information, and no dedicated model is built to extract the spatial features between variables; this may be the reason why the proposed model improves the accuracy only slightly. As shown in FIG. 12, the proposed model uses temporal information to simulate and optimize the assimilation and model prediction processes of the Lorenz-96 system, which has proven to be effective. Similarly, HDA-MLP achieves a considerable improvement in the background field relative to 3D-Var, with an improvement of 28.61% (Table 6). A more accurate $x_b$ contributes greatly to optimizing the analysis field in the next assimilation. However, the improvement for the Lorenz-96 system is weaker than for the Lorenz-63 system because errors accumulate over more state variables.
Table 5: comparison of the results of the experiment of FIG. 11
Method RMSE Improvement
Pure 3D-Var 0.5283 /
Cache-MLP 0.4283 18.93%
HDA-MLP 0.4080 22.76%
Table 6: comparison of the results of the experiment of FIG. 12
Furthermore, the average RMSE reflects how well the various DA methods assimilate all state variables of the Lorenz-96 system, but $R^2$ can show the degree of fit of every state variable; therefore, the average $R^2$ and standard deviation (STD) of all 40 state variables were calculated and plotted in FIG. 13 (dots: pure 3D-Var; circles on a line: HDA-MLP; crosses: Cache-MLP). Intuitively, HDA-MLP achieves a better $R^2$: the $R^2$ of every state variable is close to 1, i.e., HDA-MLP fits every state variable better. Meanwhile, the STD of HDA-MLP is smaller than that of the other methods (except the static pure 3D-Var), which shows that the HDA-MLP results are more concentrated around the mean and the model is more stable. The proposed model is trained on big data containing a large number of state-variable changes, and the neural network can learn the complex relations and temporal features among the state variables, so HDA-MLP can adapt to more test data sets. In general, compared with pure 3D-Var and Cache-MLP, the proposed model achieves the highest $R^2$ while keeping the DA system stable, and can fit the state variables well.
5. Calculating the cost
DA is an iterative loop of assimilation and short-term prediction. Improving the quality of DA is important, but the speed of the DA process is also an essential factor determining the utility and effectiveness of a hybrid DA method. The ultimate goal of DA is to obtain the best ICs for NWP quickly and accurately. The computing device used in all experiments of the invention was configured with a GeForce RTX 2060 and an Intel Core i7. Table 7 below shows the computational cost of each hybrid method for the different models.
Table 7: computational cost of various hybrid methods to different models
The previous hybrid approaches Cache-MLP and backpacked-MLP take the least time because, after model training is complete, only a forward propagation is needed. However, they rely on the background field of pure 3D-Var, i.e., during testing it is necessary to wait for pure 3D-Var to complete the short-term prediction and provide the background field for further test experiments; this time is not included in Table 7. In contrast, the HDA-MLP used to optimize the 3D-Var results only requires the first ct background fields (a certain number of time steps) for the "cold start". It then obtains background fields through an iterative loop and uses these background fields to predict the analysis field of the next time step. In general, the novel model can yield higher-quality results with an acceptable loss of time for a given number of integration steps.
Inspired by the concept of four-dimensional variational data assimilation, the invention provides an innovative DA method named HDA-MLP. In the present invention, DA is defined as a time-series problem relying on historical meteorological data comprising the observation field and the first-guess field. Furthermore, as in 4D-Var, to obtain a more robust hybrid DA system, short-term prediction is added to HDA-MLP to correct the result at the current time. Guided by the temporal features of the state variables, HDA-MLP indeed optimizes the analysis field of the conventional DA system in two classical nonlinear dynamical models by implicitly updating the background error covariance. Specifically, the accuracy improves over 3D-Var by 56.44% for the Lorenz-63 system and by 22.76% for the Lorenz-96 system.
The principle of the invention is as follows:
1. three-dimensional variation data assimilation (3D-Var)
Based on Bayes' theorem, the variational approach has been the dominant operational approach to NWP since the 1990s. Typically, 3D-Var finds the analysis field by estimating a single initial state of the NWP model through minimization of a cost function L. The cost function L is expressed as:
$$L(x) = \frac{1}{2}\left(x - x_b\right)^{T} B^{-1}\left(x - x_b\right) + \frac{1}{2}\left(y_o - h(x)\right)^{T} R^{-1}\left(y_o - h(x)\right) \qquad (1)$$
where x is the state variable; $x_b$ is the prior knowledge of the state variable (also called the background field); $y_o$ is the known observed value; h is the observation operator, and h(x) computes the values at the observation points from the model variables at the grid points; B is the background error covariance matrix, derived from known laws of atmospheric dynamics and physics and describing the correlations of the background field $x_b$ at the model grid points; R is the observation error covariance matrix, which can be obtained from the instrument observation errors and describes the correlations between the observed values $y_o$ and the model grid points; the superscript T denotes matrix transposition.
The two terms of the cost function L are as follows: $L_b$ measures the distance between x and $x_b$, i.e., how well x fits $x_b$ given the background error covariance B; $L_o$ measures the distance between h(x) and $y_o$, i.e., how well x fits $y_o$ given the observation error covariance R, where h is the observation operator, which performs the necessary interpolation and converts the model variables into observed spatial field values.
The minimum of equation (1) is obtained with the nonlinear conjugate-gradient method, i.e., the gradient of L must vanish, as in equation (2):
$$\nabla L(x_a) = B^{-1}\left(x_a - x_b\right) - H^{T} R^{-1}\left(y_o - h(x_a)\right) = 0 \qquad (2)$$
where H is the tangent-linear operator of the observation operator h, defined as:
$$H = \frac{\partial h}{\partial x} \qquad (3)$$
According to equation (2), $x_a$ is the state that minimizes the function L and is the solution of equation (4):
$$\left[B^{-1} + H^{T} R^{-1} H\right] x_a = B^{-1} x_b + H^{T} R^{-1} y_o \qquad (4)$$
The optimal analysis field is then:
$$\begin{aligned} x_a &= \left[B^{-1} + H^{T} R^{-1} H\right]^{-1}\left[B^{-1} x_b + H^{T} R^{-1} y_o\right] \\ &= \left[B^{-1} + H^{T} R^{-1} H\right]^{-1}\left[B^{-1} x_b + H^{T} R^{-1} h(x_b) - H^{T} R^{-1} h(x_b) + H^{T} R^{-1} y_o\right] \qquad (5) \\ &= x_b + \left[B^{-1} + H^{T} R^{-1} H\right]^{-1} H^{T} R^{-1}\left(y_o - h(x_b)\right) \end{aligned}$$
where $h(x_b)$ performs the necessary interpolation and converts the model variable $x_b$ into the values of the corresponding observed spatial field.
Minimizing the cost function L can be regarded as an inverse problem. The construction and inversion of the error covariance matrices B and R are the difficult points of three-dimensional variational assimilation: inverting the high-dimensional matrices B and R is very hard and requires a large amount of computing resources and storage space. In practice, equation (5) is therefore not solved directly; instead, gradient information of the objective function is obtained through a suitable transformation, and an optimization method (such as the Newton, quasi-Newton, BFGS or L-BFGS method) is used to iteratively obtain an approximate solution of equation (5).
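For small systems such as Lorenz-63, where B and R can be inverted directly, equation (5) with a linear observation operator can be evaluated as in the following NumPy sketch (for high-dimensional systems, the iterative minimization described above would be used instead; the function and variable names are assumptions):

```python
import numpy as np

def analysis_3dvar(x_b, y_o, H, B, R):
    """x_a = x_b + (B^-1 + H^T R^-1 H)^-1 H^T R^-1 (y_o - H x_b), equation (5) with linear h."""
    Binv = np.linalg.inv(B)
    Rinv = np.linalg.inv(R)
    gain = np.linalg.inv(Binv + H.T @ Rinv @ H) @ H.T @ Rinv
    return x_b + gain @ (y_o - H @ x_b)
```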
2. Multilayer perceptron (MLP)
Based on a simulation of the human brain, F. Rosenblatt abstracted biological nerve cells in a simple way and designed the first computational neural network, the perceptron. The perceptron is a linear classifier: for a linearly separable data set, a suitable linear decision function can be fitted by the perceptron through learning. However, the expressive power of a single-neuron perceptron model is insufficient for problems more complex than simple linear regression. The multilayer perceptron has a fully connected structure, and every node except those in the input layer can be regarded as a processing unit with a nonlinear activation function. If every activation function were a linear function, an MLP with any number of layers would reduce to a perceptron. The MLP is a classical deep neural network and belongs to the class of universal approximators: given enough hidden units and an appropriate activation function, an MLP with at least one hidden layer can approximate any measurable function f(x).
Examples of the structure of the perceptron and of an MLP are shown in FIG. 14 and FIG. 15.
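A minimal NumPy sketch of the forward pass of an MLP of the kind shown in FIG. 15, with one hidden layer and ReLU activation; the weights below are random placeholders rather than trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer with ReLU followed by a linear output layer."""
    h = np.maximum(0.0, x @ W1 + b1)   # hidden units with nonlinear activation
    return h @ W2 + b2                 # linear output

x = rng.normal(size=(4, 3))            # 4 samples, 3 input features
W1, b1 = rng.normal(size=(3, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)
y_hat = mlp_forward(x, W1, b1, W2, b2)
```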
3. Classical nonlinear dynamic model
3.1 Lorenz-63 System-Low dimensional chaos model
The Lorenz-63 system is a simplified mathematical model of atmospheric convection. The nonlinear three-variable Lorenz model is simple to compute and exhibits strong nonlinear interactions between the variables. The state evolution of the Lorenz-63 system is given by the following three ordinary differential equations:
$$\dot{x} = \sigma\,(y - x), \qquad \dot{y} = x\,(\rho - z) - y, \qquad \dot{z} = x\,y - \beta\,z \qquad (6)$$
where the dot denotes the derivative with respect to time. Integrating the model with the typical parameter values given in the classical Lorenz paper, σ = 10, β = 8/3 and ρ = 28, yields the chaotic dynamical system.
The equation relates to the behavior of a two-dimensional fluid layer that heats up uniformly from the bottom up and cools down uniformly from the top down. In particular, the equation describes the rate of change of three quantities with time: x is proportional to the convection velocity, y is proportional to the horizontal temperature change, and z is proportional to the vertical temperature change.
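A minimal NumPy sketch of integrating equation (6) with a fourth-order Runge-Kutta scheme and the parameter values given above (σ = 10, β = 8/3, ρ = 28, time step 0.01); the number of spin-up steps shown is illustrative:

```python
import numpy as np

SIGMA, BETA, RHO = 10.0, 8.0 / 3.0, 28.0

def lorenz63(state):
    x, y, z = state
    return np.array([SIGMA * (y - x),
                     x * (RHO - z) - y,
                     x * y - BETA * z])

def rk4_step(f, state, dt=0.01):
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

state = np.array([1.508870, -1.537121, 25.46091])   # initial state from the text
trajectory = [state]
for _ in range(1000):                                # spin-up ("cold start") steps, illustrative count
    state = rk4_step(lorenz63, state)
    trajectory.append(state)
```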
3.2 Lorenz-96 System-high-dimensional chaos model
Lorenz-96 is a more complex nonlinear dynamical system than Lorenz-63. It is defined by equation (7), in which the first term on the right is an advection term and the second term is a damping term. Lorenz-96 is a periodic system that is commonly used to model the time evolution of atmospheric variables.
$$\frac{dx_j}{dt} = \left(x_{j+1} - x_{j-2}\right)x_{j-1} - x_j + F, \qquad j = 1, \ldots, J \qquad (7)$$
where J is the number of state variables and the indices are periodic. FIG. 16 shows the true states of the 40 state variables of the Lorenz-96 model at a given time for J = 40; the state variables are uniformly distributed along the Earth's equator, and F is an external forcing constant set to 8, a commonly used value that produces chaotic behaviour.
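A minimal NumPy sketch of the Lorenz-96 tendency in equation (7) with J = 40 and F = 8, integrated with the same fourth-order Runge-Kutta step (re-declared so the snippet stands alone); the initial perturbation is an assumption for illustration:

```python
import numpy as np

J, F = 40, 8.0

def lorenz96(x):
    # dx_j/dt = (x_{j+1} - x_{j-2}) * x_{j-1} - x_j + F, with periodic indices
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(f, x, dt=0.01):
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

x = F * np.ones(J)
x[0] += 0.01                      # small perturbation to trigger chaotic behaviour
for _ in range(1000):
    x = rk4_step(lorenz96, x)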
4. Problem definition
In operational systems, the short-term forecast is used as the first-guess field ($x_b$) and combined with the new observations ($y_o$) to obtain an analysis field ($x_a$); this is called a data assimilation analysis cycle (see FIG. 17). Here $y_o$ is obtained by inverting an empirical function applied to the real-time measurements of different devices. The prediction model makes a short-term prediction with the current $x_a$ as IC, giving a new $x_b$. Thus, a complete analysis cycle consists of assimilation and model prediction, which are repeated.
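The analysis cycle just described can be written schematically as the following NumPy sketch, where `forecast` and `assimilate` are assumed callables (for example, the short-term model integration and the 3D-Var update sketched earlier):

```python
import numpy as np

def analysis_cycle(x_a0, y_o_series, forecast, assimilate, n_cycles):
    """Generic DA analysis cycle: forecast -> assimilate -> forecast -> ...

    forecast(x_a)        -> x_b   (short-term prediction from the current analysis)
    assimilate(x_b, y_o) -> x_a   (combination of background and new observation)
    """
    x_a = x_a0
    analyses = []
    for t in range(n_cycles):
        x_b = forecast(x_a)                    # model prediction gives the new background
        x_a = assimilate(x_b, y_o_series[t])   # combine background with the new observation
        analyses.append(x_a)
    return np.array(analyses)
```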
In 3D-Var, the goal is to find the best weights for the two sources of information within a single time t, as in equation (8):
$$x_a^{t} = f\left(x_b^{t},\, y_o^{t}\right) \qquad (8)$$
Based on the first-order Markov assumption, traditional DA can be defined as a regression problem that relies only on variables at a single time. However, the current $x_a$ is subject to historical data at multiple times, which redefines DA as a time-series problem. The invention further takes into account that the future short-term prediction $x_b$ evolves from the previous $x_a$, so that, according to the dynamic nature of DA, the independently measured future short-term $y_o$ and the simultaneously corrected $x_b$ can be used to correct the current $x_a$. An obvious difficulty of the model is finding a way to integrate the historical $x_b$ and $y_o$ and the future $y_o$ to obtain a more accurate $x_a$ for the new HDA-MLP model. Thus, the assimilation process of a single DA method can be defined as a regression problem over multiple time-varying variables, and the model can be represented by equation (9).
$$x_a = f\left(\mathrm{historical}\left(x_b, y_o\right),\ \mathrm{future}\left(y_o\right)\right) \qquad (9)$$
Where f is the DA model learned from the training data set.
In addition, after $x_a$ is obtained, the next process of DA is short-term prediction by the prediction model. Those prediction models are mathematical partial differential equations of a semi-empirical nature, so prediction can also be regarded as a regression problem. However, solving partial differential equations with mathematical physics is difficult and time-consuming. To speed up the model prediction process, the universal approximator MLP is used to simulate and replace the prediction model.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (8)

1. An intelligent data assimilation method based on a variational assimilation framework, characterized by comprising the following steps:
S1, analyze with the first half of the data of the assimilation window and correct the analysis field at the middle of the assimilation time window $[t_1, t_n]$ with the second half of the data, determining the analysis field $x_a^{ct}$ at the current time by the formula
$$x_a^{ct} = \omega \, x_b^{[1,p]} + v \, y_o^{[0,q]} + \mathrm{bias}$$
wherein n is the length of the assimilation time window, n = 1, 2, 3, ...; ct is the middle time of the assimilation time window; [1, p] is the interval of the first half of the data, and [0, q] equals the interval of the time window $[t_1, t_n]$; ω and v are the weight matrices of x and y, respectively; bias is the intercept term; $x_b$ is the background field and $y_o$ is the known observed value;
S2, replace the assimilation process with the first multilayer perceptron and obtain $x_a^{MLP_1,ct}$ from the historical and future data, wherein $x_a^{MLP_1,ct}$ represents the analysis field at time ct obtained using the first MLP;
S3, superpose the previous training data set with $x_a^{MLP_1,ct}$ and train the next multilayer perceptron to obtain $x_b^{MLP_2,ct+1}$, which is equivalent to simulating a prediction model for short-term prediction during a data assimilation period, wherein $x_b^{MLP_2,ct+1}$ represents the background field at time ct+1 obtained using the second MLP;
s4, carrying out iterative training, repeating the processes of S2 and S3, and generating a new link in each link before ct
Figure FDA0003649071430000019
Before replacement
Figure FDA00036490714300000110
Form a new training data set until ct is ahead
Figure FDA00036490714300000111
Is completely replaced, and the background information in the following training data set is only composed of
Figure FDA00036490714300000112
Is provided, wherein
Figure FDA00036490714300000113
Represents the time before ct by the traditionIf ct is 6, then when t is the [7,12 ] is in the background field obtained by the 3D-Var method of data assimilation]When it is, then
Figure FDA00036490714300000114
Figure FDA00036490714300000115
Represents the background field obtained by the MLP method,
Figure FDA00036490714300000116
representing the background field at time t obtained using the second MLP.
2. The method of claim 1, wherein ct > 3.
3. The method of claim 1, wherein in S2 and S3 the first input node is a vector obtained by superimposing the background information of the moments before ct and the observed values of the moments before n; the second input is another vector, which contains the previous training data set and one analysis result.
4. The method of claim 1, wherein one output node of S2 and S3 is $x_a^{MLP_1,t}$ at time t and the other output node is $x_b^{MLP_2,t+1}$ at time t+1, wherein $x_a^{MLP_1,t}$ represents the analysis field at time t obtained using the first MLP.
5. The method of claim 1, wherein each multilayer perceptron in S2 and S3 has 3 hidden layers, and the number of neurons in each hidden layer is determined experimentally.
6. The method of claim 1 wherein the results of S2 and S3 are nonlinear with ReLU as the activation function.
7. The method of claim 1, wherein a learning rate with appropriate weight decay is defined in each multilayer perceptron in S2 and S3.
8. The method of claim 1 wherein the loss function in S2 and S3 is Mean Square Error (MSE).
CN202011420985.5A 2020-12-08 2020-12-08 Intelligent data assimilation method based on variational assimilation framework Active CN112464567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011420985.5A CN112464567B (en) Intelligent data assimilation method based on variational assimilation framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011420985.5A CN112464567B (en) Intelligent data assimilation method based on variational assimilation framework

Publications (2)

Publication Number Publication Date
CN112464567A CN112464567A (en) 2021-03-09
CN112464567B true CN112464567B (en) 2022-08-12

Family

ID=74801674

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011420985.5A Active CN112464567B (en) 2020-12-08 2020-12-08 Intelligent data assimilation method based on variational assimilation framework

Country Status (1)

Country Link
CN (1) CN112464567B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609790B (en) * 2021-10-11 2021-12-03 成都数联云算科技有限公司 Product virtual measuring method, system, device and medium
CN115630566B (en) * 2022-09-28 2024-05-07 中国人民解放军国防科技大学 Data assimilation method and system based on deep learning and dynamic constraint

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101471378B1 (en) * 2013-11-18 2014-12-10 재단법인 한국형수치예보모델개발사업단 Spectral transformation method for three-dimension variational data assimilation of numerical weather prediction model using cubed-sphere grid based on spectral element method and hardware device performing the same
WO2016086329A1 (en) * 2014-12-01 2016-06-09 哈尔滨工程大学 Actually-measured marine environment data assimilation method based on sequence recursive filtering three-dimensional variation
CN110211325A (en) * 2019-06-27 2019-09-06 上海同望信息技术有限公司 A kind of area road icing high precision monitor early warning system based on meteorological big data
CN111783361B (en) * 2020-07-07 2021-03-12 中国人民解放军国防科技大学 Numerical weather forecast mixed data assimilation method based on triple multi-layer perceptron

Also Published As

Publication number Publication date
CN112464567A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
CN110689179A (en) Water bloom prediction method based on space-time sequence mixed model
CN111461453B (en) Medium-and-long-term runoff ensemble forecasting method based on multi-model combination
CN112464567B (en) Intelligent data assimilation method based on variational assimilation framework
CN111783361B (en) Numerical weather forecast mixed data assimilation method based on triple multi-layer perceptron
CN110276441B (en) Trapezoidal overlapped kernel pulse estimation method based on deep learning
CN113269314B (en) New energy power generation scene data migration method based on generation countermeasure network
CN104091216A (en) Traffic information predication method based on fruit fly optimization least-squares support vector machine
CN112381282B (en) Photovoltaic power generation power prediction method based on width learning system
CN115935834A (en) History fitting method based on deep autoregressive network and continuous learning strategy
CN112818595A (en) Method and system for correcting digital twin model data of evaporation zone of thermal power plant
CN112989711B (en) Aureomycin fermentation process soft measurement modeling method based on semi-supervised ensemble learning
Huang et al. A data-driven method for hybrid data assimilation with multilayer perceptron
CN115270239A (en) Bridge reliability prediction method based on dynamic characteristics and intelligent algorithm response surface method
CN115952685B (en) Sewage treatment process soft measurement modeling method based on integrated deep learning
CN116502539B (en) VOCs gas concentration prediction method and system
CN116303786B (en) Block chain financial big data management system based on multidimensional data fusion algorithm
CN108876038A (en) Big data, artificial intelligence, the Optimization of Material Property method of supercomputer collaboration
CN110852415B (en) Vegetation index prediction method, system and equipment based on neural network algorithm
Springer et al. Robust parameter estimation of chaotic systems
Xiao et al. FengWu-4DVar: Coupling the Data-driven Weather Forecasting Model with 4D Variational Assimilation
CN115796327A (en) Wind power interval prediction method based on VMD (vertical vector decomposition) and IWOA-F-GRU (empirical mode decomposition) -based models
CN115018137A (en) Water environment model parameter calibration method based on reinforcement learning
CN115545159A (en) Average sea surface temperature forecasting method of deep neural network
CN114676887A (en) River water quality prediction method based on graph convolution STG-LSTM
Bakumenko et al. Synthesis method of robust neural network models of systems and processes

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant