CN112464567A - Intelligent data assimilation method based on variational and assimilative framework


Info

Publication number
CN112464567A
CN112464567A
Authority
CN
China
Prior art keywords
assimilation
time
mlp
data
model
Prior art date
Legal status
Granted
Application number
CN202011420985.5A
Other languages
Chinese (zh)
Other versions
CN112464567B (en)
Inventor
黄丽蓝
宋君强
任开军
李小勇
冷洪泽
王东紫
卢竞择
陈睿
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202011420985.5A priority Critical patent/CN112464567B/en
Publication of CN112464567A publication Critical patent/CN112464567A/en
Application granted granted Critical
Publication of CN112464567B publication Critical patent/CN112464567B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 - Computer-aided design [CAD]
    • G06F 30/20 - Design optimisation, verification or simulation
    • G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G06F 2119/00 - Details relating to the type or aim of the analysis or the optimisation
    • G06F 2119/12 - Timing analysis or timing optimisation
    • G06F 2119/14 - Force analysis or force optimisation, e.g. static or dynamic forces
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00 - Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses an intelligent data assimilation method based on a variational assimilation framework, comprising: S1, analyzing the data in the first half of the assimilation time window [t_1, t_n] and correcting the analysis field at the middle of the window with the data in the second half, thereby determining the analysis field x^a_{ct} at the current moment; S2, replacing the assimilation process with a first multilayer perceptron (MLP), which is trained on historical and future data to produce x^{a1}_{ct}; S3, superposing x^{a1}_{ct} on the previous training data set and training a second MLP to obtain x^{b2}_{ct+1}, which is equivalent to simulating the prediction model for short-term forecasting within the data assimilation period; S4, repeating S2 and S3, wherein each link before ct generates a new x^{a1}_t that replaces the previous x^b_t to form a new training data set, until the x^b_t before ct are completely replaced and the background information in the subsequent training data sets is provided only by x^{b2}_t.

Description

Intelligent data assimilation method based on variational and assimilative framework
Technical Field
The invention relates to the technical field of data assimilation, in particular to an intelligent data assimilation method based on a variational and assimilative framework.
Background
Numerical Weather Prediction (NWP) is the mainstream approach to weather forecasting and was defined by Bjerknes as an initial/boundary value problem. The future state of the atmosphere depends sensitively on its detailed Initial Conditions (ICs), certain Boundary Conditions (BCs) and a perfect forecasting model. The forecasting model is mostly a mathematical-physical model that approximately describes the laws of motion of the Earth system and is usually expressed by differential equations. Generally, given the ICs and BCs of a forecasting model, the numerical solution of the model represents the corresponding numerical weather forecast. With improvements in theory and computing power, forecasting models and BCs have become increasingly accurate, so the uncertainty of the ICs has become a major source of NWP forecast error. Data Assimilation (DA) integrates all available data resources, including observations and short-term forecasts (i.e. background fields, or initial guesses), to obtain an analysis field, which is expected to be the best estimate of the probability density of the actual atmospheric state. That is, the goal of DA is to produce the most accurate ICs for NWP. Therefore, establishing an optimal DA method is of great interest to researchers in the field of meteorology.
Based on Bayes' theorem, the variational assimilation method (Var) has been the predominant operational method for NWP since the 1990s. Basically, to handle uncertainties from different sources of information, the guiding strategy of the variational assimilation approach is to use new observations to correct the initial guess at the specified grid points, i.e. the background field and the observations are combined by minimizing a cost function that consists of two terms: one penalizes the distance to the background field, the other the distance to the observations. The concrete implementation directly assimilates a large amount of observation data by providing a static background error covariance matrix (B). The matrix B is computed once from climatological data and cannot represent the real spatio-temporal error statistics well, which means that it may introduce a large background error. Var is divided into 3D-Var, which uses data at a single time, and 4D-Var, which uses data over a time window of given length to estimate the initial state. Although in 4D-Var B implicitly evolves in a flow-dependent way and a better initial state than 3D-Var can be obtained, 4D-Var is computationally expensive and requires building tangent linear and adjoint numerical models, which are difficult to construct.
As a data-driven approach, Machine Learning (ML) is the study of computer algorithms that improve automatically through experience. In a computer system, experience mainly exists in the form of data, and ML builds mathematical models from sample data to make predictions and decisions. ML has achieved significant success in computer vision, speech recognition and control systems, as well as in related scientific fields such as physics, chemistry and biology. The development of ML has brought new opportunities for DA, and some researchers have attempted to explore suitable ML algorithms to solve local problems of traditional, single DA approaches. As one of the geoscience system problems, DA mainly comprises three stages, errors occur at each stage, and there is no general DA technique that can eliminate them ideally. The first stage of DA is the pre-processing and quality control of the observed data. To eliminate observation errors, researchers usually use an idealized function to invert indirect observations and use an empirical observation operator (matrix) to interpolate the observations onto the numerical grid, which introduces errors that cannot be ignored. ML can learn the spatio-temporal characteristics and correlations of the data, which is important for reducing the errors introduced by the observations. The second stage is objective analysis, integrating the observed data with the background information provided by the numerical model. Generally, the functions of the mathematical models that simulate the evolution of the Earth system based on physical theory employ empirical parameters and are semi-empirical in nature; therefore, these numerical models can be replaced with ML. ML methods have been shown to be more powerful and flexible than previous mechanistic or semi-empirical modeling methods. Furthermore, compared with the current sequential assimilation of DA, ML is parallelizable, which is very helpful for speeding up the assimilation process. The third stage is initialization: the result of the objective analysis needs to be properly adjusted to avoid the problem that subsequent integration cannot proceed because of noise generated by an inconsistent prediction model, so the output of DA depends on the uncertainty of the numerical model. Correcting the model error with ML methods based on large amounts of Earth-system data, with the aim of improving the quality of the DA result, has also attracted the attention of researchers.
The final goal of data assimilation is to obtain the optimal analysis field quickly and accurately and to provide the optimal initial field for the numerical weather prediction model. The existing variational assimilation method is based on Bayesian theory and has the advantages of directly assimilating a large amount of observation data with a simple basic framework. However, it involves storing and directly operating on the large matrix B, which places high demands on the storage and computing capacity of the computing device. Furthermore, because the variational assimilation method is based on a variety of ideal assumptions, the result it obtains, i.e. the analysis field, is generally not the best possible analysis field. In other words, the existing DA depends directly on the quality of the observation field and the background field: the former is measured by various observation devices and obtained through specific empirical inversion functions, and the latter is provided by the numerical model, so inevitable errors occur that need to be eliminated. In addition, based on the first-order Markov assumption and Bayes' theorem, the integration and assimilation processes of the numerical model are performed sequentially rather than in parallel, so one assimilation period of the traditional DA method is long and needs to be accelerated.
Disclosure of Invention
To build a general DA model, the historical y^o and x^b, together with the future y^o, are used to optimize the current x^a in the DA system; in addition, different models are used for simulation to accelerate the model prediction process. To solve the above technical problems in the prior art, the invention provides an intelligent data assimilation method based on a variational assimilation framework.
The technical scheme of the invention is as follows: an intelligent data assimilation method based on a variational assimilation framework comprises the following steps.
S1, analyzing the data in the first half of the assimilation window and correcting, with the data in the second half, the analysis field at the middle of the assimilation time window [t_1, t_n] by the formula
x^a_{ct} = ω x^b_{[1,p]} + v y^o_{[0,q]} + bias,
thereby determining the analysis field x^a_{ct} at the current moment, where n (n = 1, 2, 3, ...) is the length of the assimilation time window; ct is the middle moment of the assimilation time window; [1, p] is the index interval of the first-half data and [0, q] corresponds to the time window [t_1, t_n]; ω and v are the weight matrices of x and y respectively; bias is the intercept term; x^b is the background field and y^o is the known observation.
S2, replacing the assimilation process with a first multilayer perceptron (MLP), which is trained on historical and future data to produce x^{a1}_{ct}, where x^{a1}_{ct} denotes the analysis field at moment ct obtained with the first MLP.
S3, superposing x^{a1}_{ct} on the previous training data set and training the next multilayer perceptron to obtain x^{b2}_{ct+1}, which is equivalent to simulating the prediction model for short-term forecasting within the data assimilation period, where x^{b2}_{ct+1} denotes the background field at moment ct+1 obtained with the second MLP.
S4, carrying out iterative training by repeating the processes of S2 and S3: each link before ct generates a new x^{a1}_t that replaces the previous x^b_t to form a new training data set, until the x^b_t before ct are completely replaced and the background information in the subsequent training data sets is provided only by x^{b2}_t, where x^b_t denotes the background field obtained by the conventional data assimilation method 3D-Var before moment ct (for example, if ct = 6, then for t ∈ [7, 12] the background field is x^{b2}_t, obtained by the MLP method) and x^{b2}_t denotes the background field at time t obtained with the second MLP.
Preferably, ct > 3.
Preferably, in S2 and S3 the former input node is a vector obtained by superposing the background information before moment ct with the observations before moment n; the latter input node is another vector, which contains the previous training data set and one analysis result.
Preferably, one output node of S2 and S3 is x^{a1}_t at time t and the other output node is x^{b2}_{t+1} at time t+1, where x^{a1}_t denotes the analysis field at time t obtained with the first MLP.
Preferably, each multilayer perceptron in S2 and S3 has 3 hidden layers, and the number of neurons in each hidden layer is determined experimentally.
Preferably, in S2 and S3, ReLU is used as the activation function so that the result is non-linear.
Preferably, a learning rate with appropriate weight decay is defined in each multilayer perceptron in S2 and S3.
Preferably, the loss function in S2 and S3 is the mean square error (MSE).
Compared with the prior art, the invention has the following beneficial effects:
1. Compared with pure 3D-Var (three-dimensional variational data assimilation), the intelligent data assimilation method provided by the invention effectively improves the quality of the analysis field at low computational cost.
2. During the iterative training of the models, the continuously updated x^{b2}_t implicitly corrects the background error covariance, providing more accurate background information for the next analysis field.
3. Compared with the prior art, the accuracy is improved over 3D-Var by 56.44% for the Lorenz-63 system and by 22.76% for the Lorenz-96 system. Regarding computational cost, a major advantage of the invention is that better assimilation results are obtained within an acceptable execution time: compared with the empirical data assimilation method, the speed is improved by about 3 times and 80 times respectively.
Drawings
FIG. 1 is a block diagram of the preparation of an optimized partial data set;
FIG. 2 is a schematic diagram of HDA-MLP optimized 3D-Var;
FIG. 3 is a data processing block diagram;
FIG. 4 shows the RMSE of different models of the first part of HDA-MLP in the Lorenz-63 system at different time steps;
FIG. 5 shows the RMSE of different models of the first part of HDA-MLP in the Lorenz-96 system at different time steps;
FIG. 6 is a RMSE plot of 50 experiments in the Lorenz-63 system for various DA methods;
FIG. 7 is the RMSE plot of x^a and x^b in the Lorenz-63 system;
FIG. 8 is a graphical depiction of the trajectory of the state variables in Lorenz-63;
FIG. 9 is a trace schematic of "cold start" in an enlarged view;
FIG. 10 is an enlarged view of a portion of the trace;
FIG. 11 is a RMSE line graph of 50 experiments in a Lorenz-96 system for various DA methods;
FIG. 12 is the RMSE line graph of x^a and x^b in the Lorenz-96 system;
FIG. 13 shows the average R^2 and standard deviation of the state variables;
FIG. 14 is a schematic diagram of a perceptron structure;
FIG. 15 is a schematic diagram of an example of an MLP;
FIG. 16 is a schematic diagram of an example of the Lorenz-96 model (J = 40);
FIG. 17 is a schematic diagram of a data assimilation analysis cycle.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Experiments have shown that historical data can convey time signatures, which are critical to describing the flow dependence of atmospheric variables.
In addition, 4D-Var introduces an assimilation time window; its core idea is to use the data over a time interval [t_0, t_n] within a DA period rather than, as in 3D-Var, only the data at a particular time. It uses short-term y^o to correct the initial state at the beginning of the assimilation time window, similar to backward propagation. Based on the ideas of Cache-MLP and the assimilation time window, an HDA-MLP model is proposed to optimize the 3D-Var system.
The data preparation structure of the training data set for the optimization part is shown in FIG. 1. The assimilation time window is set as [t_1, t_n], i.e. the first half of the data is used for the analysis and the second half of the data is used to correct the analysis field at the middle of the time window [t_1, t_n]. The analysis field at the current moment, x^a_{ct}, is then determined and optimized by the following formula:
x^a_{ct} = ω x^b_{[1,p]} + v y^o_{[0,q]} + bias,
where n (n = 1, 2, 3, ...) is the length of the assimilation time window; ct is the middle moment of the assimilation time window; [1, p] is the index interval of the first-half data and [0, q] corresponds to the time window [t_1, t_n]; ω and v are the weight matrices of x and y respectively; bias is the intercept term; x^b is the background field and y^o is the known observation. In the present invention x^b is used with a consistent meaning throughout: the prior knowledge is equivalent to the background field and also to the initial guess field, and the different expressions merely suit different contexts; the same holds for y^o.
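Read as a learned affine combination, the formula above can be sketched as follows. This is a minimal illustration only; the array names, shapes and elementwise weighting are assumptions, not taken from the patent.

```python
import numpy as np

def analysis_at_ct(xb_first_half, yo_window, omega, v, bias):
    """Sketch of the S1 combination: the analysis at the window midpoint ct is an
    affine combination of the first-half background fields (indices 1..p) and all
    observations in the window (indices 0..q), weighted by omega and v plus an
    intercept term.  Assumed shapes: xb_first_half (p, d), yo_window (q+1, d),
    omega (p, d), v (q+1, d), bias (d,)."""
    return (omega * xb_first_half).sum(axis=0) + (v * yo_window).sum(axis=0) + bias
```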
As shown in FIG. 2 (assuming ct > 3), a chained MLP model is constructed to optimize the result of 3D-Var. Its working principle is explained as follows. In link 1, the first step is to replace the assimilation process with the first MLP, which is trained on historical and future data to produce x^{a1}_{ct}. The next step is to superpose x^{a1}_{ct} on the previous training data set and train the next MLP to obtain x^{b2}_{ct+1}; this is equivalent to simulating a prediction model for short-term forecasting within a DA period. Iterative training is then carried out, with the same steps as in link 1. It is worth noting that, in the links before ct, each link generates a new x^{a1}_t that replaces the previous x^b_t to form a new training data set, until the x^b_t before ct are completely replaced and the background information in the subsequent training data sets is provided only by x^{b2}_t. In other words, the MLP model becomes an independent, innovative assimilation system that no longer depends on 3D-Var data; 3D-Var only provides the first ct background fields x^b_t to "cold start" the proposed model. During the iterative training of the models, the continuously updated x^{b2}_t implicitly corrects the background error covariance, providing more accurate background information for the next analysis field.
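One link of this chain can be sketched as below. This is an assumed minimal training loop, not the patent's exact procedure: mlp_assim stands for the first MLP (assimilation) and mlp_fcst for the second MLP (short-term forecast); the tensor names, targets, optimizer choice, epoch count and learning-rate settings are illustrative.

```python
import torch
import torch.nn as nn

def train_link(mlp_assim, mlp_fcst, assim_inputs, assim_targets,
               fcst_inputs, fcst_targets, epochs=100, lr=1e-3, weight_decay=1e-5):
    """Sketch of one 'link': train the assimilation MLP against reference analyses,
    then feed its analysis into the training set of the forecast MLP, whose outputs
    would later replace the 3D-Var backgrounds before ct in the next link."""
    loss_fn = nn.MSELoss()
    opt_a = torch.optim.Adam(mlp_assim.parameters(), lr=lr, weight_decay=weight_decay)
    opt_f = torch.optim.Adam(mlp_fcst.parameters(), lr=lr, weight_decay=weight_decay)
    for _ in range(epochs):
        # Step 1: the first MLP replaces the assimilation process.
        opt_a.zero_grad()
        loss_a = loss_fn(mlp_assim(assim_inputs), assim_targets)
        loss_a.backward()
        opt_a.step()
        # Step 2: the second MLP simulates the short-term forecast; its input is the
        # previous training set superposed with the analysis just produced.
        with torch.no_grad():
            xa_ct = mlp_assim(assim_inputs)
        opt_f.zero_grad()
        loss_f = loss_fn(mlp_fcst(torch.cat([fcst_inputs, xa_ct], dim=1)), fcst_targets)
        loss_f.backward()
        opt_f.step()
    return mlp_assim, mlp_fcst
```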
The optimal configuration of HDA-MLP for optimizing 3D-Var is determined by empirical tests; the configurations obtained for x^{a1}_{ct} and x^{b2}_{ct+1} are similar because their training data sets are nearly identical in size. Here x^{a1}_{ct} denotes the analysis field at moment ct obtained with the first MLP; x^{b2}_{ct+1} denotes the background field at moment ct+1 obtained with the second MLP; x^b_t denotes the background field obtained by the conventional data assimilation method 3D-Var before moment ct (for example, if ct = 6, then for t ∈ [7, 12] the background field is x^{b2}_t, obtained by the MLP method and distinguished from the background field x^b_t of conventional 3D-Var); x^{b2}_t denotes the background field at time t obtained with the second MLP; and x^{a1}_t denotes the analysis field at time t obtained with the first MLP.
One link in the model has the following features (a minimal sketch of such an MLP is given after this list):
1) the former input node is a vector obtained by superposing the background information before moment ct with the observations before moment n; the latter input node is another vector, which contains the previous training data set and one analysis result;
2) one output node is x^{a1}_t at time t and the other output node is x^{b2}_{t+1} at time t+1;
3) each MLP has 3 hidden layers, and the number of neurons in each hidden layer is determined experimentally;
4) ReLU is used as the activation function to obtain a non-linear result;
5) a learning rate with appropriate weight decay is defined in each MLP;
6) the loss function is the mean square error (MSE).
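A minimal PyTorch sketch of one such MLP, reflecting features 3) to 6); the hidden-layer widths are placeholders, since the patent states they are tuned experimentally.

```python
import torch.nn as nn

class HDAMLP(nn.Module):
    """Three hidden layers with ReLU, intended to be trained with an MSE loss and
    an optimizer that applies weight decay (e.g. Adam(weight_decay=...))."""
    def __init__(self, in_dim, out_dim, hidden=(64, 64, 64)):
        super().__init__()
        h1, h2, h3 = hidden  # placeholder sizes; tuned by experiment in the patent
        self.net = nn.Sequential(
            nn.Linear(in_dim, h1), nn.ReLU(),
            nn.Linear(h1, h2), nn.ReLU(),
            nn.Linear(h2, h3), nn.ReLU(),
            nn.Linear(h3, out_dim),
        )

    def forward(self, x):
        return self.net(x)
```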
examples
1. Building a data set
In general, two classical nonlinear chaotic dynamical systems are used to carry out observing-system simulation experiments to check the effectiveness of a DA method. In the present invention, the proposed HDA-MLP method is validated using the Lorenz-63 and Lorenz-96 systems.
The Lorenz-63 system is assumed to be perfect with no model error, so the model error covariance matrix Q = 0. Given the initial state x_0 = 1.508870, y_0 = -1.537121, z_0 = 25.46091, the Lorenz-63 model is integrated in time with a fourth-order Runge-Kutta scheme using a time step dt = 0.01 for 111000 integration steps. Then, without assimilation, the model equations are integrated to obtain the true field (x^t); in a real atmospheric model, x^t generally denotes the ground truth of the state variables. Random errors obeying a Gaussian distribution are added to x^t to obtain an approximate observation field (y^o). Furthermore, a linear observation operator h = I and an observation error covariance matrix R = I are used, where I is the third-order identity matrix.
Similarly, under the same ICs, the pure 3D-Var DA system completes 111000 analysis cycles, generating the background field (x^b) and the analysis field (x^a). In order to reach the steady state of the Lorenz-63 system and obtain an approximately simulated atmosphere, a cold start over the first 1000 integration steps is needed; the data of the middle 100000 steps are used to prepare the training data set of HDA-MLP, and the data of the last 10000 steps are the test data. FIG. 3 shows the structure of the data preparation.
Furthermore, to avoid chance and randomness in the experiments, the same initial state x'_0 = 1.508870, y'_0 = -1.537121, z'_0 = 25.46091 is set for the 3D-Var DA system, which is the 101001-th chaotic state obtained by integrating 101000 steps from the initial state mentioned in the data preparation process. 10000 analysis cycles are then continued and repeated 50 times to prepare the test data set.
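The truth run and synthetic observations described above can be sketched as follows; the observation-noise standard deviation and the random seed are assumptions, since the patent only states that Gaussian errors are added with R = I.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # Right-hand side of the Lorenz-63 equations (see Eq. (6) in the principles section).
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    # One fourth-order Runge-Kutta step.
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

dt, n_steps = 0.01, 111000
xt = np.empty((n_steps + 1, 3))
xt[0] = np.array([1.508870, -1.537121, 25.46091])
for k in range(n_steps):                      # pure-Python loop; slow but explicit
    xt[k + 1] = rk4_step(lorenz63, xt[k], dt)

rng = np.random.default_rng(0)                       # seed is an arbitrary choice
yo = xt + rng.normal(scale=1.0, size=xt.shape)       # h = I, Gaussian errors, R = I
```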
In the Lorenz-96 system, the steps and procedures for generating the simulation data are the same as for the Lorenz-63 system, except that the number of state variables is increased from the 3 of the Lorenz-63 system to 40 and the corresponding initial states are different.
2. Evaluation index
In order to measure the assimilation and prediction performance of the model, the effect of the novel DA method was evaluated using Root Mean Square Error (RMSE) and Goodness of Fit (Goodness of Fit).
RMSE denotes the sample standard deviation of the difference between the analysis field x^a and the true value x^t, i.e. the square root of the ratio between the squared deviation of x^a from x^t and the number of observations N. RMSE is always non-negative, and a value of 0 indicates a perfect fit to the target data. When there is a deviation between x^a and the target x^t, an analysis error occurs. The role of RMSE is to combine the errors at different times into a single metric with analysis capability. Generally, a lower RMSE is better than a higher one, so RMSE is widely used in the DA research literature to compare the performance of different DA methods: the lower the RMSE, the better the performance of the DA method; conversely, the larger the analysis error of the DA method. In the present invention, RMSE is defined as:
RMSE_e = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (x^a_{i,e} - x^t_{i,e})^2 },
where i indexes the i-th state variable and e the e-th set member.
Goodness of fit refers to how well the regression line fits the observed values. The statistic measuring goodness of fit is the coefficient of determination R^2. The maximum value of R^2 is 1: the closer R^2 is to 1, the better the regression line fits the observed values; conversely, the smaller R^2, the worse the fit.
R^2 = 1 - \frac{RSS}{TSS},
TSS = \sum_{i} (y_i - \bar{y})^2,
RSS = \sum_{i} (y_i - \hat{y}_i)^2,
where the total sum of squares (TSS) is the total squared deviation of the samples and the residual sum of squares (RSS) is the sum of squared residuals, i.e. the sum of squared errors.
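These two metrics can be computed as in the following sketch; the function names are illustrative.

```python
import numpy as np

def rmse(xa, xt):
    """Root-mean-square error between analysis and truth (lower is better)."""
    xa, xt = np.asarray(xa), np.asarray(xt)
    return np.sqrt(np.mean((xa - xt) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - RSS/TSS (closer to 1 is better)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    rss = np.sum((y_true - y_pred) ** 2)
    tss = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - rss / tss
```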
3. Selection of time step
From the DA cycle (FIG. 3) it can be seen that x^a_t is computed directly from the coupling of x^b_t and y^o_t at time t. In practice, however, x^b_t is the short-term prediction obtained by taking x^a_{t-1} as the IC of the prediction model, i.e. x^b_t is correlated with x^a_{t-1}, so x^a_t depends on x^a_{t-1} and y^o_t. In theory, these correlations also hold over longer periods of the DA cycle. The invention therefore defines the problem as a time-series problem that relies on historical meteorological data, including the observation fields and the background fields. It is common to solve time-series problems with ML methods, but the time step (timestep) is an important parameter for the model to learn temporal information. In general, the longer the time step, the more past and future information can be used for prediction; the key question is whether the accumulated error becomes larger than the accumulated useful information. Therefore, preliminary experiments were made to select the optimal time step for the subsequent research. As shown in FIGS. 4 and 5, in both the Lorenz-63 and Lorenz-96 systems, timestep 6+11 is better than the others, where 6 represents the historical y^o and x^b of the preceding 6 moments, providing past information, and the 5 future y^o are then used to correct the current x^a. A total of 6 x^b and 11 y^o together form the training data, the input of the neural network used to obtain x^a. The timestep 4+7 and 8+15 configurations are constructed on the same principle as 6+11, but compared with 6+11 the former carries less useful information and the latter accumulates more error, which is unfavorable for the following research. The invention therefore selects timestep 6+11 to construct the training data set of the proposed model.
4. Analysis of experiments
In this section, to demonstrate the validity and effectiveness of the proposed model, the Lorenz-63 system, a simple low-dimensional chaotic system, is taken as an example first. HDA-MLP is used to optimize the analysis field of the traditional DA method 3D-Var; the results are compared with those of the existing pure 3D-Var and Cache-MLP methods to analyze whether the optimization goals can be achieved; regression curves are drawn and the goodness of fit is calculated; the ability of each method to track the true field of the model is investigated; and the superiority of the proposed model is evaluated.
4.1 analysis of the results of the experiments in the Lorenz-63 System
For HDA-MLP, first, FIG. 6 compares the 50 analytical fields of the Lorenz-63 system calculated by different DA methods, including pure 3D-Var, Cache-MLP and HDA-MLP. In the figure, the solid line marked with a circle represents pure 3D-Var; the Cache-MLP is represented by a solid line marked by a triangle; solid lines with square marks represent HDA-MLP. Table 1 reports the average RMSE of 50 experiments per method (bold indicates best results). By comparison, the minimum RMSE of 0.2105 can be achieved by HDA-MLP. Although Cache-MLP improved the accuracy of 3D-Var by 39.85%, it could even be 55.26% higher using HDA-MLP. This is possible because the HDA-MLP can use a short-term future training data set to extract more temporal features and information.
Table 1: comparison of the results of the experiment of FIG. 6
Method RMSE Improvement
Pure 3D-Var 0.4705 /
Cache-MLP 0.2830 39.85%
HDA-MLP 0.2049 56.44%
Furthermore, it was found that, under the same y^o, the precision improvement of the x^b obtained by HDA-MLP is similar to that of x^a (Table 2). This is because HDA-MLP not only uses one MLP to optimize the assimilation process but also uses another MLP to simulate the model prediction process. The MLP training data set used for simulating model prediction contains historical data at many moments together with a more accurate analysis field. In the chained iterative training process, more temporal information can be mined and B can be updated implicitly, which helps to obtain a useful simulator. The experimental results show that the trained model can optimize the model prediction quickly and efficiently (FIG. 7).
Table 2: comparison of the results of the experiment of FIG. 7
Figure BDA0002822358760000121
In addition, the best way to judge whether a DA method can track the true model values well is to plot the tracking trajectories. As shown in FIG. 8, a rough qualitative analysis shows that every DA method can track the true values of all three model state variables (x, y, z). In the figure, the solid line is the model truth x^t, the dotted line is the x^a of pure 3D-Var, the dashed line is the x^a of Cache-MLP, and the dash-dotted line is the x^a of HDA-MLP. Table 3 shows that all methods achieve a goodness of fit better than 0.99 on every state variable, quantitatively indicating that each method fits the true values of the model well (bold represents the best results). However, the R^2 of HDA-MLP is slightly larger for every state variable, which means the model performance is better. Some partial sequences of the trajectories are enlarged to reflect the details of FIG. 8. As shown in FIG. 9, the "cold start" period is an important indicator of whether a DA method can bring the assimilation system into a steady state quickly. It can be seen that pure 3D-Var and Cache-MLP perform better than HDA-MLP here, because the former rely on a seemingly better static B, computed once from a large amount of meteorological data, while HDA-MLP starts from the static B but updates it implicitly to accommodate future assimilation and therefore requires a longer adaptation phase. Furthermore, in FIG. 10, all DA methods perform well on the monotonically increasing or decreasing trajectory portions (FIG. 10 (a) and FIG. 10 (f)), which is also verified by the R^2 values: in Table 4, R^2 is much higher than 0.99 (columns a and f). However, at some corners of the trajectories (FIG. 10 (b) and (e)), which represent abrupt changes of the state variables, pure 3D-Var cannot adapt, and Cache-MLP performs slightly better but still with obvious defects. In contrast, HDA-MLP is able to capture the changes and obtains the highest R^2 in most cases (bold in Table 4). The results show that the proposed model can learn the changes of the state variables from the time-series training data set and is more robust than pure 3D-Var and the numerical model, which are based on ideal assumptions and describe the changes of the state variables by integration.
Table 3: goodness of fit of various method state variables
Figure BDA0002822358760000131
Table 4: goodness of fit of partial trajectories for different methods
Figure BDA0002822358760000141
4.2 analysis of the results of the experiments in the Lorenz-96 System
For HDA-MLP, FIG. 11 shows a comparison of the results of 50 experiments between the various DA methods in the Lorenz-96 system (the notation is the same as for the Lorenz-63 system: pure 3D-Var is represented by the solid line marked with circles, Cache-MLP by the solid line marked with triangles, and HDA-MLP by the solid line marked with squares). The proposed model can improve the assimilation accuracy with a lower RMSE. However, the degree of performance improvement is relatively small, as shown in Table 5, which may be caused by the complexity of the Lorenz-96 system itself. As shown in FIG. 16, the Lorenz-96 model includes 40 state variables (x_1, x_2, ..., x_40) around the Earth, so it has more spatial features and therefore more spatial dependencies between state variables than the Lorenz-63 system. Only training data collected over a longer time step are considered in order to obtain more temporal information, while the construction of a better model for extracting the spatial features between variables is omitted; this may be the reason why the proposed model only slightly improves the accuracy. As shown in FIG. 12, the proposed model uses temporal information to simulate and optimize the assimilation and model prediction processes of the Lorenz-96 system, which has proven to be effective. Moreover, HDA-MLP achieves a considerable improvement of 28.61% in the background field of 3D-Var (Table 6); a more accurate x^b contributes greatly to optimizing the analysis field in the next assimilation. However, the improvement in the Lorenz-96 system is weaker than in the Lorenz-63 system owing to the accumulation of errors over more state variables.
Table 5: comparison of the results of the experiment of FIG. 11
Method RMSE Improvement
Pure 3D-Var 0.5283 /
Cache-MLP 0.4283 18.93%
HDA-MLP 0.4080 22.76%
Table 6: comparison of the results of the experiment of FIG. 12
Figure BDA0002822358760000151
Furthermore, the average RMSE reflects the assimilation ability of the various DA methods over all state variables in the Lorenz-96 system, whereas R^2 shows the degree of fit of each state variable. The average R^2 and the standard deviation (STD) over all 40 state variables are therefore calculated and plotted in FIG. 13 (dots: pure 3D-Var; circles on a solid line: HDA-MLP; crosses: Cache-MLP). Intuitively, HDA-MLP obtains a better R^2, and the R^2 of every state variable is close to 1; that is, HDA-MLP fits each state variable better. Meanwhile, the STD of HDA-MLP is smaller than that of the other methods (except the static pure 3D-Var), which shows that the results of HDA-MLP are more concentrated around the mean and the model is more stable. The proposed model is trained on big data containing a large number of state-variable changes, and the neural network can learn the complex relations and temporal characteristics among the state variables, so HDA-MLP can adapt to more test data sets. In general, compared with pure 3D-Var and Cache-MLP, the proposed model achieves the highest R^2 while keeping the DA system stable and can fit the state variables well.
5. Calculating the cost
DA is an iterative loop of assimilation and short-term prediction. Improving the quality of DA is important, but the speed of the DA process is also an essential factor in determining the utility and effectiveness of a hybrid DA method; the ultimate goal of DA is to obtain the best ICs for NWP quickly and accurately. The computing device used in all experiments of the invention was configured with a GeForce RTX 2060 and an Intel Core i7. Table 7 below shows the computational cost of each hybrid method under the different models.
Table 7: computational cost of various hybrid methods to different models
Figure BDA0002822358760000161
The previous hybrid approaches, Cache-MLP and backpacked-MLP, take the least time because after model training only a forward propagation is performed. However, they rely on the background field of pure 3D-Var, i.e. during testing it is necessary to wait for pure 3D-Var to complete the short-term prediction and supply the background field before further test experiments can proceed; this waiting time is not included in Table 7. In contrast, the HDA-MLP used to optimize the 3D-Var results only requires the first ct background fields (a certain number of time steps) for the "cold start". It then obtains background fields through its iterative loop and uses them to predict the analysis field of the next time step. In general, the proposed novel model can yield higher-quality results with an acceptable loss of time over a given number of integration steps.
Inspired by the concept of four-dimensional variational data assimilation, the invention provides an innovative DA method named HDA-MLP. In the present invention, the DA can be defined as a time series problem, relying on historical weather data comprising the observed field and the first guess field. Furthermore, as with 4D-Var, to obtain a more robust hybrid DA system, a short term prediction is added to the HDA-MLP to correct the results at the current time. Under the guidance of the time characteristics of the state variables, HDA-MLP does optimize the analysis field of the conventional DA system in two classical nonlinear dynamical models by implicitly updating the background error covariance. Specifically, for the Lorenz-63 system, the 3D-Var increased by 56.44%; for the Lorenz-96 system, the 3D-Var increased by 22.76%.
The principle of the invention is as follows:
1. three-dimensional variation data assimilation (3D-Var)
Based on Bayes' theorem, the variational approach has been the dominant operational approach to NWP since the 1990s. Typically, 3D-Var finds the analysis field by estimating a single initial state of the NWP model through minimization of a cost function L. The cost function L is expressed as:
L(x) = (1/2)(x - x^b)^T B^{-1} (x - x^b) + (1/2)(y^o - h(x))^T R^{-1} (y^o - h(x))    (1)
where x represents the state variable; x^b represents the prior knowledge of the state variable (also called the background field); y^o is the known observation; h is the observation operator, and h(x) computes the values at the observation points from the model variables at the grid points; B is the background error covariance matrix, which is derived from the known laws of atmospheric dynamics and physics and describes the correlation of the background field x^b across the model grid points; R is the observation error covariance matrix, which can be obtained from the instrument observation errors and describes the correlation between the observations y^o and the model grid points; and T denotes the matrix transpose.
The two terms of the cost function L are as follows:
L_b measures the distance between x and x^b, i.e. how well x fits x^b given the background error covariance B;
L_o measures the distance between x and y^o, i.e. how well x fits y^o given the observation error covariance R, where h is the observation operator that performs the necessary interpolation and converts the model variables into observation-space values.
The minimum of formula (1), i.e. the point at which the gradient of L vanishes as in formula (2), is obtained by the nonlinear conjugate gradient method:
∇L(x) = B^{-1}(x - x^b) - H^T R^{-1}(y^o - h(x)) = 0    (2)
where H is the tangent linear operator of the observation operator h, defined as:
H = ∂h/∂x    (3)
According to formula (2), x^a is the state that minimizes the function L and is solved from equation (4):
[B^{-1} + H^T R^{-1} H] x^a = B^{-1} x^b + H^T R^{-1} y^o    (4)
the optimal analysis field is then:
x^a = [B^{-1} + H^T R^{-1} H]^{-1} [B^{-1} x^b + H^T R^{-1} y^o]
    = [B^{-1} + H^T R^{-1} H]^{-1} [B^{-1} x^b + H^T R^{-1} h(x^b) - H^T R^{-1} h(x^b) + H^T R^{-1} y^o]    (5)
    = x^b + [B^{-1} + H^T R^{-1} H]^{-1} H^T R^{-1} (y^o - h(x^b))
where h(x^b) performs the necessary interpolation and converts the model variable x^b into the values of the corresponding observation-space field.
Minimizing the cost function L can be regarded as an inverse problem. Constructing and inverting the error covariance matrices B and R is one of the difficulties of three-dimensional variational assimilation: inverting the high-dimensional matrices B and R is very demanding in computing resources and storage space. When this problem is actually solved, equation (5) is therefore not evaluated directly; instead, gradient information of the objective function is obtained through an appropriate transformation, and an optimization method (such as the Newton, quasi-Newton, BFGS or L-BFGS method) is then used to iteratively approximate the solution of equation (5). A minimal sketch of such an iterative solution is given below.
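The following sketch minimizes the cost (1) with its gradient (2) using SciPy's L-BFGS-B routine. It assumes small dense matrices (so B and R can be inverted explicitly) and a caller-supplied tangent linear matrix H, which is exact only for a linear observation operator.

```python
import numpy as np
from scipy.optimize import minimize

def threedvar_analysis(xb, yo, B, R, h, H):
    """Iteratively approximate the 3D-Var analysis instead of evaluating Eq. (5)
    directly: minimize Eq. (1) with the analytic gradient Eq. (2)."""
    B_inv, R_inv = np.linalg.inv(B), np.linalg.inv(R)

    def cost(x):
        dxb, dy = x - xb, yo - h(x)
        return 0.5 * dxb @ B_inv @ dxb + 0.5 * dy @ R_inv @ dy

    def grad(x):
        return B_inv @ (x - xb) - H.T @ R_inv @ (yo - h(x))

    res = minimize(cost, xb, jac=grad, method="L-BFGS-B")
    return res.x

# Identity operators, as in the Lorenz-63 twin experiment (h = I, R = I).
I3 = np.eye(3)
xa = threedvar_analysis(np.zeros(3), np.ones(3), I3, I3, lambda x: x, I3)
```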
2. Multilayer perceptron (MLP)
Inspired by the human brain, F. Rosenblatt made a simple abstraction of biological nerve cells and designed the first computer neural network, the perceptron. The perceptron is a linear classifier: for a linearly separable data set, a good linear decision function can be fitted by the perceptron through learning. However, the expressive capability of a perceptron model with a single neuron is not sufficient to solve problems more complex than simple linear regression. The multilayer perceptron has a fully connected structure, and every node except those of the input layer can be regarded as a processing unit with a non-linear activation function; if every activation function were linear, the MLP in any layer would reduce to a perceptron. In other words, the MLP is a classical deep neural network and a universal class of approximators: given enough hidden units and an appropriate activation function, an MLP with at least one hidden layer can approximate any measurable function f(x).
An example of the construction of the sensor and MLP is shown in fig. 14 and 15.
3. Classical nonlinear dynamic model
3.1 Lorenz-63 System-Low dimensional chaos model
The Lorenz-63 system is a simplified mathematical model of atmospheric convection. The nonlinear Lorenz three-variable model system has the advantages of simple calculation and strong nonlinear interaction between variables. The state evolution in the Lorenz-63 system can be given by the following three general differential equations:
ẋ = σ (y - x),    ẏ = x (ρ - z) - y,    ż = x y - β z    (6)
where the dots denote derivatives with respect to time. Integrating the model with the typical parameter values described in the classical Lorenz paper, σ = 10, β = 8/3 and ρ = 28, yields the chaotic dynamical system.
The equation relates to the behavior of a two-dimensional fluid layer that heats up uniformly from the bottom up and cools down uniformly from the top down. In particular, the equation describes the rate of change of three quantities with time: x is proportional to the convection velocity, y is proportional to the horizontal temperature change, and z is proportional to the vertical temperature change.
3.2 Lorenz-96 System-high-dimensional chaos model
Lorenz-96 is a more complex nonlinear dynamical system than Lorenz-63. Its governing function can be defined as equation (7), in which the first term on the right is the advection term and the second term is the damping term. Lorenz-96 is a periodic system that is commonly used to model the time evolution of atmospheric variables.
dx_j/dt = (x_{j+1} - x_{j-2}) x_{j-1} - x_j + F,    j = 1, ..., J    (7)
where J is the number of state variables, which are assumed to be uniformly distributed along the Earth's equator, and F is an external forcing constant set to 8, a commonly used value that produces chaotic behavior. FIG. 16 shows the connected true states of the 40 state variables of the Lorenz-96 model at a given time for J = 40.
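A sketch of the right-hand side of equation (7) with cyclic indexing; the 0.01 perturbation used to start the trajectory is an arbitrary choice for illustration.

```python
import numpy as np

def lorenz96(x, F=8.0):
    """Right-hand side of Eq. (7) for J state variables with periodic indexing."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

x0 = 8.0 * np.ones(40)   # J = 40, rest state x_j = F
x0[0] += 0.01            # small perturbation to trigger chaotic evolution
dxdt = lorenz96(x0)
```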
4. Problem definition
In operational systems, the short-term forecast is used as the initial guess field (x^b) and combined with new observations (y^o) to obtain the analysis field (x^a); this is referred to as the data assimilation analysis cycle (see FIG. 17). Here y^o is obtained by inverting, with empirical functions, the real-time measurements of different instruments, and the prediction model takes the current x^a as the IC for a short-term forecast, giving a new x^b. A complete analysis cycle thus consists of assimilation and model prediction, repeated in turn.
In 3D-Var, the goal is to find the best weights for two sources within a single time t, as in equation (8).
x^a_t = f(x^b_t, y^o_t)    (8)
Based on the first-order Markov assumption, traditional DA can be defined as a regression problem that relies only on the variables at a single time. However, the current x^a is subject to historical data at multiple moments, which redefines DA as a time-series problem. The invention further considers that the future short-term prediction x^b evolves from the previous x^a, so that, according to the dynamic property of DA, the independently measurable future short-term y^o, together with the simultaneously corrected x^b, can be used to correct the current x^a. An obvious difficulty for the model is finding a way to integrate the historical x^b and y^o with the future y^o so that the new HDA-MLP model obtains a more accurate x^a. Thus, the assimilation process in a single DA method can be defined as a regression problem over multiple time-varying variables, and the model can be represented by equation (9).
x^a = f(historical(x^b, y^o), future(y^o))    (9)
Where f is the DA model learned from the training data set.
In addition, after x^a is obtained, the next step of DA is short-term prediction by the prediction model. These prediction models are known to be semi-empirical mathematical partial differential equations and can also be regarded as a regression problem. However, solving partial differential equations with mathematical-physics methods is difficult and time-consuming. In order to speed up the model prediction process, the universal approximator MLP is used to simulate and replace the prediction model.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (8)

1. An intelligent data assimilation method based on a variational assimilation framework, characterized by comprising the following steps:
S1, analyzing the data in the first half of the assimilation window and correcting, with the data in the second half, the analysis field at the middle of the assimilation time window [t_1, t_n] by the formula
x^a_{ct} = ω x^b_{[1,p]} + v y^o_{[0,q]} + bias,
thereby determining the analysis field x^a_{ct} at the current moment, where q = n - 1; n (n = 1, 2, 3, ...) is the length of the assimilation time window; ct is the middle moment of the assimilation time window; [1, p] is the index interval of the first-half data and [0, q] corresponds to the time window [t_1, t_n]; ω and v are the weight matrices of x and y respectively; bias is the intercept term; x^b is the background field and y^o is the known observation;
S2, replacing the assimilation process with a first multilayer perceptron, which is trained on historical and future data to produce x^{a1}_{ct}, where x^{a1}_{ct} represents the analysis field at moment ct obtained using the first MLP;
S3, superposing x^{a1}_{ct} on the previous training data set and training the next multilayer perceptron to obtain x^{b2}_{ct+1}, which is equivalent to simulating the prediction model for short-term prediction within the data assimilation period, where x^{b2}_{ct+1} represents the background field at moment ct+1 obtained using the second MLP;
S4, carrying out iterative training by repeating the processes of S2 and S3, wherein each link before ct generates a new x^{a1}_t that replaces the previous x^b_t to form a new training data set, until the x^b_t before ct are completely replaced and the background information in the subsequent training data sets is provided only by x^{b2}_t, where x^b_t represents the background field obtained by the conventional data assimilation method 3D-Var before moment ct (for example, if ct = 6, then for t ∈ [7, 12] the background field x^{b2}_t is obtained by the MLP method), and x^{b2}_t represents the background field at time t obtained using the second MLP.
2. The method of claim 1, wherein ct > 3.
3. The method of claim 1, wherein the former input node of S2 and S3 is a vector obtained by superposing the background information before moment ct with the observations before moment n; the latter input node is another vector, which contains the previous training data set and one analysis result.
4. The method of claim 1, wherein one output node of S2 and S3 is x^{a1}_t at time t and the other output node is x^{b2}_{t+1} at time t+1, where x^{a1}_t represents the analysis field at time t obtained using the first MLP.
5. The method of claim 1, wherein each multilayer perceptron in S2 and S3 has 3 hidden layers, and the number of neurons in each hidden layer is obtained through experiments.
6. The method of claim 1 wherein the results of S2 and S3 are nonlinear with ReLU as the activation function.
7. The method of claim 1, wherein a learning rate with appropriate weight decay is defined in each multilayer perceptron in S2 and S3.
8. The method of claim 1 wherein the loss function in S2 and S3 is Mean Square Error (MSE).
CN202011420985.5A 2020-12-08 2020-12-08 Intelligent data assimilation method based on variational and assimilative framework Active CN112464567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011420985.5A CN112464567B (en) 2020-12-08 2020-12-08 Intelligent data assimilation method based on variational and assimilative framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011420985.5A CN112464567B (en) 2020-12-08 2020-12-08 Intelligent data assimilation method based on variational and assimilative framework

Publications (2)

Publication Number Publication Date
CN112464567A true CN112464567A (en) 2021-03-09
CN112464567B CN112464567B (en) 2022-08-12

Family

ID=74801674

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011420985.5A Active CN112464567B (en) 2020-12-08 2020-12-08 Intelligent data assimilation method based on variational and assimilative framework

Country Status (1)

Country Link
CN (1) CN112464567B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101471378B1 (en) * 2013-11-18 2014-12-10 재단법인 한국형수치예보모델개발사업단 Spectral transformation method for three-dimension variational data assimilation of numerical weather prediction model using cubed-sphere grid based on spectral element method and hardware device performing the same
US20170338802A1 (en) * 2014-12-01 2017-11-23 Harbin Engineering University Actually-measured marine environment data assimilation method based on sequence recursive filtering three-dimensional variation
CN110211325A (en) * 2019-06-27 2019-09-06 上海同望信息技术有限公司 A kind of area road icing high precision monitor early warning system based on meteorological big data
CN111783361A (en) * 2020-07-07 2020-10-16 中国人民解放军国防科技大学 Numerical weather forecast mixed data assimilation method based on triple multi-layer perceptron

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609790A (en) * 2021-10-11 2021-11-05 成都数联云算科技有限公司 Product virtual measuring method, system, device and medium
CN113609790B (en) * 2021-10-11 2021-12-03 成都数联云算科技有限公司 Product virtual measuring method, system, device and medium
CN115630566A (en) * 2022-09-28 2023-01-20 中国人民解放军国防科技大学 Data assimilation method and system based on deep learning and dynamic constraint

Also Published As

Publication number Publication date
CN112464567B (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN109492822B (en) Air pollutant concentration time-space domain correlation prediction method
CN105205313B (en) Fuzzy Gaussian sum particle filtering method and device and target tracking method and device
CN111461453B (en) Medium-and-long-term runoff ensemble forecasting method based on multi-model combination
CN110689179A (en) Water bloom prediction method based on space-time sequence mixed model
CN111783361B (en) Numerical weather forecast mixed data assimilation method based on triple multi-layer perceptron
CN112464567B (en) Intelligent data assimilation method based on variational and assimilative framework
CN109635245A (en) A kind of robust width learning system
CN113269314B (en) New energy power generation scene data migration method based on generation countermeasure network
CN115935834A (en) History fitting method based on deep autoregressive network and continuous learning strategy
Huang et al. A data-driven method for hybrid data assimilation with multilayer perceptron
CN116187835A (en) Data-driven-based method and system for estimating theoretical line loss interval of transformer area
CN112989711B (en) Aureomycin fermentation process soft measurement modeling method based on semi-supervised ensemble learning
CN112381282B (en) Photovoltaic power generation power prediction method based on width learning system
CN115952685B (en) Sewage treatment process soft measurement modeling method based on integrated deep learning
CN116502539B (en) VOCs gas concentration prediction method and system
CN108876038A (en) Big data, artificial intelligence, the Optimization of Material Property method of supercomputer collaboration
Springer et al. Robust parameter estimation of chaotic systems
CN110852415B (en) Vegetation index prediction method, system and equipment based on neural network algorithm
CN115796327A (en) Wind power interval prediction method based on VMD (vertical vector decomposition) and IWOA-F-GRU (empirical mode decomposition) -based models
CN115018137A (en) Water environment model parameter calibration method based on reinforcement learning
CN115545159A (en) Average sea surface temperature forecasting method of deep neural network
CN112581311B (en) Method and system for predicting long-term output fluctuation characteristics of aggregated multiple wind power plants
CN115270239A (en) Bridge reliability prediction method based on dynamic characteristics and intelligent algorithm response surface method
Silva et al. GAN for time series prediction, data assimilation and uncertainty quantification
Xiao et al. FengWu-4DVar: Coupling the Data-driven Weather Forecasting Model with 4D Variational Assimilation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant