CN111414991B - Meteorological frontal surface automatic identification method based on multiple regression - Google Patents

Info

Publication number
CN111414991B
Authority
CN
China
Prior art keywords
temperature
multiple regression
training
gradient
network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010106401.0A
Other languages
Chinese (zh)
Other versions
CN111414991A (en)
Inventor
李骞
丁新亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN202010106401.0A
Publication of CN111414991A
Application granted
Publication of CN111414991B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01W METEOROLOGY
    • G01W1/00 Meteorology
    • G01W1/10 Devices for predicting weather conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention provides a method for automatically identifying meteorological fronts based on multiple regression, comprising the following steps: determining the constant temperature zone; removing the constant temperature zone; after the constant temperature zone has been removed from the meteorological element data set, calculating each diagnostic quantity and generating a training sample set; training the coefficients of a multiple regression equation (the weights of the diagnostic quantities when identifying fronts) using a multiple regression neural network (MRNN) model and the training sample set; and substituting the meteorological element data set and the trained coefficients into the multiple regression model to automatically identify, online, the probability of an atmospheric front at each grid point.

Description

Meteorological frontal surface automatic identification method based on multiple regression
Technical Field
The invention belongs to the technical field of surface meteorological observation in atmospheric sounding, and in particular relates to a method for automatically identifying meteorological frontal surfaces based on multiple regression.
Background
Atmospheric science defines a frontal surface as a narrow, sloping zone in which the thermodynamic and wind fields change markedly. More specifically, it is the sloping interface or transition region, tilted toward the cold side, that forms where two air masses with different thermal properties meet in a region of converging airflow. The line along which the frontal surface intersects the ground is called the front. The transition zone is a fairly irregular inclined surface, with the warm air mass above and the cold air mass below: because cold air is denser than warm air, the interface between them is inclined.
Atmospheric baroclinicity is strong near a frontal surface, which favors the development of vertical circulation and energy conversion, so severe weather changes and the formation and development of pressure systems usually occur near frontal surfaces. For example, after a cold front passes, the cold air mass occupies the position previously held by the warm air mass, the temperature falls, the pressure rises, and the weather mostly clears; after a warm front passes, the warm air mass occupies the position previously held by the cold air mass, the temperature rises, the pressure falls, and the weather turns cloudy and rainy. The area near a frontal system is therefore often accompanied by heavy precipitation. Locating and analyzing frontal surfaces helps forecasters grasp the position, intensity and evolution of a frontal system and so predict more accurately how the frontal system and the weather will change, which is of great significance for weather forecasting.
At present, frontal analysis still relies mainly on manual analysis. Following the basic principles of synoptic and dynamic meteorology, together with principles such as historical continuity and spatial consistency, forecasters comprehensively analyze meteorological elements such as temperature, pressure, wind, humidity, sky condition, visibility and weather phenomena to judge the nature, position and development trend of a front, and then draw it by hand or generate vector graphics interactively with existing application software (for example, the MICAPS system of the national meteorological service). However, manual analysis is time-consuming and labor-intensive, it delays the production and release of forecasts, and subjective experience and judgment easily introduce errors, omissions and inconsistent results. Automatic identification of frontal surfaces not only saves a great deal of labor and time but is also important for the analysis and application of massive historical meteorological data.
Meteorological researchers in many countries have also studied automatic front identification, but most of this work is based on a single meteorological element and is therefore not very reliable. Even where identification is based on multiple elements, conflicts between the diagnostic results of the different elements are not handled and uncertainty in the data is hard to resolve, so robustness is weak. In addition, automatic front identification is currently not very intelligent in general and cannot make effective use of manual analysis experience.
Disclosure of Invention
Purpose of the invention: the technical problem to be solved is the low accuracy of existing methods that automatically identify frontal surfaces from a single diagnostic quantity. To this end, the invention provides a method for automatically identifying meteorological frontal surfaces based on multiple regression, which comprises the following steps:
step 1, determining the grid points of the constant temperature zone by a statistical method;
step 2, removing the constant temperature zone grid points from each meteorological element data set;
step 3, calculating multiple diagnostic quantities: calculating the thermal front parameter (TFP, the dot product of the gradient of a thermodynamic parameter's gradient magnitude with the unit vector along that gradient) diagnostic quantity of each grid point from a thermodynamic factor, calculating the dew point diagnostic quantity $G_{Ld}$ from the difference between temperature and dew point temperature, calculating the pressure diagnostic quantity $G_P$ from the pressure field, and calculating the vorticity diagnostic quantity $\zeta$ from the wind field;
step 4, training the multiple regression equation coefficients with a neural network: inputting the meteorological element data set of each grid point, removing the constant temperature zone grid points, preprocessing the data set, feeding it into the MRNN network structure, initializing the network training parameters, and finally training the MRNN network on the training set (the meteorological element data and the manually analyzed frontal surface data) to obtain the coefficients of the multiple regression equation, which are the weights of the meteorological diagnostic quantities when identifying a frontal surface;
step 5, identifying the frontal surface through the regression equation:
$$\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_P + \delta \cdot \zeta + B = \varepsilon,$$
inputting the meteorological element data set of each grid point and the multiple regression equation coefficients obtained in step 4 into the multiple regression model, and outputting the frontal surface identification result.
Step 1 comprises the following steps:
step 1-1, calculating the temperature gradient: temperature data of the grid points within one year are randomly selected, and the temperature gradient of each grid point is calculated with the following formula:
$$G_T = \sqrt{\left(\frac{\partial T}{\partial X}\right)^2 + \left(\frac{\partial T}{\partial Y}\right)^2}$$
where X denotes the latitudinal direction, Y denotes the longitudinal direction, T denotes temperature, and $G_T$ denotes the temperature gradient;
step 1-2, determining the temperature gradient threshold: manually analyzed frontal surface data within one year are randomly selected, the mean temperature gradient over the grid points of those frontal surfaces is calculated, and this mean is set as the temperature gradient threshold, which is used to determine the frequency for each grid point;
step 1-3, counting the frequency: the frequency with which the temperature gradient of each grid point is greater than the temperature gradient threshold is counted;
step 1-4, determining the constant temperature zone grid points: when the temperature gradient frequency of a grid point exceeds the set frequency threshold (0.3 in this method), the grid point is set as a constant temperature zone grid point.
Step 1-4 comprises the following steps:
step 1-4-1, determining the frequency threshold: the frequency with which the temperature gradient at grid points along land-sea boundaries and mountain-plain boundaries is smaller than the temperature gradient threshold determined in step 1-2 is taken as the frequency threshold; the temperature gradient frequency threshold used in this method is 0.3;
step 1-4-2, judging the constant temperature zone grid points: the constant temperature zone is a region with large temperature contrasts all year round, such as a land-sea boundary or a mountain-plain boundary, and it easily causes frontal surfaces to be misidentified. The temperature gradient frequency of a grid point is the number of times within one year that its temperature gradient exceeds the temperature gradient threshold, divided by the total number of samples counted. When this frequency is greater than the frequency threshold determined in step 1-4-1, the grid point is judged to be a constant temperature zone grid point; otherwise it is a non-constant-temperature-zone grid point.
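Steps 1-1 to 1-4 can be sketched in NumPy as follows. This is a minimal illustration rather than the patented implementation: the function names, the `(time, ny, nx)` array layout, the unit grid spacing and the use of `np.gradient` for the finite differences are all assumptions made for the example; only the 0.3 frequency threshold comes from the text.

```python
import numpy as np

def temperature_gradient(temp, dx=1.0, dy=1.0):
    """Gradient magnitude G_T of a (time, ny, nx) temperature array.

    Assumption: axis -1 is the X direction and axis -2 the Y direction.
    """
    dT_dy, dT_dx = np.gradient(temp, dy, dx, axis=(-2, -1))
    return np.hypot(dT_dx, dT_dy)

def constant_zone_mask(temp, grad_threshold, freq_threshold=0.3):
    """True where G_T exceeds grad_threshold in more than
    freq_threshold of the sampled time steps (steps 1-3 and 1-4)."""
    grad = temperature_gradient(temp)
    frequency = (grad > grad_threshold).mean(axis=0)  # exceedances / total count
    return frequency > freq_threshold

# Toy data: 10 time steps on a 5 x 6 grid with a permanent 10-degree
# jump between columns 2 and 3 (a stand-in for a land-sea boundary).
temp = np.zeros((10, 5, 6))
temp[:, :, 3:] = 10.0
mask = constant_zone_mask(temp, grad_threshold=2.0)
```

In this toy case only the two columns on either side of the jump are flagged, because the gradient there exceeds the threshold at every time step.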
Step 2 comprises the following: according to the constant temperature zone grid points determined in step 1, a constant temperature zone data set is built in which constant temperature zone grid points have the value 0 and all other (non-constant-temperature-zone) grid points have the value 1. The gridded data downloaded from the European Centre for Medium-Range Weather Forecasts (ECMWF), including the temperature, pressure and wind fields, are combined with this data set by an AND operation. This sets every meteorological element to 0 at constant temperature zone grid points, which removes those grid points so that they are directly judged to be non-frontal.
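The masking in step 2 can be illustrated with a short NumPy sketch. The helper name `remove_constant_zone` is hypothetical, and the AND operation is modeled here as multiplication by a 0/1 mask, which is an assumption about how the operation is applied to real-valued fields.

```python
import numpy as np

def remove_constant_zone(field, zone_mask):
    """Zero a meteorological field at constant temperature zone grid points.

    zone_mask is True inside the zone; the keep-mask is therefore 0 inside
    the zone and 1 elsewhere, matching the 0/1 data set described above.
    """
    keep = (~zone_mask).astype(field.dtype)
    return field * keep

pressure = np.array([[1010.0, 1008.0],
                     [1005.0, 1002.0]])
zone_mask = np.array([[False, True],
                      [False, False]])
masked = remove_constant_zone(pressure, zone_mask)
```

The same mask would be applied to every element field (temperature, pressure, wind) before the diagnostic quantities are computed.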
Step 3 comprises the following steps:
step 3-1, calculating the thermal front parameter (Thermal Front Parameter, TFP) diagnostic quantity, i.e. the dot product of the gradient of a thermodynamic parameter's gradient magnitude with the unit vector along that gradient, according to the following formula:
$$\mathrm{TFP} = -\nabla\left|\nabla\tau\right| \cdot \frac{\nabla\tau}{\left|\nabla\tau\right|}$$
where $\nabla$ is the gradient operator and $\tau$ is the diagnostic factor; in this method $\tau$ is the pseudo-equivalent potential temperature $\theta_e$, which is calculated by the following formula:
[Equation image not recovered: $\theta_e$ as a function of $T$, $P$, $P_{00}$, $R_d$, $C_p$, $C_l$, $L$, $r_T$ and $r_s$]
where $T$ denotes temperature, $P$ denotes sea level pressure, $P_{00}$ denotes standard atmospheric pressure with $P_{00} = 1000$ hPa, $R_d$ is the gas constant of dry air with $R_d = 287.05$ J/kg/K, $C_p$ is the specific heat of dry air at constant pressure with $C_p = 1005.7$ J/kg/K, $C_l$ is the specific heat capacity of liquid water, $L$ denotes the latent heat of condensation per unit mass, $r_T$ is the mixing ratio at temperature $T$, and $r_s$ is the saturation mixing ratio, calculated as
$$r_s = \frac{0.622\, e_s}{P - e_s}$$
where $e_s$ is the saturation vapor pressure;
step 3-2, calculating the dew point temperature diagnostic quantity $G_{Ld}$:
first the difference between temperature and dew point temperature is obtained: with $Ld$ denoting the dew point temperature, the difference is $TL = T - Ld$; the dew point temperature diagnostic quantity is then calculated with the formula:
$$G_{Ld} = \sqrt{\left(\frac{\partial TL}{\partial X}\right)^2 + \left(\frac{\partial TL}{\partial Y}\right)^2}$$
where X denotes the latitudinal direction, Y denotes the longitudinal direction, and $G_{Ld}$ is the gradient of the temperature and dew point temperature difference, i.e. the dew point temperature diagnostic quantity;
step 3-3, calculating the pressure diagnostic quantity $G_P$ according to the following formula:
$$G_P = \sqrt{\left(\frac{\partial P}{\partial X}\right)^2 + \left(\frac{\partial P}{\partial Y}\right)^2}$$
where $G_P$ denotes the sea level pressure gradient, $P$ denotes the sea level pressure, $\partial X$ denotes a small change in the latitudinal direction, and $\partial Y$ denotes a small change in the longitudinal direction;
step 3-4, calculating the wind field diagnostic quantity $\zeta$:
$$\zeta = \frac{\partial V}{\partial X} - \frac{\partial U}{\partial Y}$$
where X denotes the latitudinal direction and Y denotes the longitudinal direction; U and V are the 10 m wind speed components of the wind field in the longitudinal direction and the latitudinal direction, respectively.
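The gradient-type diagnostics of steps 3-2 to 3-4 can be sketched as follows. This is only an illustration under stated assumptions: the fields are 2-D arrays with axis 0 as Y and axis 1 as X, the grid spacing is uniform, the gradient is taken as the magnitude of the finite-difference gradient from `np.gradient`, and the vorticity uses the conventional form dV/dX - dU/dY. The function names are hypothetical.

```python
import numpy as np

def gradient_magnitude(field, dx=1.0, dy=1.0):
    """Magnitude of the horizontal gradient of a 2-D field (used for G_Ld and G_P)."""
    d_dy, d_dx = np.gradient(field, dy, dx)
    return np.hypot(d_dx, d_dy)

def vorticity(u, v, dx=1.0, dy=1.0):
    """Vertical vorticity of a 2-D wind field: dV/dX - dU/dY."""
    dv_dx = np.gradient(v, dx, axis=1)
    du_dy = np.gradient(u, dy, axis=0)
    return dv_dx - du_dy

y, x = np.mgrid[0:4, 0:5].astype(float)

# Temperature minus dew point varies linearly, with gradient (3, 4).
T = 280.0 + 3.0 * x + 4.0 * y
Ld = np.full_like(T, 275.0)
G_Ld = gradient_magnitude(T - Ld)

# Solid-body rotation: u = -y, v = x.
u, v = -y, x
zeta = vorticity(u, v)
```

For these linear test fields the finite differences are exact, so `G_Ld` is 5 and `zeta` is 2 at every grid point.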
Step 4 comprises the following steps:
step 4-1, data preprocessing: the TFP diagnostic quantity, the dew point temperature diagnostic quantity $G_{Ld}$, the pressure diagnostic quantity $G_P$ and the wind field diagnostic quantity $\zeta$ obtained in step 3 are each normalized to the range [-1, 1];
step 4-2, constructing a multiple regression neural network (MRNN) for determining the coefficients of the multiple regression equation: the input layer of the MRNN has 4 inputs, namely the TFP diagnostic quantity, the dew point temperature diagnostic quantity $G_{Ld}$, the pressure diagnostic quantity $G_P$ and the wind field diagnostic quantity $\zeta$; the hidden layer has 4 nodes, each connected one-to-one to an input, and the weights of the TFP diagnostic quantity, $G_{Ld}$, $G_P$ and $\zeta$ are $\alpha$, $\beta$, $\gamma$ and $\delta$ respectively; the result of the output layer is obtained from the following multiple regression equation:
$$\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_P + \delta \cdot \zeta + B = \varepsilon$$
where $B$ is the bias and $\varepsilon$ is the regression value;
step 4-3, initializing the MRNN network: the MRNN network is initialized with the Xavier initialization method, so that the coefficients of the multiple regression equation are randomly initialized from a uniform distribution over the range
$$\left[-\sqrt{\frac{6}{n_{in} + n_{out}}},\ \sqrt{\frac{6}{n_{in} + n_{out}}}\right]$$
where $n_{in}$ is the input dimension of the layer and $n_{out}$ is its output dimension;
step 4-4, reading training samples: training is performed in batches, and in each training step BatchSize training samples are read from the training sample set. The training sample set consists of the meteorological element values of each grid point at different times and the manually analyzed frontal surface data at the corresponding times; the meteorological elements are the temperature, pressure and wind fields, whose values are downloaded from the European Centre for Medium-Range Weather Forecasts (ECMWF), and the manually analyzed frontal surface data used for supervised learning come from the manual analyses of a meteorological observatory;
step 4-5, training the MRNN network: the loss function of the MRNN network is determined, and the parameters are updated with an adaptive gradient descent optimization method until the training requirement is met, after which the multiple regression equation coefficients are output.
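Steps 4-1 to 4-3 can be sketched in NumPy as below. Several details are assumptions for illustration only: the text does not say how the [-1, 1] normalization is done, so min-max scaling is used here; the Xavier range is the standard uniform form with limit sqrt(6 / (n_in + n_out)); and the one-to-one "hidden layer" is represented simply as one weight per input. All function names are hypothetical.

```python
import numpy as np

def normalize(x):
    """Min-max scale a diagnostic quantity to [-1, 1] (assumes x is not constant)."""
    lo, hi = x.min(), x.max()
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def xavier_uniform(n_in, n_out, rng):
    """Draw n_in weights uniformly from [-limit, limit], limit = sqrt(6 / (n_in + n_out))."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=n_in)

def mrnn_forward(diagnostics, weights, bias):
    """Regression value: alpha*TFP + beta*G_Ld + gamma*G_P + delta*zeta + B."""
    return diagnostics @ weights + bias

rng = np.random.default_rng(0)
raw = rng.normal(size=(6, 4))  # one batch: BatchSize = 6 samples x 4 diagnostics
X = np.column_stack([normalize(raw[:, j]) for j in range(4)])
weights = xavier_uniform(4, 1, rng)   # alpha, beta, gamma, delta
eps = mrnn_forward(X, weights, bias=0.0)
```

Each column is scaled independently, so every diagnostic quantity spans exactly [-1, 1] within the batch regardless of its physical units.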
Step 4-5 comprises the following steps:
step 4-5-1, determining the loss function: the loss function is set to loss = |y - Train|, where Train is the manually analyzed frontal surface data and y is the output of the MRNN network;
step 4-5-2, determining the network parameters: the network learning rate is set to $\lambda = 0.001$; the number of samples input at each training step is BatchSize = 6; the maximum number of batches for the training sample set is
[Equation image not recovered: maximum number of batches as a function of the training sample set size and BatchSize]
the current batch counter is BatchNum = 1; the maximum number of network training iterations is 10; and the current iteration counter is initialized to 1;
step 4-5-3, updating the coefficients with the AdaGrad algorithm as the optimization method: AdaGrad accumulates the element-wise square of the mini-batch stochastic gradient $g_t$ in a variable $s_t$ (which acts as a gradient memory buffer). At time step 0, every element of $s_0$ is initialized to 0. At time step $t$, the mini-batch stochastic gradient $g_t$ is squared element-wise and added to the variable $s_t$:
$$s_t \leftarrow s_{t-1} + g_t \odot g_t$$
where $\odot$ denotes element-wise multiplication. The learning rate of each element of the objective function's argument is then rescaled element-wise:
$$x_t \leftarrow x_{t-1} - \frac{\eta}{\sqrt{s_t + \omega}} \odot g_t$$
where $x_t$ is the argument of the objective function (here, the regression coefficients), $\eta$ is the learning rate, which must be set by the user (typically 0.01), and $\omega$ is a constant added for numerical stability, for example $10^{-6}$;
step 4-5-4, outputting the coefficients: the multiple regression equation coefficients are output when the training requirement is met.
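The training loop of step 4-5 can be sketched as follows. This is a hedged illustration, not the patented code: the function name `adagrad_fit`, the zero initialization (used instead of Xavier to keep the example deterministic), the averaging of the loss over each batch and the use of the sign of the residual as a subgradient of |y - Train| are all assumptions. The AdaGrad accumulation and update follow the two formulas in step 4-5-3.

```python
import numpy as np

def adagrad_fit(X, target, lr=0.01, omega=1e-6, batch_size=6, epochs=10):
    """Fit [alpha, beta, gamma, delta, B] by mini-batch AdaGrad on the mean of |y - Train|."""
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])  # extra column of ones carries the bias B
    params = np.zeros(d + 1)
    s = np.zeros_like(params)             # gradient memory buffer, s_0 = 0
    for _ in range(epochs):
        for start in range(0, n, batch_size):
            xb = Xb[start:start + batch_size]
            yb = target[start:start + batch_size]
            residual = xb @ params - yb
            g = xb.T @ np.sign(residual) / len(yb)  # subgradient of the batch-mean L1 loss
            s += g * g                              # s_t = s_{t-1} + g_t * g_t
            params -= lr / np.sqrt(s + omega) * g   # per-element rescaled step
    return params

# Toy usage: 60 samples of 4 normalized diagnostics with a synthetic target.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 4))
target = X @ np.array([0.5, -0.3, 0.2, 0.1]) + 0.05
coeffs = adagrad_fit(X, target)
```

Because the accumulated `s` only grows, the effective step size of each coefficient shrinks over time, which is the defining behavior of AdaGrad.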
Step 5 comprises the following steps:
step 5-1, determining the multiple regression equation: each coefficient of the multiple regression equation is obtained by training the MRNN network;
step 5-2, determining the frontal surface: the meteorological element data set of each grid point and the coefficients of the multiple regression equation are input into the multiple regression model:
$$\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_P + \delta \cdot \zeta + B = \varepsilon,$$
and the frontal surface identification result is output, where the value of $\varepsilon$ represents the probability of a frontal surface.
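Applying the trained equation in step 5 amounts to a per-grid-point weighted sum. The sketch below is illustrative only: the function name, the coefficient values and the optional `zone_mask` argument (which forces constant temperature zone points to 0, in line with step 2) are assumptions, not values from the source.

```python
import numpy as np

def front_probability(tfp, g_ld, g_p, zeta, coeffs, bias, zone_mask=None):
    """Evaluate alpha*TFP + beta*G_Ld + gamma*G_P + delta*zeta + B on a grid."""
    alpha, beta, gamma, delta = coeffs
    eps = alpha * tfp + beta * g_ld + gamma * g_p + delta * zeta + bias
    if zone_mask is not None:
        # Constant temperature zone points are treated as non-frontal.
        eps = np.where(zone_mask, 0.0, eps)
    return eps

# Hypothetical coefficients and a 1 x 2 grid of normalized diagnostics.
coeffs = (0.4, 0.3, 0.2, 0.1)
tfp = np.array([[1.0, 1.0]])
g_ld = np.array([[0.5, 0.5]])
g_p = np.array([[0.0, 0.0]])
zeta = np.array([[-1.0, -1.0]])
zone_mask = np.array([[False, True]])
eps = front_probability(tfp, g_ld, g_p, zeta, coeffs, bias=0.05, zone_mask=zone_mask)
```

Here the first grid point evaluates to 0.4 + 0.15 + 0 - 0.1 + 0.05 = 0.5, while the second is forced to 0 by the mask.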
Beneficial effects: the method designs several diagnostic quantities and identifies frontal surfaces automatically on the basis of a multiple regression equation, which effectively improves the accuracy of current automatic identification of atmospheric frontal surfaces, shortens the time needed for weather analysis, and facilitates the timely release of weather forecasts.
In particular, the invention has the following advantages over existing methods: 1. compared with manual identification of frontal surfaces, it saves a great deal of labor and time and reduces the errors, omissions and inconsistent results to which subjective experience and judgment are prone; 2. compared with automatic identification of frontal surfaces from a single diagnostic quantity, it further improves accuracy.
Drawings
The foregoing and/or other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings and detailed description.
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a flow chart for determining the temperature gradient threshold when determining the constant temperature zone.
FIG. 3 is a flow chart for determining the constant temperature zone grid points.
FIG. 4 is a diagram of the method for removing the constant temperature zone grid points.
FIG. 5 shows the MRNN model used to determine the weight of each diagnostic quantity by the multiple regression method.
FIG. 6 shows the training process for learning the diagnostic quantity weights of the multiple regression method.
FIG. 7 compares the frontal surfaces identified from single-element diagnostic quantities with those identified by the multiple regression method.
Detailed Description
As shown in FIG. 1, the invention discloses a method for automatically identifying meteorological frontal surfaces based on multiple regression, which comprises the following steps:
step 1, determining the constant temperature zone by a statistical method: whether each grid point belongs to the constant temperature zone is determined by counting how frequently its temperature gradient is large;
step 2, removing the constant temperature zone grid points from each meteorological element data set: the constant temperature zone is a region with large temperature contrasts all year round, such as a land-sea boundary or a mountain-plain boundary, and it easily causes frontal surfaces to be misidentified, so the constant temperature zone grid points must be removed when frontal surfaces are identified automatically. According to the constant temperature zone grid points determined in step 1, the value of each constant temperature zone grid point is set to 0 and the value of every other (non-constant-temperature-zone) grid point is set to 1; the gridded data of each meteorological element are then combined with this constant temperature zone data by an AND operation, which removes the constant temperature zone and so reduces the misidentification it causes in automatic frontal surface identification;
step 3, calculating multiple diagnostic quantities: calculating the thermal front parameter (TFP) diagnostic quantity of each grid point from a thermodynamic factor, calculating the dew point diagnostic quantity $G_{Ld}$ from the difference between temperature and dew point temperature, calculating the pressure diagnostic quantity $G_P$ from the pressure field, and calculating the vorticity diagnostic quantity $\zeta$ from the wind field;
step 4, training the multiple regression equation coefficients with the network: inputting the meteorological element data set of each grid point, removing the constant temperature zone grid points, preprocessing the data set, feeding it into the MRNN network structure, initializing the network training parameters, and finally training the MRNN network on the training set (the meteorological element data and the manually analyzed frontal surface data) to obtain the coefficients of the multiple regression equation;
step 5, identifying the frontal surface through the regression equation: inputting the meteorological element data set of each grid point and the multiple regression equation coefficients obtained in step 4 into the multiple regression model ($\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_P + \delta \cdot \zeta + B = \varepsilon$), and finally outputting the frontal surface identification result.
Step 1 comprises the following steps:
step 1-1, calculating the temperature gradient: temperature data of the grid points within one year are randomly selected and substituted into the temperature gradient formula
$$G_T = \sqrt{\left(\frac{\partial T}{\partial X}\right)^2 + \left(\frac{\partial T}{\partial Y}\right)^2}$$
to obtain the temperature gradient of each grid point, where X denotes the latitudinal direction, Y denotes the longitudinal direction, T denotes temperature, and $G_T$ denotes the temperature gradient;
step 1-2, determining the temperature gradient threshold: the flow chart for determining the temperature gradient threshold when determining the constant temperature zone grid points is shown in FIG. 2. First, manually analyzed frontal surface data within one year are randomly selected; then the mean temperature gradient over the grid points of those frontal surfaces is calculated; finally, this mean is set as the temperature gradient threshold, which is used in step 1-3 to determine how frequently the temperature gradient of each grid point exceeds the threshold within one year;
step 1-3, counting the frequency: the frequency with which the temperature gradient of each grid point within one year is greater than the temperature gradient threshold set in step 1-2 is counted;
step 1-4, determining the constant temperature zone grid points: a grid point is determined to be a constant temperature zone grid point when its temperature gradient frequency exceeds the frequency threshold determined in step 1-4-1. As shown in the flow chart of FIG. 3, the temperature gradient of each grid point over one year is first obtained; each value is then compared with the temperature gradient threshold determined in step 1-2, and the frequency with which each grid point's temperature gradient exceeds the threshold during the year is counted; finally, this frequency is compared with the frequency threshold determined in step 1-4-1, and the grid point is determined to be a constant temperature zone grid point when its frequency is greater than the frequency threshold.
Step 1-4 comprises the following steps:
step 1-4-1, determining the frequency threshold: the frequency threshold set in this method is 0.3;
step 1-4-2, judging the constant temperature zone grid points: the constant temperature zone is a region with large temperature contrasts all year round, such as a land-sea boundary or a mountain-plain boundary, and it easily causes frontal surfaces to be misidentified. The temperature gradient frequency of a grid point is the number of times within one year that its temperature gradient exceeds the temperature gradient threshold, divided by the total number of samples counted. When this frequency is greater than the frequency threshold determined in step 1-4-1, the grid point is judged to be a constant temperature zone grid point; otherwise it is a non-constant-temperature-zone grid point.
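The threshold of step 1-2 is simply the mean temperature gradient over grid points that the manual analysis marks as frontal. A minimal sketch, assuming the manual analysis is available as a boolean array `front_mask` aligned with the gradient array (both names and the `(time, ny, nx)` layout are hypothetical):

```python
import numpy as np

def gradient_threshold(grad, front_mask):
    """Mean temperature gradient over grid points marked as frontal."""
    return float(grad[front_mask].mean())

# Two time steps on a 2 x 2 grid.
grad = np.array([[[1.0, 4.0], [2.0, 6.0]],
                 [[0.5, 5.0], [1.5, 3.0]]])
front_mask = np.array([[[False, True], [False, True]],
                       [[False, True], [False, False]]])
threshold = gradient_threshold(grad, front_mask)  # mean of 4.0, 6.0 and 5.0
```

The resulting value would then be passed to the frequency count of step 1-3.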
Step 3 comprises the following steps:
step 3-1, calculating the value of the thermal front parameter (TFP) diagnostic quantity: the formula of the TFP diagnostic quantity is:
$$\mathrm{TFP} = -\nabla\left|\nabla\tau\right| \cdot \frac{\nabla\tau}{\left|\nabla\tau\right|}$$
in the formula, $\nabla$ is the gradient operator and $\tau$ is the diagnostic factor; in this method $\tau$ is the pseudo-equivalent potential temperature $\theta_e$, whose value is obtained by the following formula:
Figure BDA0002388593940000092
wherein $T$ represents temperature, $P$ represents sea level pressure, $P_{00}$ represents standard atmospheric pressure, $P_{00}$ = 1000 hPa, $R_d$ is the gas constant of dry air, $R_d$ = 287.05 J/kg/K, $C_p$ is the specific heat of dry air at constant pressure, $C_p$ = 1005.7 J/kg/K, and $C_l$ is the specific heat capacity of liquid water, generally 4190 J/kg/K. $L$ represents the latent heat of condensation per unit mass, $r_T$ is the mixing ratio at temperature $T$, and $r_s$ is the saturation mixing ratio, calculated by the formula
Figure BDA0002388593940000093
wherein $e_s$ is the saturated water vapor pressure;
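As an illustration of step 3-1, the sketch below evaluates a TFP field on a regular grid, assuming the usual definition TFP = -∇|∇τ| · ∇τ/|∇τ|. The function name, the axis convention (axis 0 as Y, axis 1 as X), the uniform grid spacing and the small `eps` guard against division by zero are all assumptions made for the example. The field τ is taken as already computed.

```python
import numpy as np

def thermal_front_parameter(tau, dx=1.0, dy=1.0, eps=1e-12):
    """Compute TFP = -grad(|grad(tau)|) . grad(tau) / |grad(tau)|.

    tau    : 2-D array of the diagnostic factor (e.g. theta_e);
             axis 0 is taken as Y and axis 1 as X (an assumption).
    dx, dy : uniform grid spacings (assumed).
    eps    : guard against division by zero where the gradient vanishes.
    """
    # First derivatives of tau along each axis.
    dtau_dy, dtau_dx = np.gradient(tau, dy, dx)
    # Magnitude of the gradient of tau.
    mag = np.hypot(dtau_dx, dtau_dy)
    # Gradient of that magnitude.
    dmag_dy, dmag_dx = np.gradient(mag, dy, dx)
    # Project onto the unit vector along grad(tau) and change sign.
    safe_mag = np.maximum(mag, eps)
    return -(dmag_dx * dtau_dx + dmag_dy * dtau_dy) / safe_mag
```

For a field that grows quadratically in X only, the interior values of the result are constant, which gives a quick sanity check of the finite-difference implementation.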
step 3-2, calculating the dew point temperature diagnostic quantity $G_{Ld}$: first, the difference between the temperature and the dew point temperature is obtained; $T$ represents the temperature, $Ld$ represents the dew point temperature, and the difference is $TL = T - Ld$; the dew point temperature diagnostic quantity is then calculated by the following formula:
$$G_{Ld} = \sqrt{\left(\frac{\partial TL}{\partial X}\right)^{2} + \left(\frac{\partial TL}{\partial Y}\right)^{2}}$$
wherein $X$ represents latitude, $Y$ represents longitude, and $G_{Ld}$ represents the gradient of the difference between the temperature and the dew point temperature;
step 3-3, calculating the air pressure diagnostic quantity $G_P$:
$$G_{P} = \sqrt{\left(\frac{\partial P}{\partial X}\right)^{2} + \left(\frac{\partial P}{\partial Y}\right)^{2}}$$
wherein $P$ represents the sea level pressure, $G_P$ represents the sea level pressure gradient, $\partial X$ represents a small change in the latitudinal direction, and $\partial Y$ represents a small change in the longitudinal direction; the method uses sea level pressure values for the calculation;
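Steps 3-2 and 3-3 both reduce to a horizontal gradient magnitude of a gridded field. A minimal sketch, assuming the gradient magnitude is the square root of the sum of squared partial derivatives and that the grid is regular with unit spacing; the function name and axis convention are illustrative only.

```python
import numpy as np

def gradient_magnitude(field, dx=1.0, dy=1.0):
    """Magnitude of the horizontal gradient of a 2-D field.

    field  : 2-D array; axis 0 is taken as Y and axis 1 as X (an assumption).
    dx, dy : uniform grid spacings (assumed).
    """
    d_dy, d_dx = np.gradient(field, dy, dx)
    return np.hypot(d_dx, d_dy)

def dew_point_diagnostic(temperature, dew_point):
    """G_Ld: gradient magnitude of TL = T - Ld."""
    return gradient_magnitude(temperature - dew_point)

def pressure_diagnostic(sea_level_pressure):
    """G_P: gradient magnitude of the sea level pressure."""
    return gradient_magnitude(sea_level_pressure)
```

On real latitude-longitude data the spacings would have to be converted to distances, which is left out of this sketch.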
step 3-4, calculating the wind field diagnostic quantity $\zeta$: the wind field diagnostic quantity is defined by the following formula:
Figure BDA0002388593940000098
wherein $X$ represents the latitude direction and $Y$ represents the longitude direction; $U$ and $V$ are the 10-meter wind speed components of the wind field in the longitude and latitude directions, respectively.
Step 4 comprises the steps of:
step 4-1, data preprocessing: normalizing the data of each diagnostic quantity to the range [-1, 1];
step 4-2, constructing the MRNN network: a multiple regression neural network (MRNN) is constructed to determine the coefficients of the multiple regression equation, as shown in FIG. 5. The input layer of the MRNN network has 4 inputs, namely the TFP diagnostic quantity, the $G_{Ld}$ diagnostic quantity, the $G_P$ diagnostic quantity and the $\zeta$ diagnostic quantity; the hidden layer has 4 nodes, each connected 1-to-1 to the corresponding input, with weights $\alpha$, $\beta$, $\gamma$ and $\delta$ respectively; the output layer uses the multiple regression equation:
$$\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_{P} + \delta \cdot \zeta + B = \varepsilon$$
(wherein the coefficients $\alpha$, $\beta$, $\gamma$ and $\delta$ are the weights of the respective diagnostic quantities, $B$ is the bias, and $\varepsilon$ is the regression value): the four outputs of the hidden layer are summed and the bias $B$ is added to output $\varepsilon$. This provides the initial MRNN model for the offline training stage;
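The preprocessing and the forward pass of the MRNN can be sketched as below. The text only states that the data are normalized to [-1, 1]; min-max scaling is an assumption here, as are the function names.

```python
import numpy as np

def normalize_to_unit_range(x):
    """Min-max scale an array to [-1, 1] (scaling scheme is an assumption)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant field carries no contrast; map it to zeros.
        return np.zeros_like(x)
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def mrnn_forward(tfp, g_ld, g_p, zeta, weights, bias):
    """Evaluate epsilon = alpha*TFP + beta*G_Ld + gamma*G_P + delta*zeta + B.

    weights : sequence (alpha, beta, gamma, delta).
    bias    : scalar B.
    """
    alpha, beta, gamma, delta = weights
    # Each hidden node scales one input (1-to-1 connection); the output
    # layer sums the four scaled inputs and adds the bias.
    return alpha * tfp + beta * g_ld + gamma * g_p + delta * zeta + bias
```

Because every hidden node sees exactly one input, the network is equivalent to a linear regression whose coefficients are the four weights and the bias.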
step 4-3, initializing the MRNN network: the MRNN network is initialized with the Xavier initialization method, so that the coefficients of the multiple regression equation are randomly initialized from a uniform distribution over the range
$$\left[-\sqrt{\frac{6}{n_{in}+n_{out}}},\ \sqrt{\frac{6}{n_{in}+n_{out}}}\right]$$
wherein $n_{in}$ is the input dimension of the layer and $n_{out}$ is the output dimension of the layer;
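A minimal sketch of Xavier (Glorot) uniform initialization, assuming the standard bound sqrt(6 / (n_in + n_out)). The function name and the choice of NumPy's `default_rng` generator are illustrative.

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng=None):
    """Draw an (n_in, n_out) weight matrix from U(-b, b), b = sqrt(6/(n_in+n_out))."""
    rng = np.random.default_rng() if rng is None else rng
    bound = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-bound, bound, size=(n_in, n_out))
```

For example, `xavier_uniform(4, 1)` would give one starting value for each of the four regression coefficients if the four diagnostic quantities are treated as the inputs of a single output node, which is one possible reading of the network layout.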
step 4-4, reading training samples: a batch training mode is adopted, and BatchSize training samples are read from the training sample set in each training pass. The training sample set consists of the meteorological element values of each grid point at different times and the artificial (manually identified) frontal surface data at the corresponding times. The meteorological elements are the temperature field, the pressure field and the wind field, whose values are downloaded from the European Centre for Medium-Range Weather Forecasts (ECMWF); the artificial frontal surface data used for supervised learning come from the manual identification data of a meteorological observatory;
step 4-5, MRNN network training: FIG. 6 is a flow chart of training the coefficients of the multiple regression equation. The loss function of the MRNN network is determined first, then the network training parameters are set, the parameters are updated by the adaptive gradient descent optimization method until the training requirements are met, and finally the coefficients of the multiple regression equation (i.e. the weights of the diagnostic quantities) are output.
Step 5 comprises the steps of:
step 5-1, determining a multiple regression equation: the multiple regression equation ($\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_{P} + \delta \cdot \zeta + B = \varepsilon$) is used to determine whether each grid point is a frontal surface, where each coefficient of the multiple regression equation represents the weight of the corresponding diagnostic quantity in the automatic identification of the frontal surface, and each coefficient is obtained by training the MRNN network in step 4;
step 5-2, determining a frontal surface: the meteorological element data are substituted into the multiple regression equation determined in step 5-1, and the frontal surface is identified automatically according to the value of $\varepsilon$ ($\varepsilon > 0$ indicates a frontal surface; $\varepsilon \le 0$ indicates a non-frontal surface).
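The decision rule of step 5 can be sketched as follows. The stacked array layout (4, lat, lon), the function name and the optional constant temperature zone mask argument are assumptions made for the illustration; the rule that ε > 0 marks a frontal surface comes from the text.

```python
import numpy as np

def identify_fronts(diagnostics, coeffs, bias, ct_mask=None):
    """Classify grid points as frontal (True) or non-frontal (False).

    diagnostics : array of shape (4, lat, lon) stacking the normalized
                  TFP, G_Ld, G_P and zeta fields (layout is an assumption).
    coeffs      : array-like (alpha, beta, gamma, delta) from training.
    bias        : scalar B from training.
    ct_mask     : optional Boolean array, True at constant temperature
                  zone grid points, which are always non-frontal.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    # epsilon = alpha*TFP + beta*G_Ld + gamma*G_P + delta*zeta + B
    epsilon = np.tensordot(coeffs, diagnostics, axes=1) + bias
    is_front = epsilon > 0
    if ct_mask is not None:
        is_front &= ~ct_mask
    return is_front
```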
Step 4-5 comprises the following steps:
step 4-5-1, determining a loss function: the loss function is set as loss = |y - Train|, wherein Train is the artificial frontal surface data and y is the output data of the MRNN network;
step 4-5-2, network parameters: the network learning rate is $\lambda$ = 0.001, the number of samples input each time during training is BatchSize = 6, and the maximum number of batch training passes over the training sample set is
Figure BDA0002388593940000111
The current batch training count is BatchNum=1, the maximum number of network training iterations is ItemATONUM=10, and the current iteration count is ItemATONUM=1;
step 4-5-3, optimizing and updating: the main idea is that the gradients of each coefficient over the iterations are squared and accumulated, the square root of the accumulated value is taken, and the global learning rate is divided by this value to update the learning rate dynamically. The AdaGrad algorithm uses a variable $s_t$ (corresponding to a gradient memory cache) that accumulates the element-wise square of the mini-batch stochastic gradient $g_t$. At time step 0, AdaGrad initializes each element of $s_0$ (the initial gradient memory cache) to 0. At time step $t$, the mini-batch stochastic gradient $g_t$ is first squared element-wise and added to the variable $s_t$:
$$s_t \leftarrow s_{t-1} + g_t \odot g_t$$
wherein $\odot$ denotes element-wise multiplication. Next, the learning rate of each element of the independent variable of the objective function is readjusted by element-wise operations:
$$x_t \leftarrow x_{t-1} - \frac{\eta}{\sqrt{s_t + \omega}} \odot g_t$$
wherein $\eta$ is the learning rate, generally set to 0.01, and $\omega$ is a constant added to maintain numerical stability, for which the method uses $10^{-6}$.
Step 4-5-4, outputting coefficients: and outputting multiple regression equation coefficients when the training requirement is met.
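The training procedure of step 4-5 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the mean absolute error over each batch is used as the loss, the number of batches per iteration is taken as the sample count divided by the batch size (the original formula is not recoverable here), and the names `train_mrnn_adagrad` and `l1_loss` are hypothetical. The defaults for the learning rate, batch size, iteration count and stability constant follow the values stated in the text.

```python
import numpy as np

def l1_loss(X, train, params):
    """Mean of |y - Train| for y = X @ coefficients + bias."""
    y = X @ params[:-1] + params[-1]
    return np.abs(y - train).mean()

def train_mrnn_adagrad(X, train, lr=0.001, batch_size=6, max_iter=10,
                       omega=1e-6, seed=0):
    """Fit [alpha, beta, gamma, delta, B] with AdaGrad on an L1 loss.

    X     : array (n_samples, 4) of normalized diagnostic quantities.
    train : array (n_samples,) of artificial frontal surface labels.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Xavier-style uniform start for the coefficients, zero bias (assumed).
    bound = np.sqrt(6.0 / (n_features + 1))
    params = np.append(rng.uniform(-bound, bound, n_features), 0.0)
    s = np.zeros_like(params)                    # gradient cache s_0 = 0
    Xb = np.hstack([X, np.ones((len(X), 1))])    # extra column for the bias
    batch_max = len(X) // batch_size             # assumed definition
    for _ in range(max_iter):
        for b in range(batch_max):
            rows = slice(b * batch_size, (b + 1) * batch_size)
            xb, tb = Xb[rows], train[rows]
            y = xb @ params
            # Subgradient of mean |y - Train| with respect to the parameters.
            grad = (np.sign(y - tb)[:, None] * xb).mean(axis=0)
            s += grad * grad                             # s_t = s_{t-1} + g*g
            params -= lr / np.sqrt(s + omega) * grad     # AdaGrad update
    return params
```

With the small default learning rate and only ten iterations the coefficients move slowly, so in practice the hyperparameters would likely need tuning against the actual training data.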
As shown in FIG. 7, the method of the present invention improves the accuracy of automatic frontal surface identification compared with using a single diagnostic quantity (panel E is the frontal surface map identified by the method of the present invention).
The invention provides an automatic meteorological frontal surface identification method based on multiple regression. There are many methods and ways to implement this technical scheme, and the above is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make a number of improvements and modifications without departing from the principle of the invention, and these improvements and modifications are also considered to fall within the protection scope of the invention. Components not explicitly described in this embodiment can be implemented using the prior art.

Claims (3)

1. The automatic weather front identification method based on multiple regression is characterized by comprising the following steps of:
step 1, determining grid points of a constant temperature zone by using a statistical method;
step 2, removing grid points of a constant temperature zone in each meteorological element data set;
step 3, calculating a multi-element diagnosis quantity;
step 4, training multiple regression equation coefficients by using a neural network;
step 5, identifying a frontal surface through a regression equation;
step 1 comprises the following steps:
step 1-1, solving a temperature gradient: the temperature data of the grid points within one year are randomly selected, and the temperature gradient of each grid point is calculated by the following formula:
$$G_{T} = \sqrt{\left(\frac{\partial T}{\partial X}\right)^{2} + \left(\frac{\partial T}{\partial Y}\right)^{2}}$$
wherein $X$ represents latitude, $Y$ represents longitude, $T$ represents temperature, and $G_T$ represents the temperature gradient;
step 1-2, determining a temperature gradient threshold value: randomly selecting artificial frontal surface data within one year, solving a temperature gradient mean value of grid points in the artificial frontal surface data, and setting the solved temperature gradient mean value as a temperature gradient threshold value, wherein the temperature gradient threshold value is used for determining the frequency of each grid point;
step 1-3, counting frequency: counting the frequency of the temperature gradient of each grid point being greater than a temperature gradient threshold value;
step 1-4, determining grid points of a constant temperature zone: when the grid point temperature gradient frequency exceeds the set frequency threshold value, the grid point is defined as a constant temperature zone grid point;
the steps 1-4 comprise the following steps:
step 1-4-1, determining a frequency threshold: determining a frequency value as a frequency threshold when the temperature gradient of each grid point of the sea-land junction and the junction of the mountain and the plain is smaller than the temperature gradient threshold determined in the step 1-2;
step 1-4-2, judging constant temperature zone grid points: the temperature gradient frequency of a grid point is the number of times within one year that its temperature gradient is larger than the temperature gradient threshold, divided by the total number of times counted; when this frequency is larger than the frequency threshold determined in step 1-4-1, the grid point is judged to be a constant temperature zone grid point; otherwise, it is a non-constant temperature zone grid point;
the step 2 comprises the following steps: according to the constant temperature zone grid points determined in step 1, setting the value of each constant temperature zone grid point to 0 and the value of every other (non-constant temperature zone) grid point to 1; performing an AND operation between the grid data comprising the temperature field, the air pressure field and the wind field and the constant temperature zone data, thereby removing the constant temperature zone grid points, setting the values of each meteorological element at the constant temperature zone grid points to 0, and directly judging the constant temperature zone grid points to be non-frontal surfaces;
step 3 comprises the following steps:
step 3-1, calculating a thermal front parameter TFP diagnostic quantity according to the following formula:
$$\mathrm{TFP} = -\nabla\left|\nabla\tau\right| \cdot \frac{\nabla\tau}{\left|\nabla\tau\right|}$$
wherein $\nabla$ is the gradient operator, $\tau$ is a diagnostic factor, and $\tau$ is selected as the pseudo-equivalent potential temperature $\theta_e$; the value of the pseudo-equivalent potential temperature $\theta_e$ is calculated by the following formula:
Figure FDA0004122403030000023
wherein $T$ represents temperature, $P$ represents sea level pressure, $P_{00}$ represents standard atmospheric pressure, $P_{00}$ = 1000 hPa, $R_d$ is the gas constant of dry air, $R_d$ = 287.05 J/kg/K, $C_p$ is the specific heat of dry air at constant pressure, $C_p$ = 1005.7 J/kg/K; $C_l$ is the specific heat capacity of liquid water; $L$ represents the latent heat of condensation per unit mass, $r_T$ is the mixing ratio at temperature $T$, and $r_s$ is the saturation mixing ratio, calculated by the formula
Figure FDA0004122403030000024
wherein $e_s$ is the saturated water vapor pressure;
step 3-2, calculating a dew point temperature diagnostic quantity $G_{Ld}$:
first, the difference between the temperature and the dew point temperature is obtained; $Ld$ represents the dew point temperature, so the difference is $TL = T - Ld$; the dew point temperature diagnostic quantity is then calculated by the formula:
$$G_{Ld} = \sqrt{\left(\frac{\partial TL}{\partial X}\right)^{2} + \left(\frac{\partial TL}{\partial Y}\right)^{2}}$$
wherein $X$ represents latitude, $Y$ represents longitude, and $G_{Ld}$ represents the gradient of the difference between the temperature and the dew point temperature, i.e., the dew point temperature diagnostic quantity;
step 3-3, calculating the air pressure diagnostic quantity $G_P$ according to the following formula:
$$G_{P} = \sqrt{\left(\frac{\partial P}{\partial X}\right)^{2} + \left(\frac{\partial P}{\partial Y}\right)^{2}}$$
wherein $G_P$ represents the sea level air pressure gradient, $P$ represents the sea level air pressure, $\partial X$ represents a small change in the latitudinal direction, and $\partial Y$ represents a small change in the longitudinal direction;
step 3-4, calculating wind field diagnosis quantity ζ:
Figure FDA0004122403030000029
wherein $X$ represents the latitude direction and $Y$ represents the longitude direction; $U$ and $V$ are the 10-meter wind speed components of the wind field in the longitude direction and the latitude direction, respectively;
step 4 comprises the steps of:
step 4-1, data preprocessing: normalizing the TFP diagnostic quantity, the dew point temperature diagnostic quantity $G_{Ld}$, the air pressure diagnostic quantity $G_P$ and the wind field diagnostic quantity $\zeta$ obtained in step 3 to the range [-1, 1];
step 4-2, constructing a multiple regression neural network (MRNN) for determining each coefficient of the multiple regression equation, wherein the input layer of the MRNN network has 4 inputs, namely the TFP diagnostic quantity, the dew point temperature diagnostic quantity $G_{Ld}$, the air pressure diagnostic quantity $G_P$ and the wind field diagnostic quantity $\zeta$; the hidden layer has 4 nodes, connected 1-to-1 to the input layer, and the weights of the TFP diagnostic quantity, the dew point temperature diagnostic quantity $G_{Ld}$, the air pressure diagnostic quantity $G_P$ and the wind field diagnostic quantity $\zeta$ are $\alpha$, $\beta$, $\gamma$ and $\delta$ respectively; the result of the output layer is obtained by the following multiple regression equation:
$$\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_{P} + \delta \cdot \zeta + B = \varepsilon$$
wherein $B$ is the bias and $\varepsilon$ is the regression value;
step 4-3, initializing the MRNN network: the MRNN network is initialized with the Xavier initialization method, so that the coefficients of the multiple regression equation are randomly initialized from a uniform distribution over the range
$$\left[-\sqrt{\frac{6}{n_{in}+n_{out}}},\ \sqrt{\frac{6}{n_{in}+n_{out}}}\right]$$
wherein $n_{in}$ is the input dimension of the layer and $n_{out}$ is the output dimension of the layer;
step 4-4, reading training samples: a batch training mode is adopted, and a BatchSize group training sample is read from a training sample set in each training;
step 4-5, training the MRNN network: and determining a loss function of the MRNN network, updating parameters by using a self-adaptive gradient descent optimization method until the training requirement is met, and outputting multiple regression equation coefficients.
2. The method according to claim 1, wherein steps 4-5 comprise the steps of:
step 4-5-1, determining a loss function: the loss function is set as loss = |y - Train|, wherein Train is the artificial frontal surface data and y is the output data of the MRNN network;
step 4-5-2, determining network parameters: the network learning rate is $\lambda$ = 0.001, the number of samples input each time during training is BatchSize = 6, and the maximum number of batch training passes over the training sample set is
Figure FDA0004122403030000032
The current batch training count is BatchNum=1, the maximum number of network training iterations is ItemATONUM=10, and the current iteration count is ItemATONUM=1;
step 4-5-3, updating the coefficients by using the AdaGrad algorithm as the optimization method: a variable $s_t$ accumulates the element-wise square of the mini-batch stochastic gradient $g_t$; at time step 0, each element of the initial gradient memory cache $s_0$ is initialized to 0; at time step $t$, the mini-batch stochastic gradient $g_t$ is first squared element-wise and added to the variable $s_t$:
$$s_t \leftarrow s_{t-1} + g_t \odot g_t$$
wherein $\odot$ denotes element-wise multiplication; the learning rate of each element of the independent variable of the objective function is then readjusted by element-wise operations:
$$x_t \leftarrow x_{t-1} - \frac{\eta}{\sqrt{s_t + \omega}} \odot g_t$$
where $\eta$ is the learning rate and $\omega$ is a constant added to maintain numerical stability;
step 4-5-4, outputting coefficients: and outputting multiple regression equation coefficients when the training requirement is met.
3. The method according to claim 2, characterized in that step 5 comprises the steps of:
step 5-1, determining a multiple regression equation: each coefficient of the multiple regression equation is obtained through MRNN network training;
step 5-2, determining a frontal surface: inputting the meteorological element data set of each grid point and each coefficient of the multiple regression equation into the multiple regression model:
$$\alpha \cdot \mathrm{TFP} + \beta \cdot G_{Ld} + \gamma \cdot G_{P} + \delta \cdot \zeta + B = \varepsilon,$$
and finally outputting a frontal surface identification result, wherein the value of epsilon represents the probability value of the frontal surface.
CN202010106401.0A 2020-02-21 2020-02-21 Meteorological frontal surface automatic identification method based on multiple regression Active CN111414991B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010106401.0A CN111414991B (en) 2020-02-21 2020-02-21 Meteorological frontal surface automatic identification method based on multiple regression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010106401.0A CN111414991B (en) 2020-02-21 2020-02-21 Meteorological frontal surface automatic identification method based on multiple regression

Publications (2)

Publication Number Publication Date
CN111414991A CN111414991A (en) 2020-07-14
CN111414991B true CN111414991B (en) 2023-04-25

Family

ID=71490881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010106401.0A Active CN111414991B (en) 2020-02-21 2020-02-21 Meteorological frontal surface automatic identification method based on multiple regression

Country Status (1)

Country Link
CN (1) CN111414991B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112765832B (en) * 2021-02-02 2022-05-06 南京信息工程大学 Automatic identification and correction method for continental europe
CN113063906B (en) * 2021-02-09 2022-03-25 国家海洋局南海预报中心(国家海洋局广州海洋预报台) Method and device for detecting chlorophyll a front surface
CN114594532B (en) * 2022-03-09 2024-06-14 北京墨迹风云科技股份有限公司 Cold weather prediction method and device, electronic equipment and computer readable medium
CN114565057B (en) * 2022-03-15 2022-10-21 中科三清科技有限公司 Machine learning-based grading field identification method and device, storage medium and terminal
CN114565056B (en) * 2022-03-15 2022-09-20 中科三清科技有限公司 Machine learning-based cold-front identification method and device, storage medium and terminal
CN115878731B (en) * 2022-11-17 2024-03-08 南京信息工程大学 Automatic warm front identification method
CN116577844B (en) * 2023-03-28 2024-02-09 南京信息工程大学 Automatic east Asia cold front precipitation identification method and system
CN116681959B (en) * 2023-06-09 2024-03-19 中科三清科技有限公司 Machine learning-based frontal line identification method and device, storage medium and terminal
CN116933014B (en) * 2023-09-14 2023-11-28 成都信息工程大学 Automatic identification method for dry type Kunming quasi-static front
CN117853949B (en) * 2024-03-07 2024-05-14 南京信息工程大学 Deep learning method and system for identifying cold front by using satellite cloud image

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107274420A (en) * 2017-06-15 2017-10-20 中国水产科学研究院东海水产研究所 The oceanic front extracting method split based on image
CN109086818A (en) * 2018-07-25 2018-12-25 中国海洋大学 Oceanic front recognition methods and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8315961B2 (en) * 2009-07-14 2012-11-20 Mitsubishi Electric Research Laboratories, Inc. Method for predicting future environmental conditions
US8930299B2 (en) * 2010-12-15 2015-01-06 Vaisala, Inc. Systems and methods for wind forecasting and grid management

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yan Huang等.Objective Identification of Trough Lines Using Gridded Wind Field Data.atmosphere.2017,1-15. *
王丽颖.冬季北太平洋海盆尺度海洋锋变异及其对大气的影响.中国博士学位论文全文数据库基础科学辑.2018,(第03期),A010-1. *

Also Published As

Publication number Publication date
CN111414991A (en) 2020-07-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant