CN113821986A - Vortex sea surface signal and underwater key parameter inversion method based on random forest - Google Patents
- Publication number
- CN113821986A (application number CN202111382855.1A)
- Authority
- CN
- China
- Prior art keywords
- vortex
- data
- underwater
- value
- depth
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/08—Probabilistic or stochastic CAD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/08—Fluids
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/08—Thermal analysis or thermal optimisation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Algebra (AREA)
- Medical Informatics (AREA)
- Computer Hardware Design (AREA)
- Geometry (AREA)
- Testing Or Calibration Of Command Recording Devices (AREA)
Abstract
A random forest based method for inverting underwater key parameters from vortex sea surface signals is disclosed. The method comprises: computing the intersection of Argo buoy data and an SLA-based vortex identification and tracking data set; statistically analyzing the correlation between the main parameters of the vortex identification data in the intersection and the underwater temperature anomaly data to obtain the correlated parameters; predicting the underwater temperature anomaly extreme value with a random forest algorithm using the correlated parameters; and statistically analyzing the depth of the underwater temperature anomaly extreme value with a probability density function to obtain probability distribution data of that depth. The method inverts the underwater temperature anomaly extreme value of a vortex, and the depth at which it occurs, from the vortex sea surface signal using a random forest algorithm, and tests the inversion result against actual data. It can thus complete the inversion, and the accuracy statistics, of the key underwater temperature anomaly parameters of vortices globally, and provides a reference for underwater research on ocean mesoscale vortices.
Description
Technical Field
The invention belongs to the technical field of ocean information, and particularly relates to a vortex sea surface signal and underwater key parameter inversion method based on random forests.
Background
Ocean vortices are widely distributed in the global ocean. They play an important role in the transport and mixing of matter, energy and heat, and strongly influence marine ecology, multi-scale ocean motion, ocean climate change and the atmospheric environment. Mesoscale vortices, with radii of 10-100 km, are among the most common of these phenomena and are an important component of mesoscale ocean dynamics. Their life cycle typically lasts from a few weeks to several months, and in some cases years. During its life cycle a mesoscale vortex can travel tens to hundreds of kilometers. Mesoscale vortices carry strong kinetic energy, accounting for more than half of the kinetic energy of the global ocean circulation, play an important role in the transport and exchange of matter, heat and momentum in the ocean, and have a profound influence on the ocean environment and climate change. Research on ocean mesoscale vortices therefore has important scientific significance and application value.
A prerequisite for three-dimensional studies of ocean vortices is accurate, time-varying structural information on the three-dimensional flow field, temperature, salinity and biochemical parameters. Existing means of observing mesoscale ocean vortices mainly include remote sensing satellites, research vessels, buoys, submersibles and the Argo observation network. Satellite altimeters provide quasi-synchronous ocean observations with global coverage, which allow mesoscale vortices to be identified and detected and their characteristics and evolution to be analyzed statistically. The Argo global observation program acquires profiles of temperature, salinity and other properties over the vertical structure of the global ocean, and the development of underwater observation platforms represented by the Argo buoy has once again made mesoscale vortices a frontier topic in oceanography. However, few studies so far have addressed the relationship between the sea surface signal of a mesoscale vortex and its underwater observation signal, and the underwater signal cannot be sensed by remote sensing satellites alone.
An algorithm that can invert underwater key parameters from vortex sea surface signals is therefore urgently needed.
Disclosure of Invention
To remedy the deficiencies of the prior art, the invention provides a random forest based method for inverting underwater key parameters from vortex sea surface signals.
The invention uses more than 20 years of Argo buoy data and an SLA-based vortex identification and tracking data set to statistically analyze the basic characteristics of global mesoscale vortices. On this basis, and in combination with the underwater temperature anomaly data of the Argo buoys, key parameters are inverted from the vortex surface characteristics to the underwater temperature anomaly using a random forest algorithm. The key parameters of the underwater temperature anomaly are mainly the underwater temperature anomaly extreme value and the depth at which that extreme value occurs.
To achieve this purpose, the invention adopts the following technical scheme:
A vortex sea surface signal and underwater key parameter inversion method based on random forest comprises the following steps:
(1) computing the intersection of the Argo buoy data and the SLA-based vortex identification and tracking data set;
(2) statistically analyzing the correlation between the main parameters of the vortex identification data in the intersection and the underwater temperature anomaly data to obtain the correlated parameters;
(3) predicting the underwater temperature anomaly extreme value based on a random forest algorithm in combination with the correlated parameters obtained in step (2);
(4) statistically analyzing the depth of the underwater temperature anomaly extreme value based on the probability density function, and obtaining probability distribution data of the underwater depth.
Further, the Argo buoy data used in step (1) span more than 20 years.
Further, the intersection in step (1) is computed as follows:
1-1, calculating the intersection point of the vortex boundary with the extension of the line connecting the buoy and the vortex center
Let the longitude and latitude coordinates of the vortex center be (x0, y0) and those of the Argo buoy be (x1, y1).
If x1 - x0 > 0: call the contains_point() function to judge whether the intersection point (x, y) lies within the vortex; if so, output the longitude and latitude coordinates (x, y) of the intersection point;
if x1 - x0 < 0: call the contains_point() function to judge whether the intersection point (x, y) lies within the vortex; if so, output the longitude and latitude coordinates (x, y) of the intersection point;
1-2, calculating the distance between the intersection point and the vortex center
Let the longitude and latitude coordinates of the vortex center be (x0, y0), those of the Argo buoy be (x1, y1), and those of the intersection point be (x, y).
Finally, the total distance between the intersection point and the vortex center is obtained with the planar rectangular-coordinate distance formula:
$$L_1 = \sqrt{(x - x_0)^2 + (y - y_0)^2}$$
Replacing the intersection coordinates in this formula with the buoy coordinates gives the total distance between the buoy and the vortex center:
$$L_2 = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}$$
1-3, extracting the Argo buoys located inside the vortex
With the total distance between the intersection point and the vortex center denoted L1 and the total distance between the buoy and the vortex center denoted L2, a comparison value a is formed from L1 and L2; if a ≤ 0, the Argo buoy is judged to be located inside the vortex. Traversing the Argo buoy data in this way yields the intersection data set of the Argo temperature anomaly data and the SLA-based vortex identification data.
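The geometric test in steps 1-1 to 1-3 can be illustrated with a minimal, standard-library sketch. It is an assumption-laden illustration rather than the patented implementation: the function names `ray_boundary_intersection` and `buoy_in_vortex` are hypothetical, the vortex boundary is assumed to be a closed polygon of (longitude, latitude) vertices, coordinates are treated as planar as in the distance formula above, and the comparison value is assumed to be a = L2 - L1.

```python
import math


def ray_boundary_intersection(center, buoy, polygon):
    """Nearest point where the ray from `center` through `buoy` crosses the polygon.

    `polygon` is a list of (x, y) vertices of a closed boundary.
    Returns None if the ray crosses no edge.
    """
    x0, y0 = center
    dx, dy = buoy[0] - x0, buoy[1] - y0
    best_t, best_point = None, None
    n = len(polygon)
    for i in range(n):
        (ax, ay), (bx, by) = polygon[i], polygon[(i + 1) % n]
        ex, ey = bx - ax, by - ay
        denom = dx * ey - dy * ex
        if abs(denom) < 1e-12:  # ray is parallel to this edge
            continue
        # Solve center + t * (dx, dy) == a + s * (ex, ey) for t and s.
        t = ((ax - x0) * ey - (ay - y0) * ex) / denom
        s = ((ax - x0) * dy - (ay - y0) * dx) / denom
        if t > 0 and 0 <= s <= 1 and (best_t is None or t < best_t):
            best_t, best_point = t, (x0 + t * dx, y0 + t * dy)
    return best_point


def buoy_in_vortex(center, buoy, polygon):
    """Assumed test: the buoy is inside when a = L2 - L1 <= 0."""
    if buoy == center:
        return True
    hit = ray_boundary_intersection(center, buoy, polygon)
    if hit is None:
        return False
    l1 = math.hypot(hit[0] - center[0], hit[1] - center[1])    # center to boundary
    l2 = math.hypot(buoy[0] - center[0], buoy[1] - center[1])  # center to buoy
    return l2 - l1 <= 0


# Illustrative square "vortex" boundary (made-up coordinates).
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(buoy_in_vortex((2.0, 2.0), (3.0, 2.0), square))
print(buoy_in_vortex((2.0, 2.0), (5.0, 2.0), square))
```

Taking the nearest crossing along the ray is a design choice that suits boundaries for which the vortex center lies inside the polygon; the `contains_point()` check named in the text is not reproduced here.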
Further, in the step (2), the correlation parameters include radius, amplitude and kinetic energy of the vortex sea surface signal.
Further, the step (2) is specifically as follows:
2-1, first, a filter function is introduced: a low-pass filter is applied, and determining the initial conditions of the filter also eliminates data transients; the filter operates within a least-squares framework,
the initial parameter settings are as follows:
2-2, a kernel density estimation function with a Gaussian kernel is then called, with the parameters set as follows:
h is the bandwidth; in the experiment h = 200. The bandwidth acts as a smoothing parameter that controls the trade-off between bias and variance in the result: a large bandwidth gives a very smooth density distribution, while a small bandwidth gives a rough (i.e., high-variance) one. Applying Gaussian kernel density processing to the temperature anomaly and depth data yields a two-dimensional array. An index function returns the indices that sort the array values in ascending order, giving the temperature anomaly extreme value, its depth and the corresponding index values x, y and z; arranging x and y in ascending order of z yields the two-dimensional index table of global vortex sea surface signals and underwater temperature anomaly data.
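For reference, the textbook form of a Gaussian kernel density estimate over n samples x_i with bandwidth h is given below. It is assumed, not confirmed by the text, that this standard expression matches the formula intended in step 2-2.

```latex
\hat{f}_h(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),
\qquad
K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^{2}}{2}\right)
```

In this form, a larger h spreads each kernel over a wider range, which is why the estimate becomes smoother as the bandwidth grows.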
Further, in step (3), the two-dimensional index table file of vortex and underwater characteristic information obtained in step (2) is imported into the random forest algorithm, and the underwater temperature anomaly extreme value is predicted using the radius, kinetic energy and amplitude of the vortex as the main input quantities and information such as the distance between the buoy and the vortex center, the longitude and the latitude as auxiliary input quantities.
Further, step (3) is specifically:
3-1, first, a grid search with cross validation is applied to the two-dimensional index table file, and random forest prediction is applied to the training data set to obtain the predicted temperature anomaly extreme value.
The training set partition formula is as follows:
the data are a normally distributed data set with mean 0 and variance 1, and the number of variables m is a positive integer;
after iterating over the normally distributed data set, ρ is drawn from the uniform distribution on [0, 1];
β_m is a subset, k is the cluster center, and h is the grouping coefficient;
3-2, fitting the training data set and then applying standard normalization; the standard normalization uses Z-score normalization, whose idea is to normalize all data to a distribution with mean 0 and variance 1, with the formula:
$$X_{scale} = \frac{X - X_{mean}}{\sigma}$$
where X_mean is the mean of each feature, σ is the standard deviation of each set of feature values, X is each feature value, and X_scale is the normalized feature value;
3-3, performing random forest prediction on the training data set to obtain the inversion result of the underwater temperature anomaly extreme value based on vortex sea surface signals, where the random forest algorithm formula is as follows:
where the predictor in the formula maps the input value x to the output value y.
Integrating both sides of the formula gives the distribution of the input value x and the predicted value y, together with the mean square error; processing by the random forest prediction algorithm then yields the final temperature anomaly predicted value y.
Further, in step (4), the vortex radius, amplitude, kinetic energy, temperature anomaly and depth of the temperature anomaly in the intersection data set of the Argo temperature anomaly data and the SLA-based vortex identification data obtained in step (2) are used as the main input quantities, and the probability density function from kernel density estimation is used to predict the depth of the underwater temperature anomaly extreme value.
Furthermore, in step (4), let the cumulative distribution function of the experimental data be F(x) and the probability density function be f(x); then:
the empirical distribution function is introduced for the cumulative distribution function:
$$F_n(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(x_i \le t)$$
that is, the proportion of the n observations with x_i ≤ t is used to approximate P(x ≤ t); substituting this function into f(x_i) gives
h is called the bandwidth in kernel density estimation. Its value can be neither too large nor too small: if h is too large, the condition h → 0 is not satisfied; if h is too small, too few sample points are used and the error becomes large.
The bandwidth is selected based on the formula:
where σ is the sample standard deviation and n is the number of samples. Based on experimental analysis, the bandwidth used in the experiment is h = 5.
After the bandwidth is determined, the expression of f(x) is written as:
With the radius, kinetic energy and amplitude of the vortex as the main input quantities, the probability density function from kernel density estimation can be used to predict the distribution of the depth of the underwater temperature anomaly extreme value and to obtain the statistical probability distribution of that depth.
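The bandwidth selection and density estimate described for step (4) can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: the bandwidth rule shown is Silverman's rule of thumb, which uses the same quantities named in the text (σ and n) but is not confirmed to be the formula intended there; the function names are hypothetical; and the depth values are simply the first ten depths listed in Table 1.

```python
import math
import statistics


def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth h = 1.06 * sigma * n^(-1/5) (assumed rule)."""
    sigma = statistics.stdev(samples)  # sample standard deviation
    n = len(samples)
    return 1.06 * sigma * n ** (-1 / 5)


def gaussian_kde(samples, h):
    """Return f(x) = 1/(n*h) * sum K((x - x_i)/h) with a Gaussian kernel K."""
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)

    return density


# First ten depth values from Table 1 (depth of the temperature anomaly extreme).
depths = [30.0, 95.0, 36.0, 41.0, 50.0, 410.0, 83.0, 353.0, 402.0, 466.0]

h = silverman_bandwidth(depths)  # the text reports h = 5 for its own experiment
density = gaussian_kde(depths, h)

# Evaluate the estimated density on a depth grid and report the most probable depth.
grid = list(range(0, 1001, 10))
most_probable = max(grid, key=density)
print(f"bandwidth h = {h:.1f}, most probable depth on grid = {most_probable}")
```

With so few samples the resulting curve is only illustrative; passing a fixed bandwidth such as the experimental h = 5 to `gaussian_kde` produces a much rougher estimate from the same data.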
The advantages and beneficial effects of the invention are as follows:
The invention discloses a method, based on random forest prediction, for inverting the key parameters of the underwater temperature anomaly from vortex sea surface signals. The method has complete theoretical support and, from a big-data-mining perspective, uses the underwater information of Argo buoy data inside vortices to analyze more intuitively the influence of vortex sea surface signals on the key parameters of the underwater temperature anomaly.
The method inverts the underwater temperature anomaly extreme value of a vortex, and the depth at which it occurs, from the vortex sea surface signal using a random forest algorithm, and tests the inversion result against actual data. It can thus complete the inversion, and the accuracy statistics, of the key underwater temperature anomaly parameters of vortices globally, and provides a reference for underwater research on ocean mesoscale vortices.
The invention has high research and application value, both in theoretical research such as ocean dynamics and in practical applications such as national defense and military use.
Drawings
FIG. 1 is a basic flow diagram of the present invention.
FIG. 2 is a graph of the correlation between vortex radius and the temperature anomaly extreme value.
FIG. 3 is a graph of the correlation between vortex amplitude and the temperature anomaly extreme value.
FIG. 4 is a graph of the correlation between vortex kinetic energy and the temperature anomaly extreme value.
FIG. 5 is a global distribution map of the accuracy of the random forest temperature anomaly prediction.
FIG. 6 is a global map of the mean square error of the random forest temperature anomaly prediction.
FIG. 7 is a global depth map.
FIG. 8 is a global depth probability distribution map.
Detailed Description
The invention will be further explained and illustrated by means of specific embodiments and with reference to the drawings.
Example 1:
A random forest based method for inverting key underwater temperature anomaly parameters from vortex sea surface signals; the flow is shown in FIG. 1. The specific operation comprises the following steps:
1. Intersecting Argo buoy data spanning more than 20 years with the SLA-based vortex identification and tracking data set;
Source of the Argo buoy data set: http://www.argo.ucsd.edu.
The SLA-based vortex identification and tracking data set and the CARS2009 climatology data used in this embodiment are open-source data.
First, the Argo buoy data set is preprocessed: the attribute information for 0-1000 m is extracted, namely longitude, latitude, time, pressure, temperature and salinity. In this embodiment, the CARS2009 climatology is subtracted from the Argo data for 0-1000 m underwater to obtain the Argo temperature anomaly data set.
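The differencing step can be sketched as below. This is a hedged illustration only: the function name `temperature_anomaly`, the dictionary layout (depth in meters mapped to temperature) and the sample values are assumptions, and a real workflow would typically interpolate the profile and the climatology onto common depth levels first.

```python
def temperature_anomaly(profile, climatology, max_depth=1000.0):
    """Subtract the climatological temperature from an observed profile.

    Both inputs map depth (m) to temperature; only depths present in both
    and within 0..max_depth are kept.
    """
    return {
        depth: profile[depth] - climatology[depth]
        for depth in sorted(profile)
        if depth in climatology and 0.0 <= depth <= max_depth
    }


# Made-up values for illustration only.
argo_profile = {0.0: 20.5, 100.0: 15.0, 1500.0: 4.0}
climatology = {0.0: 20.0, 100.0: 15.5, 500.0: 9.0}
print(temperature_anomaly(argo_profile, climatology))
```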
The Argo temperature anomaly data set is traversed by time, longitude and latitude against the SLA-based vortex identification data set. At a given time, if the longitude and latitude of an Argo buoy fall within a vortex, the data recorded by the buoy at that time are regarded as in-vortex data. Traversing 20 years of Argo buoy data in this way yields the intersection data set of the Argo temperature anomaly data and the SLA-based vortex identification data. The specific intersection method is as follows:
Calculating the intersection point of the vortex boundary with the extension of the line connecting the buoy and the vortex center:
Let the longitude and latitude coordinates of the vortex center be (x0, y0) and those of the Argo buoy be (x1, y1).
If x1 - x0 > 0: use the contains_point() function to judge whether the intersection point (x, y) lies within the vortex; if so, output the longitude and latitude coordinates (x, y) of the intersection point;
if x1 - x0 < 0: use the contains_point() function to judge whether the intersection point (x, y) lies within the vortex; if so, output the longitude and latitude coordinates (x, y) of the intersection point;
Let the longitude and latitude coordinates of the vortex center be (x0, y0), those of the Argo buoy be (x1, y1), and those of the intersection point be (x, y).
Finally, the total distance between the intersection point and the vortex center is obtained with the planar rectangular-coordinate distance formula:
$$L_1 = \sqrt{(x - x_0)^2 + (y - y_0)^2}$$
Replacing the intersection coordinates in this formula with the buoy coordinates gives the total distance between the buoy and the vortex center:
$$L_2 = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}$$
With the total distance between the intersection point and the vortex center denoted L1 and the total distance between the buoy and the vortex center denoted L2, a comparison value a is formed from L1 and L2; if a ≤ 0, the Argo buoy is judged to be located inside the vortex. Traversing the Argo buoy data in this way yields the intersection data set of the Argo temperature anomaly data and the SLA-based vortex identification data.
2. Statistically analyzing the correlation between the main parameters of the vortex identification data in the intersection and the underwater temperature anomaly data to obtain the main correlated parameters;
First, a filter function is introduced and low-pass filtering is selected. Determining the initial conditions of the low-pass filter also eliminates data transients. The filter operates within a least-squares framework, provides a workable way of interpreting the filtered white noise, and filters the experimental data effectively.
The initial parameter settings are as follows:
the normalized cutoff frequencies are set to 0.2 and 0.25 and the noise sequence length to 200; running the filter gives the filtered temperature anomaly T1 of the Argo profile. For warm vortex data, a maximum function is used to obtain the depth index T2 at which the temperature anomaly maximum is located; for cold vortex data, a minimum function is used to obtain the depth index at which the temperature anomaly extreme value (its minimum) is located.
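The filter-then-locate-extreme step can be sketched with the standard library alone. The low-pass filter actually used is not specified in the text, so a simple forward-backward exponential smoother is swapped in here as a stand-in; `alpha` is a hypothetical smoothing factor and is unrelated to the cutoff frequencies quoted above. The function names and sample anomalies are likewise illustrative assumptions.

```python
def smooth_forward_backward(values, alpha=0.5):
    """Stand-in low-pass filter: exponential smoothing run forward, then backward.

    Running the filter in both directions cancels the phase lag, and starting
    each pass at the first sample limits the start-up transient.
    """
    def one_pass(seq):
        out, state = [], seq[0]
        for v in seq:
            state = alpha * v + (1 - alpha) * state
            out.append(state)
        return out

    forward = one_pass(list(values))
    backward = one_pass(forward[::-1])
    return backward[::-1]


def extreme_depth_index(anomaly, vortex_type):
    """Index of the anomaly maximum for a warm vortex, minimum for a cold one."""
    indices = range(len(anomaly))
    if vortex_type == "warm":
        return max(indices, key=anomaly.__getitem__)
    if vortex_type == "cold":
        return min(indices, key=anomaly.__getitem__)
    raise ValueError("vortex_type must be 'warm' or 'cold'")


# Made-up temperature anomaly profile, ordered by increasing depth level.
raw = [0.1, 0.3, 0.9, 0.4, 0.2]
filtered = smooth_forward_backward(raw)
print(extreme_depth_index(filtered, "warm"))
```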
Then, a kernel density estimation function with a Gaussian kernel is called, with the parameters set as follows:
h is the bandwidth; in the experiment h = 200. The bandwidth acts as a smoothing parameter that controls the trade-off between bias and variance in the result: a large bandwidth gives a very smooth density distribution, while a small bandwidth gives a rough (i.e., high-variance) one. Applying Gaussian kernel density processing to the temperature anomaly and depth data yields a two-dimensional array. An index function returns the indices that sort the array values in ascending order, giving the temperature anomaly extreme value, its depth and the corresponding index values x, y and z; arranging x and y in ascending order of z yields the two-dimensional index table of global vortex sea surface signals and underwater temperature anomaly data. Table 1 shows an excerpt of this index table for part of the region.
Table 1. Excerpt of the two-dimensional index table of global vortex sea surface signals and underwater temperature anomaly data (partial region)
Vortex radius | Vortex kinetic energy | Vortex amplitude | Distance of buoy from vortex center | Buoy longitude | Buoy latitude | Vortex center longitude | Vortex center latitude | Temperature anomaly | Depth |
84.40 | 23.05 | 3.40 | 0.73 | 18.02 | 37.29 | 17.93 | 36.78 | 0.37 | 30.00 |
88.94 | 33.67 | 3.85 | 0.17 | 18.04 | 36.90 | 18.10 | 36.83 | 0.69 | 95.00 |
78.91 | 28.38 | 3.76 | 0.57 | 18.14 | 37.19 | 18.35 | 36.87 | 0.43 | 36.00 |
57.05 | 11.57 | 1.58 | 0.92 | 12.23 | 38.77 | 12.57 | 39.03 | 1.02 | 41.00 |
106.08 | 10.63 | 2.65 | 0.99 | 12.16 | 38.67 | 12.26 | 39.15 | 0.61 | 50.00 |
84.52 | 94.62 | 7.04 | 0.54 | 18.53 | 34.16 | 18.80 | 34.38 | 0.85 | 410.00 |
90.33 | 19.90 | 2.99 | 0.58 | 17.44 | 36.68 | 17.94 | 36.72 | 0.49 | 83.00 |
90.14 | 44.81 | 3.35 | 0.98 | 18.84 | 32.64 | 17.96 | 33.74 | 0.84 | 353.00 |
114.63 | 68.02 | 5.75 | 0.77 | 18.87 | 32.72 | 17.84 | 33.96 | 1.03 | 402.00 |
63.55 | 60.66 | 3.67 | 0.68 | 18.15 | 32.42 | 18.32 | 32.66 | 1.11 | 466.00 |
110.18 | 93.86 | 6.06 | 0.09 | 18.05 | 33.54 | 18.17 | 33.50 | 0.45 | 135.00 |
84.19 | 83.76 | 4.63 | 0.45 | 18.20 | 33.33 | 18.42 | 33.04 | 0.63 | 161.00 |
88.28 | 85.61 | 5.04 | 0.71 | 18.37 | 32.50 | 18.40 | 33.04 | 1.15 | 479.00 |
88.55 | 66.56 | 5.14 | 0.41 | 18.36 | 32.53 | 18.42 | 32.80 | 1.10 | 450.00 |
82.79 | 86.43 | 5.30 | 0.41 | 18.53 | 32.66 | 18.45 | 32.90 | 1.06 | 440.00 |
99.26 | 49.71 | 3.33 | 0.37 | 18.57 | 32.65 | 18.44 | 32.83 | 0.91 | 398.00 |
91.08 | 49.44 | 4.18 | 0.32 | 17.98 | 33.54 | 17.83 | 33.90 | 0.42 | 123.00 |
108.26 | 56.27 | 5.04 | 0.84 | 18.79 | 32.78 | 17.67 | 33.98 | 1.04 | 460.00 |
103.56 | 69.07 | 5.75 | 0.50 | 17.78 | 33.47 | 17.61 | 34.04 | 0.64 | 128.00 |
86.35 | 85.41 | 5.14 | 0.49 | 18.15 | 33.30 | 18.41 | 32.98 | 0.58 | 92.00 |
118.14 | 93.16 | 7.53 | 0.51 | 17.83 | 33.46 | 17.53 | 34.15 | 0.75 | 160.00 |
103.01 | 75.18 | 5.45 | 0.44 | 17.88 | 33.48 | 17.86 | 33.95 | 0.68 | 150.00 |
120.09 | 56.93 | 4.56 | 0.51 | 18.14 | 33.32 | 18.43 | 32.81 | 0.63 | 162.00 |
54.91 | 49.99 | 2.75 | 0.86 | 18.29 | 33.35 | 18.42 | 32.93 | 0.58 | 140.00 |
112.64 | 109.06 | 6.76 | 0.94 | 18.41 | 32.39 | 18.36 | 33.10 | 0.35 | 135.00 |
90.13 | 76.17 | 6.95 | 0.86 | 17.22 | 36.00 | 18.04 | 35.24 | 0.04 | 123.00 |
105.94 | 72.58 | 7.69 | 0.37 | 18.18 | 35.92 | 17.80 | 35.37 | 0.23 | 39.00 |
102.46 | 72.86 | 8.28 | 0.83 | 17.58 | 36.01 | 18.06 | 35.19 | 0.47 | 53.00 |
107.62 | 71.25 | 7.33 | 0.71 | 18.39 | 35.92 | 17.77 | 35.38 | 0.03 | 30.00 |
68.48 | 24.45 | 2.20 | 0.58 | 16.92 | 32.36 | 17.14 | 32.66 | 1.30 | 127.00 |
92.34 | 61.26 | 6.47 | 0.53 | 17.82 | 35.85 | 18.13 | 35.19 | 0.35 | 46.00 |
102.36 | 61.56 | 7.18 | 0.66 | 17.53 | 35.92 | 18.03 | 35.25 | 0.35 | 45.00 |
90.13 | 77.00 | 7.11 | 0.58 | 18.03 | 35.94 | 17.88 | 35.26 | 0.16 | 44.00 |
85.69 | 85.33 | 7.05 | 0.45 | 17.96 | 35.81 | 18.05 | 35.31 | 0.14 | 48.00 |
98.36 | 91.05 | 5.80 | 0.78 | 19.38 | 35.53 | 18.88 | 35.78 | 0.99 | 252.00 |
84.82 | 117.20 | 7.91 | 0.79 | 18.65 | 35.35 | 19.07 | 35.63 | 1.09 | 277.00 |
100.90 | 112.23 | 7.23 | 0.40 | 19.30 | 35.79 | 18.93 | 35.82 | 1.14 | 367.00 |
41.10 | 18.11 | 1.06 | 0.37 | 18.79 | 32.28 | 18.90 | 32.21 | 1.08 | 81.00 |
61.49 | 12.24 | 1.71 | 0.90 | 11.18 | 38.81 | 11.75 | 38.72 | 0.56 | 38.00 |
68.88 | 64.02 | 3.61 | 0.16 | 18.64 | 35.78 | 18.70 | 35.73 | 0.89 | 59.00 |
133.90 | 142.78 | 7.20 | 0.74 | 17.52 | 34.11 | 17.49 | 33.66 | 0.50 | 64.00 |
Plotting the global vortex sea surface signals against the underwater temperature anomaly data from the index table shows that the radius, amplitude and kinetic energy of the vortex sea surface signal correlate clearly with the underwater temperature anomaly. FIGS. 2, 3 and 4 show the correlation of the temperature anomaly extreme value with vortex radius, vortex amplitude and vortex kinetic energy, respectively.
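One simple way to quantify such a relationship is a Pearson correlation coefficient per surface parameter. The sketch below is an assumption about how the correlation might be computed, not the statistic used for FIGS. 2 to 4; the `pearson` helper is hypothetical, and the data are just the first eight rows of Table 1, far too few to reproduce the global result.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# First eight rows of Table 1: radius, kinetic energy, amplitude, temperature anomaly.
radius = [84.40, 88.94, 78.91, 57.05, 106.08, 84.52, 90.33, 90.14]
kinetic_energy = [23.05, 33.67, 28.38, 11.57, 10.63, 94.62, 19.90, 44.81]
amplitude = [3.40, 3.85, 3.76, 1.58, 2.65, 7.04, 2.99, 3.35]
temperature_anomaly = [0.37, 0.69, 0.43, 1.02, 0.61, 0.85, 0.49, 0.84]

for name, values in [("radius", radius),
                     ("kinetic energy", kinetic_energy),
                     ("amplitude", amplitude)]:
    print(f"{name}: r = {pearson(values, temperature_anomaly):+.2f}")
```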
3. Predicting the underwater temperature anomaly extreme value based on the random forest algorithm and the main correlated parameters obtained in step (2), and obtaining the prediction accuracy and root mean square error.
The two-dimensional index table file of vortex and underwater characteristic information obtained in step (2) is imported into the random forest algorithm, and the underwater temperature anomaly extreme value is predicted using the radius, kinetic energy and amplitude of the vortex as the main input quantities and information such as the distance between the buoy and the vortex center, the longitude and the latitude as auxiliary input quantities. The specific operation is as follows:
First, grid search with cross validation (GridSearchCV) is applied to the two-dimensional index table file. Grid search steps through the specified parameter range, trains the learner with each parameter setting, and selects the setting with the highest accuracy on the validation set. GridSearchCV guarantees that the most accurate parameters within the specified range are found; once the best model is found during cross validation, it is retrained on the whole training set so that the additional data improve its performance. Applying grid search to the two-dimensional index table file thus gives the best parameters on the whole training set.
The training data are split with the function train_test_split(x, y, test_size=0.2, random_state=50), where x is the input data, y is the target data, the test sample fraction test_size is set to 0.2 and the random seed random_state is set to 50. Random forest prediction is then applied to the training data set to obtain the predicted temperature anomaly extreme value.
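The split, grid search and random forest prediction described above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions rather than the embodiment's code: the data are synthetic (the feature ranges loosely follow the columns of Table 1), the parameter grid and the use of a `Pipeline` with `StandardScaler` are illustrative choices, and only `test_size=0.2` and `random_state=50` are taken from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the index table: radius, kinetic energy, amplitude,
# buoy-to-center distance, longitude, latitude (all values are made up).
rng = np.random.default_rng(0)
n_samples = 200
x = rng.uniform([40, 10, 1, 0, -180, -60], [135, 145, 9, 1, 180, 60],
                size=(n_samples, 6))
# Synthetic target standing in for the temperature anomaly extreme value.
y = 0.01 * x[:, 1] + 0.05 * x[:, 2] + rng.normal(0, 0.05, n_samples)

# Split exactly as stated in the text.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=50)

# Z-score scaling followed by a random forest; the grid below is hypothetical.
pipeline = make_pipeline(StandardScaler(), RandomForestRegressor(random_state=50))
param_grid = {
    "randomforestregressor__n_estimators": [20, 50],
    "randomforestregressor__max_depth": [3, None],
}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(x_train, y_train)  # refits the best model on the whole training set

predictions = search.predict(x_test)
print(search.best_params_)
print("R2 on held-out data:", search.score(x_test, y_test))
```

Placing the scaler inside the pipeline means it is fitted only on the training folds during cross validation, which avoids leaking test-set statistics into the model.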
The training set partition formula is as follows:
after iterating over the normally distributed data set, ρ is drawn from the uniform distribution on [0, 1].
β_m is a subset, k is the cluster center, and h is the grouping coefficient; in this experiment h = 15.
The training data set is fitted with a data preprocessing function, and the data are then normalized with a normalization function. The standard normalization uses Z-score normalization, whose idea is to normalize all data to a distribution with mean 0 and variance 1, with the formula:
$$X_{scale} = \frac{X - X_{mean}}{\sigma}$$
where X_mean is the mean of each feature, σ is the standard deviation of each set of feature values, X is each feature value, and X_scale is the normalized feature value.
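As a small worked example of Z-score normalization, the sketch below standardizes one feature column. The helper name `z_score` is hypothetical, the population standard deviation is an assumed convention, and the values are the first five vortex radii from Table 1.

```python
import statistics


def z_score(values):
    """Scale values to mean 0 and unit variance: (X - X_mean) / sigma."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("cannot normalize a constant feature")
    return [(v - mean) / sigma for v in values]


radii = [84.40, 88.94, 78.91, 57.05, 106.08]  # first five radii in Table 1
scaled = z_score(radii)
print([round(v, 3) for v in scaled])
```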
Random forest prediction is performed on the training data set with a random forest function to obtain the inversion result of the underwater temperature anomaly extreme value based on the vortex sea surface signal, where the random forest algorithm formula is as follows:
where the predictor in the formula maps the input value x to the output value y.
Integrating both sides of the formula gives the distribution of the input value x and the predicted value y, together with the mean square error; processing by the random forest prediction algorithm then yields the final temperature anomaly predicted value y.
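The random forest prediction accuracy R2 and the mean square error mse are computed with the following formulas, given here in their standard form:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}, \qquad mse = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the true values, and n is the number of samples. The closer the accuracy value is to 1, the better the prediction accuracy. The mean square error takes the difference between the true value and the predicted value, squares it, and averages the result, measuring the deviation between the predicted values and the true values.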
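The two metrics can be checked by hand on a tiny example. The sketch below assumes that R2 is the usual coefficient of determination and that the mean square error is the average squared difference between true and predicted values; the numbers are made up for illustration.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]   # hypothetical temperature anomalies
predicted = [1.5, 2.0, 2.5, 4.0]  # hypothetical model output
print(mse(observed, predicted), r2(observed, predicted))
```

Here two of the four predictions are off by 0.5, giving a mean square error of 0.125 and an R2 of about 0.9.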
The calculated data are converted into a two-dimensional array and plotted with Python to obtain the global distribution of random forest prediction accuracy for temperature anomaly (figure 5) and the global mean square error of the random forest prediction of temperature anomaly (figure 6). As the figures show, the accuracy of predicting the underwater temperature anomaly with random forest exceeds 0.6 globally, indicating good fitting performance, and it approaches 1 in the northwest Pacific and Indian Ocean regions, indicating excellent fitting performance. The global mean of the root mean square error is about 0.8, which meets the error requirement and shows that the distribution of underwater temperature anomaly extreme values can be inverted with a random forest algorithm.
3. Statistical analysis of the depth of the underwater temperature anomaly extreme value is carried out based on the probability density function to obtain the probability distribution data of the depth.
The vortex radius, amplitude, kinetic energy, temperature anomaly and the depth of the temperature anomaly are imported from the intersection data set of the Argo temperature anomaly data and the SLA-based vortex identification data obtained in step (2). With the vortex radius, kinetic energy and amplitude as the main input quantities, the depth of the underwater temperature anomaly extreme value is predicted using the probability density function of kernel density estimation.
Kernel density estimation is a nonparametric method for estimating the probability density function of a random variable. It is a density estimation method for continuous data that is derived from the relationships among the data themselves and requires no assumption about the data distribution. Given a bandwidth h, each sample is fitted with a smooth kernel function, and the density value at a given data point can be regarded as the average influence of all other samples on that point. Let the cumulative distribution function of the experimental data be F(x) and the probability density function be f(x); then:

$$f(x_i) = F'(x_i) = \lim_{h \to 0} \frac{F(x_i + h) - F(x_i - h)}{2h}$$

Introducing the empirical distribution function of the cumulative distribution function:

$$F_n(t) = \frac{1}{n}\sum_{j=1}^{n} \mathbf{1}_{x_j \le t}$$

that is, the ratio of the number of the n observations with $x_j \le t$ to n is used to approximate $P(x \le t)$. Substituting this function into $f(x_i)$ gives

$$f(x_i) = \lim_{h \to 0} \frac{1}{2nh}\sum_{j=1}^{n} \mathbf{1}_{x_i - h \le x_j \le x_i + h}$$
h is called the bandwidth in kernel density estimation. Its value can be neither too large nor too small: if h is too large, the condition h → 0 is not satisfied; if h is too small, too few sample points are used and the error becomes large.
Bandwidth selection is based on the formula:
where σ is the standard deviation of the sample and n is the number of samples. Based on experimental analysis, the bandwidth adopted in the experiment is h = 5.
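A widely used bandwidth rule that depends only on the sample standard deviation σ and the sample count n is Silverman's rule of thumb, h ≈ 1.06 σ n^(-1/5). The sketch below assumes that rule purely for illustration; the text does not confirm which formula was used, and the experiment ultimately fixed h = 5 empirically. The sample values are hypothetical.

```python
import statistics


def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth h = 1.06 * sigma * n**(-1/5) (assumed rule, see lead-in)."""
    sigma = statistics.stdev(samples)  # sample standard deviation
    n = len(samples)
    return 1.06 * sigma * n ** (-1 / 5)


samples = [float(v) for v in range(32)]  # hypothetical data: 0, 1, ..., 31
h = silverman_bandwidth(samples)
print(round(h, 4))
```

Because the rule shrinks h only as n^(-1/5), the bandwidth decreases slowly as more samples are added, which is one reason a fixed value chosen by experiment can work comparably well in practice.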
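After determining the bandwidth, the expression of f(x) can be written as:

$$f(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$

where K is the kernel function.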
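A kernel density estimate with a fixed bandwidth can be sketched in a few lines. The example assumes a Gaussian kernel and uses the bandwidth h = 5 quoted in the text; the depth values are hypothetical and serve only to show the mechanics.

```python
import math


def gaussian_kde(samples, h):
    """Return a density function f(x) = 1/(n*h) * sum K((x - x_i)/h) with a Gaussian K."""
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)

    return density


# Hypothetical depths (m) of temperature anomaly extreme values.
depths_m = [180.0, 220.0, 250.0, 260.0, 270.0, 300.0, 400.0]
f_hat = gaussian_kde(depths_m, h=5.0)

# Evaluate the density on a 0.5 m grid from 100 m to 500 m and find its peak.
grid = [100.0 + 0.5 * k for k in range(801)]
peak_depth = max(grid, key=f_hat)
print(peak_depth)
```

The estimated density peaks where the samples cluster, which is how a most probable depth and its probability can be read off the estimate.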
With the radius, kinetic energy and amplitude of the vortex as the main input quantities, the distribution of the depth of the underwater temperature anomaly extreme value can be predicted using the probability density function of kernel density estimation, and the statistical probability distribution of the depth can be obtained at the same time. The final statistical results are shown in the global depth distribution diagram in fig. 7 and the global depth probability distribution diagram in fig. 8. As the figures show, the global average depth of the temperature anomaly extreme value is about 260 m, and the depth can reach 400 m in the central Pacific and the North Atlantic. The global mean probability accuracy of the predicted depth is 0.8, which meets the accuracy requirement, and the accuracy can reach 1 in the northern Indian Ocean and the eastern Pacific. This shows that the probability density function of kernel density estimation can predict the distribution of the depth of the underwater temperature anomaly extreme value.
As the example shows, once parameter setting is complete, the underwater temperature anomaly extreme value of a vortex and its depth can be inverted from the vortex sea surface signal, and the inversion result can be checked for accuracy against actual data. Inversion and accuracy statistics of the key parameters of global vortex underwater temperature anomaly can thus be completed, which provides a reference for underwater research on ocean mesoscale vortices.
Claims (9)
1. A vortex sea surface signal and underwater key parameter inversion method based on random forests is characterized by comprising the following steps:
(1) computing the intersection of the Argo buoy data and the SLA-based vortex identification and tracking data set;
(2) calculating the correlation between the main parameters of the vortex identification data in the intersection and the underwater temperature anomaly data to obtain correlation parameters;
(3) predicting the underwater temperature anomaly extreme value based on a random forest algorithm in combination with the correlation parameters obtained in step (2);
(4) carrying out statistical analysis on the depth of the underwater temperature anomaly extreme value based on a probability density function to obtain probability distribution data of the underwater depth.
2. The inversion method of claim 1, wherein the Argo buoy data in step (1) span a collection period of more than 20 years.
3. The inversion method of claim 1, wherein the intersection in step (1) is specifically:
1-1: solving the intersection point of the extension line of the connecting line of the buoy and the vortex center and the vortex boundary;
1-2: calculating the distance between the intersection point and the vortex center;
1-3: extracting the Argo buoys located inside a vortex: the distance between the intersection point and the vortex center is L1, and the distance between the buoy and the vortex center is L2; if L2 is less than L1, the Argo buoy is judged to be located inside the vortex; the Argo buoy data are traversed with this method to obtain the intersection data set of the Argo temperature anomaly data and the SLA-based vortex identification data.
4. The inversion method of claim 1, wherein in step (2), the correlation parameters include radius, amplitude, and kinetic energy of vortex sea surface signals.
5. The inversion method of claim 4, wherein the step (2) is specifically:
2-1: first, a filter function is introduced; its treatment of the initial conditions also eliminates data transients; it operates under a least-squares framework,
wherein Y_fb and Y_bf are initial values;
the initial parameter settings are as follows:
2-2: a kernel density estimation function is then called; the Gaussian kernel density estimation formula is as follows:
where h is the bandwidth; the temperature anomaly and depth data are processed by Gaussian kernel density estimation to obtain a two-dimensional array table; the indices that sort the two-dimensional array values in ascending order are returned to obtain the temperature anomaly extreme value, the depth, and their index values x, y and z; x and y are arranged in ascending order of z to obtain the two-dimensional index table of the global vortex sea surface signals and underwater temperature anomaly data.
6. The inversion method of claim 1, wherein in step (3): the two-dimensional index table file of vortex and underwater characteristic information obtained in step (2) is imported into a random forest algorithm, and the underwater temperature anomaly extreme value is predicted with the radius, kinetic energy and amplitude of the vortex as main input quantities and information such as the distance between the buoy and the vortex center, the longitude and the latitude as auxiliary input quantities.
7. The inversion method of claim 6, wherein the step (3) is specifically:
3-1: first, grid search with cross validation is applied to the two-dimensional index table file; random forest prediction is applied to the training data set to obtain the predicted temperature anomaly extreme value;
the training set partition formula is as follows:
the data is a normal distribution data set with the mean value of 0 and the variance of 1, and the variable number m is a positive integer;
the normally distributed data set is taken from the averaged set; after iteration, ρ is drawn from the uniform distribution on [0, 1];
β_m is a subset, k is the cluster center, and h is the grouping coefficient;
3-2: fitting the training data set, and then performing standard normalization processing; the standard normalization uses Z-score normalization, which rescales all data to a distribution with a mean of 0 and a variance of 1, and the formula is as follows:

$$X_{scale} = \frac{X - X_{mean}}{\sigma}$$

wherein $X_{mean}$ is the mean of the features; $\sigma$ is the standard deviation of each set of feature values; $X$ is each feature value; and $X_{scale}$ is the normalized feature value;
3-3: random forest prediction is carried out on the training data set to obtain an inversion result of the abnormal extreme value of the underwater temperature based on the vortex sea surface signal, and the random forest algorithm formula is as follows:
wherein the predictor maps the input value x to the output value y;
integrating both sides of the formula gives the distribution of the input value x and the predicted value y, together with the mean square error; the final predicted temperature anomaly y is obtained through the random forest prediction algorithm.
8. The inversion method of claim 1, wherein in step (4) the depth of the underwater temperature anomaly extreme value is predicted using a probability density function of kernel density estimation, based on the vortex radius, the vortex kinetic energy, the vortex amplitude and the depth information of the temperature anomaly in the intersection data set of the Argo temperature anomaly data and the SLA-based vortex identification data obtained in step (2).
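9. The inversion method of claim 8, wherein in step (4), if the cumulative distribution function is F(x) and the probability density function is f(x), then:

$$f(x_i) = F'(x_i) = \lim_{h \to 0} \frac{F(x_i + h) - F(x_i - h)}{2h}$$

introducing the empirical distribution function of the cumulative distribution function:

$$F_n(t) = \frac{1}{n}\sum_{j=1}^{n} \mathbf{1}_{x_j \le t}$$

the ratio of the number of the n observations with $x_j \le t$ to n is used to approximate $P(x \le t)$, and substituting this function into $f(x_i)$ gives

$$f(x_i) = \lim_{h \to 0} \frac{1}{2nh}\sum_{j=1}^{n} \mathbf{1}_{x_i - h \le x_j \le x_i + h}$$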
h is also referred to as bandwidth in kernel density estimation;
bandwidth selection is based on the formula:
wherein, sigma is the standard deviation of the samples, and n is the number of the samples;
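after determining the bandwidth, the expression of f(x) is written as:

$$f(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$

wherein K is the kernel function;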
with the radius, kinetic energy and amplitude of the vortex as main input quantities, the distribution of the depth of the underwater temperature anomaly extreme value is predicted using the probability density function of kernel density estimation, and the statistical probability distribution of the depth is obtained at the same time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111382855.1A CN113821986B (en) | 2021-11-22 | 2021-11-22 | Vortex sea surface signal and underwater key parameter inversion method based on random forest |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113821986A true CN113821986A (en) | 2021-12-21 |
CN113821986B CN113821986B (en) | 2022-02-22 |
Family
ID=78917976
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111382855.1A Active CN113821986B (en) | 2021-11-22 | 2021-11-22 | Vortex sea surface signal and underwater key parameter inversion method based on random forest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113821986B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114494894A (en) * | 2022-04-18 | 2022-05-13 | 中国海洋大学 | Ocean black vortex automatic identification and key parameter inversion method and device and electronic equipment |
CN115291615A (en) * | 2022-10-10 | 2022-11-04 | 中国海洋大学 | Self-adaptive vortex tracking observation system and control method and device thereof |
CN115797734A (en) * | 2023-02-07 | 2023-03-14 | 慧铁科技有限公司 | Method for representing and processing discrete data of railway train fault form |
CN116151136A (en) * | 2023-04-24 | 2023-05-23 | 浙江大学 | Global surface sea water pH inversion method and system based on probability error compensation |
CN116306318A (en) * | 2023-05-12 | 2023-06-23 | 青岛哈尔滨工程大学创新发展中心 | Three-dimensional ocean thermal salt field forecasting method, system and equipment based on deep learning |
CN116629026A (en) * | 2023-07-18 | 2023-08-22 | 中国海洋大学 | BP neural network-based vortex nuclear underwater maximum temperature anomaly inversion method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101513591B1 (en) * | 2014-08-26 | 2015-04-21 | (주)비엔티솔루션 | System for providing real time ocean spatial data using web 3d |
CN105894439A (en) * | 2016-04-05 | 2016-08-24 | 中国海洋大学 | Ocean eddy and Argo buoy intersection data rapid extraction algorithm based on CUDA |
CN107784667A (en) * | 2016-08-24 | 2018-03-09 | 中国海洋大学 | Based on parallel global ocean mesoscale eddy Fast Recognition Algorithm |
CN109543356A (en) * | 2019-01-07 | 2019-03-29 | 福州大学 | Consider the ocean interior temperature-salinity structure remote sensing inversion method of Space atmosphere |
CN111242206A (en) * | 2020-01-08 | 2020-06-05 | 吉林大学 | High-resolution ocean water temperature calculation method based on hierarchical clustering and random forests |
CN112883564A (en) * | 2021-02-01 | 2021-06-01 | 中国海洋大学 | Water body temperature prediction method and prediction system based on random forest |
Non-Patent Citations (5)
Title |
---|
HE HONGLIN: "Uncertainty analysis of eddy flux measurements in typical ecosystems of ChinaFlux", 《ECOLOGICAL INFORMATICS》 * |
刘炜: "基于随机森林算法的吴堡站测流断面形态预测", 《人民黄河》 * |
刘长东: "海洋多源数据获取及基于多源数据的海域管理信息系统", 《中国博士学位论文全文数据库》 * |
孙春健: "卫星遥感重构海洋次表层研究进展", 《海洋信息》 * |
马纯永: "Altimeter Observation-Based Eddy Nowcasting Using an Improved Conv-LSTM Network", 《REMOTE SENSING》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114494894A (en) * | 2022-04-18 | 2022-05-13 | 中国海洋大学 | Ocean black vortex automatic identification and key parameter inversion method and device and electronic equipment |
CN115291615A (en) * | 2022-10-10 | 2022-11-04 | 中国海洋大学 | Self-adaptive vortex tracking observation system and control method and device thereof |
CN115291615B (en) * | 2022-10-10 | 2023-02-28 | 中国海洋大学 | Self-adaptive vortex tracking observation system and control method and device thereof |
CN115797734A (en) * | 2023-02-07 | 2023-03-14 | 慧铁科技有限公司 | Method for representing and processing discrete data of railway train fault form |
CN116151136A (en) * | 2023-04-24 | 2023-05-23 | 浙江大学 | Global surface sea water pH inversion method and system based on probability error compensation |
CN116151136B (en) * | 2023-04-24 | 2023-06-27 | 浙江大学 | Global surface sea water pH inversion method and system based on probability error compensation |
CN116306318A (en) * | 2023-05-12 | 2023-06-23 | 青岛哈尔滨工程大学创新发展中心 | Three-dimensional ocean thermal salt field forecasting method, system and equipment based on deep learning |
CN116629026A (en) * | 2023-07-18 | 2023-08-22 | 中国海洋大学 | BP neural network-based vortex nuclear underwater maximum temperature anomaly inversion method |
CN116629026B (en) * | 2023-07-18 | 2023-09-26 | 中国海洋大学 | BP neural network-based vortex nuclear underwater maximum temperature anomaly inversion method |
Also Published As
Publication number | Publication date |
---|---|
CN113821986B (en) | 2022-02-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113821986B (en) | Vortex sea surface signal and underwater key parameter inversion method based on random forest | |
CN109541172B (en) | Soil attribute value calculation method and device | |
Reale et al. | A global climatology of explosive cyclones using a multi-tracking approach | |
CN104807589B (en) | A kind of ONLINE RECOGNITION method collecting flow pattern of gas-liquid two-phase flow in defeated-riser systems | |
CN103336305B (en) | A kind of method dividing Sandstone Gas Reservoir high water cut based on gray theory | |
CN107784165B (en) | Surface temperature field multi-scale data assimilation method based on photovoltaic power station | |
KR102479804B1 (en) | Method, device and program for measuring water level, volume, inflow and pollution level using an artificial intelligence model that reflects regional and seasonal characteristics | |
Dikbas et al. | Defining homogeneous regions for streamflow processes in Turkey using a K-means clustering method | |
CN111695473A (en) | Tropical cyclone strength objective monitoring method based on long-time and short-time memory network model | |
CN112687356B (en) | Method and device for distinguishing organic carbon vertical distribution model, terminal and storage medium | |
CN112907113B (en) | Vegetation change cause identification method considering spatial correlation | |
CN116796799A (en) | Method for creating small-river basin flood rainfall threshold model in area without hydrologic data | |
CN114492540B (en) | Training method and device of target detection model, computer equipment and storage medium | |
CN114021445B (en) | Ocean vortex mixing non-locality prediction method based on random forest model | |
Song et al. | Hyperspectral data spectrum and texture band selection based on the subspace-rough set method | |
Ou et al. | Estimation of sound speed profiles using a random forest model with satellite surface observations | |
CN114563771A (en) | Double-threshold laser radar cloud layer detection algorithm based on cluster analysis | |
CN114528729A (en) | Method for predicting yield of buried hill fracture gas reservoir based on multi-scale coupling | |
Shi et al. | Application and comparing of IDW and Kriging interpolation in spatial rainfall information | |
Hinrichs et al. | The Baltic and North Seas Climatology (BNSC)—A Comprehensive, Observation-Based Data Product of Atmospheric and Hydrographic Parameters | |
Oteng Mensah et al. | Modeling monthly actual evapotranspiration: an application of geographically weighted regression technique in the Passaic River Basin | |
Wang et al. | Bayesian networks precipitation model based on hidden Markov analysis and its application | |
CN114755387B (en) | Water body monitoring point location optimization method based on hypothesis testing method | |
Kislov et al. | Extreme Values of Wind Speed over the Kara Sea Based on the ERA5 Dataset | |
Zheng et al. | A Hybrid Approach for Soil Total Nitrogen Anomaly Detection Integrating Machine Learning and Spatial Statistics. Agronomy 2023, 13, 2669 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||