CN116881721B - Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm - Google Patents


Publication number
CN116881721B
Authority
CN
China
Prior art keywords
data
vegetation
cygnss
gnss
optical thickness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310918757.8A
Other languages
Chinese (zh)
Other versions
CN116881721A (en
Inventor
布金伟
张永凤
左小清
朱大明
李勇发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CN202310918757.8A
Publication of CN116881721A
Application granted
Publication of CN116881721B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G06F18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/188 Vegetation

Abstract

The invention discloses a method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data with an integrated (ensemble) machine learning algorithm. The method comprises the following steps: acquiring satellite-borne GNSS-R data from the CYGNSS satellites together with SMAP and MODIS data; preprocessing and quality-controlling all acquired data sets; spatio-temporal matching and division of the data; extracting CYGNSS reflection point data over vegetation by using MODIS land cover type data; removing CYGNSS reflection point data over inland open water by using MODIS inland open water data; calculating the CYGNSS surface reflectivity; constructing a satellite-borne GNSS-R vegetation optical thickness inversion model; and inputting the test data set into the trained ET model to obtain inverted vegetation optical thickness values and evaluating the results. The invention uses the integrated machine learning algorithm to fuse the bistatic radar cross section, the effective scattering area, the CYGNSS variable parameters and the surface auxiliary parameters into an extremely randomized trees (ET) integrated machine learning model, and can achieve vegetation optical thickness inversion with high accuracy and high spatio-temporal resolution over the global land area.

Description

Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm
Technical Field
The invention belongs to the technical field of GNSS vegetation optical thickness inversion, and particularly relates to a method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm.
Background
The vegetation optical thickness (Vegetation Optical Depth, VOD) is an important parameter for characterizing vegetation change in microwave remote sensing inversion. As a microwave-based vegetation index that is sensitive to vegetation moisture and biomass content, it is increasingly used to study the influence of global climate and environmental change on vegetation. Because VOD is sensitive to, and positively correlated with, changes in vegetation growth, its trend is commonly used in practice to reflect the growth condition of vegetation. Developing a more feasible VOD inversion algorithm is therefore of practical significance for agricultural monitoring, vegetation biomass estimation, soil moisture inversion and similar applications. Remote sensing is a comprehensive, macroscopic, rapid and real-time means of observation and one of the effective tools for monitoring surface parameters. However, conventional remote sensing techniques are susceptible to weather conditions (e.g., areas with high cloud cover) and to surface disturbances (e.g., soil conditions and terrain), which degrades the accuracy of VOD inversion.
Global Navigation Satellite System Reflectometry (GNSS-R) is an emerging technology that has developed rapidly since the 1990s and detects surface information from the reflection characteristics of GNSS signals. A receiver carried on a satellite receives both the direct signal from the GNSS satellites and the echo signal reflected by the surface. Processing yields a delay-Doppler map (DDM), which corresponds to the two-dimensional correlation power of the reflected signal, and physical parameters of the scattering surface can then be obtained through an inversion method. The technology is currently one of the research hotspots in remote sensing at home and abroad. Besides rich signal sources, low cost and high spatio-temporal resolution, satellite-borne GNSS-R uses L-band signals, which readily penetrate clouds and the atmosphere and therefore allow day-and-night, all-weather monitoring. GNSS-R is thus of great significance for all-weather, all-day, wide-coverage VOD inversion.
At present, research at home and abroad on VOD inversion, which is based mainly on microwave remote sensing soil moisture algorithms, falls mainly into the following types: (1) The SMOS inversion algorithm. This algorithm retrieves vegetation optical thickness and soil moisture with a multi-angle, dual-polarization method to produce the new SMOS-IC product. (2) The single-channel algorithm of SMAP. This algorithm can only invert soil moisture, and the inversion depends on vegetation optical thickness, surface roughness, surface temperature and other auxiliary information. (3) The UMT algorithm of AMSR2. This algorithm uses brightness temperature data of a single frequency band (18.7 GHz) and inverts vegetation optical thickness using the calculated surface temperature, atmospheric water vapor and other auxiliary information in the four-channel model. However, most GNSS-R research currently focuses on soil moisture inversion, and few studies build an integrated machine learning model that fuses CYGNSS, SMAP and MODIS data to invert vegetation optical thickness.
The present invention has been made in view of this.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the defects of the prior art, and provide a method for inverting vegetation optical thickness (Vegetation Optical Depth, VOD) by combining satellite-borne GNSS-R data and an integrated machine learning algorithm.
To solve the above technical problem, the invention adopts the following basic technical scheme:
A method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm comprises the following steps:
Step S1, acquiring the satellite-borne GNSS-R data of the CYGNSS satellites and the surface auxiliary parameter data from SMAP and MODIS;
Step S2, preprocessing and quality-controlling all acquired data sets;
Step S3, spatio-temporal matching and division of the data;
Step S4, extracting CYGNSS reflection point data over vegetation by using MODIS land cover type data;
Step S5, removing CYGNSS reflection point data over inland open water by using MODIS inland open water data;
Step S6, calculating the CYGNSS surface reflectivity;
Step S7, constructing a satellite-borne GNSS-R vegetation optical thickness inversion model from the reflected signal images observed by the CYGNSS satellites, the observation variables and the surface auxiliary variable parameters;
Step S8, inputting the test data set into the trained ET model to obtain inverted vegetation optical thickness values, and comparing and evaluating the results against decision tree (DT), gradient boosted decision trees (GBDT), adaptive boosting (AdaBoost) and support vector machine (SVM) models.
Preferably, the CYGNSS satellite-borne GNSS-R data used comprise the quality flags (quality_flags), the power DDM (power_analog), the distance from the GNSS-R receiver to the specular reflection point (rx_to_sp_range), the distance from the GNSS transmitter to the specular reflection point (tx_to_sp_range), the receiving antenna gain of the GNSS-R receiver (sp_rx_gain), the latitude (sp_lat) and longitude (sp_lon) of the specular reflection point, the incidence angle at the specular reflection point (sp_inc_angle), the equivalent isotropic radiated power of the GPS transmitter (gps_eirp), the DDM signal-to-noise ratio (ddm_snr), the bistatic radar cross section (BRCS), the effective scattering area (eff_scatter), and the like. The surface auxiliary parameters comprise six parameters: soil moisture (soil_moisture), surface temperature (surface_temperature), vegetation water content (vegetation_water_content), roughness coefficient (roughness_coefficient), vegetation optical thickness (vegetation_opacity_dca) and the retrieval quality flag (retrieval_qual_flag). The MODIS data comprise 16 land cover types: 9 vegetation cover types, such as evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, mixed forest, closed shrubland, woody savanna, open shrubland and savanna; and 7 non-vegetation cover types, such as water, permanent wetland, cropland, urban and built-up land, cropland/natural vegetation mosaic, and permanent snow and ice.
Preferably, the quality control of the data set in step S2 includes:
first, all observations containing NaN values are deleted;
second, all observations less than 0 are discarded;
third, the range corrected gain (RCG) value must be greater than 3;
fourth, samples whose receiving antenna gain in the direction of the reflection point (sp_rx_gain) is less than 0 dBi are discarded;
fifth, samples whose BRCS uncertainty (ddm_brcs_uncert) is greater than 1 are discarded;
sixth, samples with an incidence angle greater than 65° are discarded;
seventh, samples with a signal-to-noise ratio less than or equal to 0 dB are discarded.
To keep terrain-induced errors in the specular reflection point within a reasonable range, samples whose DDM peak power lies outside delay rows [7, 10] are removed. In addition, low-accuracy samples are removed according to the quality flags (quality_flags) provided in the data. The variable quality_flags provides 31 quality flags in total, for example: "overall poor quality", "whether the S-band transmitter is powered on", "angular range of spacecraft roll, pitch or yaw", "whether the DDMI was reconfigured during the DDM sampling period", "the CRC of the DDM transmitted from the DDMI to the spacecraft computer is invalid", "whether the DDM is a test pattern generated by the DDMI", "whether the reflection channel tracks a PRN", "whether the point is over land", and so forth. The 31 flags are stored in a single variable: each flag is set to 1 if its condition holds and 0 otherwise, the bits are arranged in a fixed order into a 31-bit binary number, and that number is stored in the variable as a decimal value.
During filtering, individual QC bits are applied rather than the overall QC bit ("overall poor quality", the least significant bit, bit 0). The flag sp_over_land (bit 11) is retained in order to reject ocean data and keep land data. Beyond that, the following are deleted: invalid or abnormal DDM data (bits 4, 7, 8, 9, 10, 15, 17 and 18); data affected by other instrument data-transmission and calibration problems (bits 5, 6, 23 and 26); spacecraft data with large attitude errors or abnormal attitude (bit 2); and data with errors in the flight-software variables fsw_comp_delay_shift and fsw_comp_dopp_shift (bit 27).
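The threshold rules above can be sketched as a single predicate. This is a minimal illustration, not the patented implementation: the record layout is hypothetical, the keys `rcg`, `brcs_peak` and `peak_delay_row` are illustrative names rather than CYGNSS product fields, the "less than 0" rule is assumed to apply to the BRCS value, and the incidence-angle rule is assumed to discard samples above 65°.

```python
import math

def passes_thresholds(obs):
    """Return True if one observation (a dict) survives the threshold rules."""
    fields = ("rcg", "sp_rx_gain", "ddm_brcs_uncert", "sp_inc_angle",
              "ddm_snr", "brcs_peak", "peak_delay_row")
    if any(math.isnan(obs[name]) for name in fields):
        return False                 # rule 1: no NaN values
    if obs["brcs_peak"] < 0:
        return False                 # rule 2: negative observation (assumed to mean BRCS)
    if obs["rcg"] <= 3:
        return False                 # rule 3: RCG must exceed 3
    if obs["sp_rx_gain"] < 0:
        return False                 # rule 4: antenna gain below 0 dBi
    if obs["ddm_brcs_uncert"] > 1:
        return False                 # rule 5: BRCS uncertainty above 1
    if obs["sp_inc_angle"] > 65:
        return False                 # rule 6: incidence angle above 65 degrees (assumed direction)
    if obs["ddm_snr"] <= 0:
        return False                 # rule 7: SNR at or below 0 dB
    return 7 <= obs["peak_delay_row"] <= 10   # DDM peak must lie in delay rows [7, 10]

# Made-up sample values for illustration only.
good = {"rcg": 12.0, "sp_rx_gain": 6.5, "ddm_brcs_uncert": 0.2,
        "sp_inc_angle": 31.0, "ddm_snr": 4.2, "brcs_peak": 1.8e9,
        "peak_delay_row": 8}
low_snr = dict(good, ddm_snr=-1.0)
print(passes_thresholds(good), passes_thresholds(low_snr))
```

Writing each rule as an early return keeps the rule numbering visible, which makes it easier to compare the code with the list when a threshold changes.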
Furthermore, step S3 comprises the following sub-steps:
Step S3.1, taking the longitude and latitude of the center of each 36 km CYGNSS grid cell as the reference, a 10 km × 10 km grid is formed by extending outwards; the soil moisture, surface temperature, roughness coefficient, vegetation optical thickness and vegetation water content data are each interpolated by the nearest neighbor method, and the specular reflection points falling within the grid are retained, finally giving spatially matched CYGNSS satellite-borne GNSS-R and SMAP data.
Step S3.2, taking the longitude and latitude of the center of each 36 km CYGNSS grid cell as the reference, a 6 km × 6 km grid is formed by extending outwards; the MODIS land cover type within the grid is retained by the nearest neighbor method, with the nearest value selected as the optimal value.
Step S3.3, the spatially matched data set is randomly divided into a training set, a validation set and a test set, which account for 60%, 10% and 30% of the filtered data set, respectively.
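The random 60/10/30 division in step S3.3 can be sketched as follows. The function name, the fixed seed and the use of integer sample indices are assumptions made for illustration; the text only fixes the three proportions.

```python
import random

def split_dataset(samples, fractions=(0.6, 0.1, 0.3), seed=0):
    """Randomly split samples into training, validation and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)      # seeded so the split is reproducible
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # remainder goes to the test set
    return train, val, test

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 600 100 300
```

Giving the remainder to the test set avoids losing samples to rounding when the data set size is not a multiple of ten.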
In step S4, CYGNSS reflection point data over vegetation are extracted: reflection points over vegetation cover types are retained, and reflection points over non-vegetation cover types are removed.
Preferably, in step S5 the MODIS inland water data are used to remove the inland water bodies near the CYGNSS reflection points, so as to eliminate the interference of water bodies with the reflected signal.
Preferably, the surface reflectivity in step S6 is calculated using the following formula:
where $P_r$ is the DDM peak power; $P_t G_t$ is the equivalent isotropic radiated power of the transmitter; $P_t$ is the transmit power of the right-hand circularly polarized navigation signal of the GNSS satellite; $G_t$ is the gain of the transmitting antenna; $G_r$ is the gain of the receiver antenna; $\lambda$ is the carrier wavelength of the transmitted signal; and $R_t$ and $R_r$ are the distances from the transmitter and from the receiver to the surface specular reflection point, respectively. The parameters used in the calibration equation can be obtained from the CYGNSS product file, where $P_t G_t$ is taken from the equivalent isotropic radiated power (Effective Isotropic Radiated Power, EIRP) provided in the CYGNSS product. $N$ is the DDM noise floor, defined as the mean power of the designated noise region and calculated as follows:
where $\tau_1$ and $\tau_i$ are the delay bin boundaries of the designated noise region; $f_1$ and $f_i$ are the Doppler frequency boundaries of the designated noise region; $M$ is the number of pixels in the noise region; and $\mathrm{DDM}(\tau, f)$ is the DDM power value at the specified position.
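For orientation, one form of the two equations that is consistent with the symbols defined above is given here. It assumes the coherent-reflection model commonly used for CYGNSS land reflectivity, with $\Gamma$ as an assumed symbol for the surface reflectivity; it is a sketch under that assumption and not necessarily the exact expression of the patent.

```latex
\Gamma = \frac{(4\pi)^2 \,(P_r - N)\,(R_r + R_t)^2}{\lambda^2 \, P_t G_t \, G_r}
\qquad
N = \frac{1}{M} \sum_{\tau=\tau_1}^{\tau_i} \sum_{f=f_1}^{f_i} \mathrm{DDM}(\tau, f)
```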
Each feature of the filtered data set is standardized to zero mean and unit variance; all acquired data sets are preprocessed, and the resulting filtered data set is divided into a training set, a validation set and a test set.
The training set is used to train the model and the test set is used to evaluate its performance. The training set is trained with the ET model. For each candidate feature, the extremely randomized trees regressor of the ET model draws a threshold at random and selects the best of these randomly generated thresholds as the splitting rule. SMAP VOD data are input as the reference for training, and the optimal ET model is saved for VOD inversion.
After training is completed, the test data set is input into the trained ET model to obtain the inverted vegetation optical thickness values, and the results are evaluated.
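The zero-mean, unit-variance scaling applied before training can be sketched with the standard library alone. The function name and the two-feature sample rows are hypothetical; in practice the means and standard deviations would usually be computed on the training set and reused for the validation and test sets, which is an assumption about good practice rather than something the text states.

```python
import statistics

def standardize_columns(rows):
    """Scale every feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds

# Made-up rows: (reflectivity-like value, surface temperature in kelvin).
rows = [[0.1, 290.0], [0.3, 300.0], [0.5, 310.0]]
scaled, means, stds = standardize_columns(rows)
print(scaled)
```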
The invention uses an ET model based on an integrated machine learning algorithm to invert global vegetation optical thickness. Hyperparameters must be set when using the ET model in order to find the optimal model for VOD inversion. The main parameters to tune are n_estimators and max_features. n_estimators is the number of trees in the forest; in general, the larger the number, the better the result. However, once the number of trees exceeds a critical value the performance of the algorithm is no longer optimal, so n_estimators is set to 10 on the basis of the experiments of the invention in order to achieve the best result. max_features is the size of the random subset of features considered when splitting a node; the lower the value, the greater the reduction in variance. For the ExtraTreesRegressor regression problem, max_features=None, max_depth=None, min_samples_split=2, bootstrap=False and oob_score=True are set, and the optimal parameter values are obtained through cross-validation. The optimal ET model obtained after training inverts VOD with higher accuracy.
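The settings listed above can be collected into a parameter dictionary, assuming the estimator is scikit-learn's `ExtraTreesRegressor`, which the parameter names suggest. One caveat: scikit-learn only computes out-of-bag scores when `bootstrap=True` and raises an error for `oob_score=True` with `bootstrap=False`, so this sketch leaves `oob_score` out rather than guessing which of the two settings was intended. The `build_model` helper and `random_state=0` are illustrative additions.

```python
# Hyperparameters as stated in the text (n_estimators chosen by experiment).
et_params = {
    "n_estimators": 10,        # number of trees in the forest
    "max_features": None,      # consider all features at each split
    "max_depth": None,         # grow trees until leaves are pure or too small
    "min_samples_split": 2,
    "bootstrap": False,        # every tree sees the whole training set
}
# oob_score=True is also listed in the text; it is omitted here because
# scikit-learn requires bootstrap=True for out-of-bag scoring.

def build_model(params=None):
    """Create the regressor; needs scikit-learn installed (hypothetical helper)."""
    from sklearn.ensemble import ExtraTreesRegressor
    return ExtraTreesRegressor(**(params or et_params), random_state=0)
```

Keeping the parameters in one dictionary makes it straightforward to pass the same settings to a cross-validation routine when searching for better values.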
By adopting the technical scheme, compared with the prior art, the invention has the following beneficial effects.
The invention uses an integrated machine learning algorithm to fuse the bistatic radar cross section (Bistatic Radar Cross Section, BRCS), the effective scattering area (Effective Scattering Area), the CYGNSS variable parameters and the surface auxiliary parameters into an extremely randomized trees (Extremely Randomized Trees, ET) integrated machine learning model. For each candidate feature, the extremely randomized trees regressor of the ET model draws a threshold at random and selects the best of these randomly generated thresholds as the splitting rule. With this technical scheme, vegetation optical thickness can be inverted with high accuracy and high spatio-temporal resolution over the global land area.
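The splitting rule described above, one random threshold per candidate feature with the best candidate kept, can be illustrated for a single node. This is a simplified sketch: it scores candidates by variance reduction (a common regression criterion, assumed here), handles one node only, and uses made-up data and hypothetical function names.

```python
import random

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def extra_trees_split(X, y, rng):
    """Draw one random threshold per feature and keep the best-scoring one.
    Returns (score, feature_index, threshold), or None if no split is possible."""
    best = None
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        lo, hi = min(column), max(column)
        if lo == hi:
            continue                        # constant feature, nothing to split
        threshold = rng.uniform(lo, hi)     # drawn at random, not optimised
        left = [t for row, t in zip(X, y) if row[j] <= threshold]
        right = [t for row, t in zip(X, y) if row[j] > threshold]
        if not left or not right:
            continue
        score = variance(y) - (len(left) * variance(left)
                               + len(right) * variance(right)) / len(y)
        if best is None or score > best[0]:
            best = (score, j, threshold)
    return best

# Feature 0 varies with the target; feature 1 is constant and is skipped.
X = [[float(i), 5.0] for i in range(10)]
y = [0.2 if i < 5 else 0.8 for i in range(10)]
split = extra_trees_split(X, y, random.Random(42))
print(split)
```

Because thresholds are drawn rather than searched, each tree is cheap to build and the trees differ from one another even when `bootstrap` is off, which is where the ensemble gets its diversity.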
The following describes the embodiments of the present invention in further detail with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention, without limiting it. The drawings in the following description are only examples, from which a person skilled in the art can obtain other drawings without inventive effort. In the drawings:
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a structural diagram of the ET model of the present invention.
FIG. 3 is a graph of VOD scatter density for five model inversions in an embodiment of the present invention.
FIG. 4 is a graph comparing inversion VOD and SMAP VOD of different machine learning models in an embodiment of the present invention.
FIG. 5 is a histogram of the distribution of the deviations between the five model inversion VOD and SMAP VOD in an embodiment of the invention.
It should be noted that these drawings and the written description are not intended to limit the scope of the inventive concept in any way, but to illustrate the inventive concept to those skilled in the art by referring to the specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions in the embodiments will be clearly and completely described with reference to the accompanying drawings in the embodiments of the present invention, and the following embodiments are used to illustrate the present invention, but are not intended to limit the scope of the present invention.
Example 1
To verify the advantages of the proposed method, the following CYGNSS satellite-borne GNSS-R data are obtained: the quality flags (quality_flags), the power DDM (power_analog), the distance from the GNSS-R receiver to the specular reflection point (rx_to_sp_range), the distance from the GNSS transmitter to the specular reflection point (tx_to_sp_range), the receiving antenna gain of the GNSS-R receiver (sp_rx_gain), the latitude (sp_lat) and longitude (sp_lon) of the specular reflection point, the incidence angle at the specular reflection point (sp_inc_angle), the equivalent isotropic radiated power of the GPS transmitter (gps_eirp), the DDM signal-to-noise ratio (ddm_snr), the bistatic radar cross section (BRCS), the effective scattering area (eff_scatter), and the like. The surface auxiliary parameters comprise six parameters: soil moisture (soil_moisture), surface temperature (surface_temperature), vegetation water content (vegetation_water_content), roughness coefficient (roughness_coefficient), vegetation optical thickness (vegetation_opacity_dca) and the retrieval quality flag (retrieval_qual_flag). The MODIS data comprise 16 land cover types: 9 vegetation cover types, such as evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, mixed forest, closed shrubland, woody savanna, open shrubland and savanna; and 7 non-vegetation cover types, such as water, permanent wetland, cropland, urban and built-up land, cropland/natural vegetation mosaic, and permanent snow and ice. The experimental results of the invention are compared with the vegetation optical thickness inverted by the DT, GBDT, AdaBoost and SVM models. The basic configuration of the experimental platform of this embodiment is shown in Table 1:
TABLE 1 configuration of experiment platform
As shown in FIG. 1, the method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm is implemented through the following steps:
Step S1, acquiring the satellite-borne GNSS-R data of the CYGNSS satellites together with SMAP and MODIS data;
Step S2, preprocessing and quality-controlling all acquired data sets;
Step S3, spatio-temporal matching and division of the data;
Step S4, extracting CYGNSS reflection point data over vegetation by using MODIS land cover type data;
Step S5, removing CYGNSS reflection point data over inland open water by using MODIS inland open water data;
Step S6, calculating the CYGNSS surface reflectivity;
Step S7, constructing a satellite-borne GNSS-R vegetation optical thickness inversion model from the reflected signal images observed by the CYGNSS satellites, the observation variables and the surface auxiliary variable parameters;
Step S8, inputting the test data set into the trained ET model to obtain inverted vegetation optical thickness values, and comparing and evaluating the results against decision tree (DT), gradient boosted decision trees (GBDT), adaptive boosting (AdaBoost) and support vector machine (SVM) models.
As an implementation of this embodiment, in step S1 the CYGNSS satellite-borne GNSS-R data comprise the quality flags (quality_flags), the power DDM (power_analog), the distance from the GNSS-R receiver to the specular reflection point (rx_to_sp_range), the distance from the GPS transmitter to the specular reflection point (tx_to_sp_range), the receiving antenna gain of the GNSS-R receiver (sp_rx_gain), the latitude (sp_lat) and longitude (sp_lon) of the specular reflection point, the incidence angle at the specular reflection point (sp_inc_angle), the equivalent isotropic radiated power of the GPS transmitter (gps_eirp), the DDM signal-to-noise ratio (ddm_snr), the bistatic radar cross section (BRCS), the effective scattering area (eff_scatter), and the like. The surface auxiliary parameters comprise six parameters: soil moisture (soil_moisture), surface temperature (surface_temperature), vegetation water content (vegetation_water_content), roughness coefficient (roughness_coefficient), vegetation optical thickness (vegetation_opacity_dca) and the retrieval quality flag (retrieval_qual_flag). The MODIS data comprise 16 land cover types: 9 vegetation cover types, such as evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, mixed forest, closed shrubland, woody savanna, open shrubland and savanna; and 7 non-vegetation cover types, such as water, permanent wetland, cropland, urban and built-up land, cropland/natural vegetation mosaic, and permanent snow and ice.
As one implementation of this embodiment, the quality control of the data set in step S2 preferably includes: first, all observations containing NaN values are deleted; second, all observations less than 0 are discarded; third, the range corrected gain (RCG) value must be greater than 3; fourth, samples whose receiving antenna gain in the direction of the reflection point (sp_rx_gain) is less than 0 dBi are discarded; fifth, samples whose BRCS uncertainty (ddm_brcs_uncert) is greater than 1 are discarded; sixth, samples with an incidence angle greater than 65° are discarded; seventh, samples with a signal-to-noise ratio less than or equal to 0 dB are discarded. To keep terrain-induced errors in the specular reflection point within a reasonable range, samples whose DDM peak power lies outside delay rows [7, 10] are removed. In addition, low-accuracy samples are removed according to the quality flags (quality_flags) provided in the data. The variable quality_flags provides 31 quality flags in total, for example: "overall poor quality", "whether the S-band transmitter is powered on", "angular range of spacecraft roll, pitch or yaw", "whether the DDMI was reconfigured during the DDM sampling period", "the CRC of the DDM transmitted from the DDMI to the spacecraft computer is invalid", "whether the DDM is a test pattern generated by the DDMI", "whether the reflection channel tracks a PRN", "whether the point is over land", and so forth. The 31 flags are stored in a single variable: each flag is set to 1 if its condition holds and 0 otherwise, the bits are arranged in a fixed order into a 31-bit binary number, and that number is stored in the variable as a decimal value.
During filtering, individual QC bits are applied rather than the overall QC bit ("overall poor quality", the least significant bit, bit 0). The flag sp_over_land (bit 11) is retained in order to reject ocean data and keep land data. Beyond that, the following are deleted: invalid or abnormal DDM data (bits 4, 7, 8, 9, 10, 15, 17 and 18); data affected by other instrument data-transmission and calibration problems (bits 5, 6, 23 and 26); spacecraft data with large attitude errors or abnormal attitude (bit 2); and data with errors in the flight-software variables fsw_comp_delay_shift and fsw_comp_dopp_shift (bit 27).
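The bit-level filtering described above amounts to a mask test on the integer stored in quality_flags. The sketch below uses the bit positions quoted in the text; the constant and function names are illustrative, and the bit numbering is assumed to start at 0 for the least significant bit, as the text indicates for the overall-quality flag.

```python
# Bit positions quoted in the text; the grouping comments follow its wording.
REJECT_BITS = (4, 7, 8, 9, 10, 15, 17, 18,   # invalid or abnormal DDM
               5, 6, 23, 26,                 # data transmission / calibration problems
               2,                            # large or abnormal attitude error
               27)                           # fsw_comp_delay_shift / fsw_comp_dopp_shift errors
LAND_BIT = 11                                # sp_over_land

REJECT_MASK = sum(1 << bit for bit in REJECT_BITS)

def keep_sample(quality_flags):
    """True if the point is over land and none of the rejected bits is set."""
    over_land = bool(quality_flags & (1 << LAND_BIT))
    return over_land and (quality_flags & REJECT_MASK) == 0

print(keep_sample(1 << 11))               # True: over land, no rejected bit
print(keep_sample((1 << 11) | (1 << 4)))  # False: bit 4 is set
print(keep_sample((1 << 11) | 1))         # True: bit 0 (overall quality) is ignored
print(keep_sample(0))                     # False: not over land
```

The third call shows the point of using individual bits: a sample flagged only by the overall-quality bit is still kept.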
As an implementation of this embodiment, step S3 comprises the following sub-steps:
Step S3.1, taking the longitude and latitude of the center of each 36 km CYGNSS grid cell as the reference, a 10 km × 10 km grid is formed by extending outwards; the soil moisture, surface temperature, roughness coefficient, vegetation optical thickness and vegetation water content data are each interpolated by the nearest neighbor method, and the specular reflection points falling within the grid are retained, finally giving spatially matched CYGNSS satellite-borne GNSS-R and SMAP data.
Step S3.2, taking the longitude and latitude of the center of each 36 km CYGNSS grid cell as the reference, a 6 km × 6 km grid is formed by extending outwards; the MODIS land cover type within the grid is retained by the nearest neighbor method, with the nearest value selected as the optimal value.
Step S3.3, the spatially matched data set is randomly divided into a training set, a validation set and a test set, which account for 60%, 10% and 30% of the filtered data set, respectively.
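The window-based nearest-neighbor matching of steps S3.1 and S3.2 can be sketched for a single specular point. This is an illustration under simplifying assumptions: distances use a flat-earth approximation with a constant 111.32 km per degree, the cell list and its `vod` values are made up, and the function names are hypothetical.

```python
import math

KM_PER_DEG = 111.32  # approximate length of one degree of latitude

def offsets_km(lat, lon, clat, clon):
    """Approximate east and north offsets (km) of a point from a cell center."""
    dy = (lat - clat) * KM_PER_DEG
    dx = (lon - clon) * KM_PER_DEG * math.cos(math.radians(clat))
    return dx, dy

def match_to_cell(sp_lat, sp_lon, cells, box_km=10.0):
    """Return the nearest cell whose box_km x box_km window contains the
    specular point, or None if no window contains it."""
    best, best_d2 = None, float("inf")
    for cell in cells:
        dx, dy = offsets_km(sp_lat, sp_lon, cell["lat"], cell["lon"])
        if abs(dx) > box_km / 2 or abs(dy) > box_km / 2:
            continue                      # outside this cell's window
        d2 = dx * dx + dy * dy
        if d2 < best_d2:
            best, best_d2 = cell, d2
    return best

cells = [{"lat": 25.00, "lon": 102.70, "vod": 0.41},
         {"lat": 25.33, "lon": 102.70, "vod": 0.55}]
hit = match_to_cell(25.02, 102.72, cells)    # about 2 km from the first center
miss = match_to_cell(25.15, 102.70, cells)   # more than 5 km from both centers
print(hit, miss)
```

Calling the same function with `box_km=6.0` gives the smaller window used for the MODIS land cover match.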
As an implementation of this embodiment, step S4 uses the MODIS land cover type data to extract CYGNSS reflection point data over vegetation: reflection points over vegetation cover types are retained, and reflection points over non-vegetation cover types are removed.
Preferably, in step S5 the MODIS inland water data are used to remove the inland water bodies near the CYGNSS reflection points, so as to eliminate the interference of water bodies with the reflected signal.
As an implementation of this embodiment, the surface reflectivity in step S6 is calculated with the following formula:
$$\Gamma = \frac{(4\pi)^2 \,(P_r - N)\,(R_r + R_t)^2}{\lambda^2 \, G_r \, G_t \, P_t}$$
where $P_r$ is the DDM peak power; $P_t G_t$ is the equivalent isotropic radiated power of the transmitter; $P_t$ is the transmit power of the right-hand circularly polarized navigation signal of the GNSS satellite; $G_t$ is the transmit antenna gain; $G_r$ is the receiver antenna gain; $\lambda$ is the carrier wavelength of the transmitted signal; and $R_t$ and $R_r$ are the distances from the transmitter and from the receiver to the surface reflection point, respectively. The relevant parameters used in the calibration equation can be obtained from the CYGNSS product files; for $P_t G_t$, the equivalent isotropic radiated power (Effective Isotropic Radiated Power, EIRP) provided in the CYGNSS product is used. $N$ is the DDM noise floor, defined as the mean power over the specified noise region, and is calculated as:
$$N = \frac{1}{M} \sum_{\tau=\tau_1}^{\tau_i} \sum_{f=f_1}^{f_i} \mathrm{DDM}(\tau, f)$$
where $\tau_1$ and $\tau_i$ are the delay bin boundaries of the specified noise region; $f_1$ and $f_i$ are the Doppler frequency boundaries of the specified noise region; $M$ is the number of pixels in the noise region; and $\mathrm{DDM}(\tau, f)$ is the DDM power value at the specified position.
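A numerical sketch of step S6 is given below. It assumes the commonly used coherent-reflection form, reflectivity = (4π)² (P_r − N) (R_r + R_t)² / (λ² G_r · EIRP), with all powers and gains in linear units and all distances in meters; the function names and the choice of noise region are hypothetical.

```python
import numpy as np


def ddm_noise_floor(ddm, delay_rows, doppler_cols):
    """Mean DDM power over a rectangular noise region given as two slices."""
    region = np.asarray(ddm, dtype=float)[delay_rows, doppler_cols]
    return float(region.mean())


def surface_reflectivity(peak_power, noise, eirp, rx_gain, r_tx, r_rx, wavelength):
    """Reflectivity from the noise-corrected DDM peak power (linear units)."""
    numerator = (4 * np.pi) ** 2 * (peak_power - noise) * (r_tx + r_rx) ** 2
    denominator = wavelength ** 2 * rx_gain * eirp  # eirp stands for P_t * G_t
    return numerator / denominator
```

For example, ddm_noise_floor(ddm, slice(0, 2), slice(None)) averages the first two delay rows over all Doppler columns; which rows are signal-free depends on the DDM product and is not specified here.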
The reflected signal images observed by the CYGNSS satellites in step S7 include the BRCS (bistatic radar cross section) and the effective scattering area. The observation variables include the GNSS-R observation parameters, the surface reflectivity and the surface auxiliary variable parameters.
As shown in fig. 2, the training process for inverting the global vegetation optical thickness with the ET model method in step S8 is specifically as follows:
the features of the filtered data set are normalized to zero mean and unit variance, all acquired data sets are preprocessed, and the resulting filtered data set is divided into a training set, a verification set and a test set;
wherein the training set is used to train the ET model and the test set is used to evaluate the performance of the model. Through the extremely randomized trees (extra-trees) regressor, the ET model randomly draws a threshold for each candidate feature and selects the best of these randomly generated thresholds as the splitting rule. SMAP VOD data are input as the reference for training the ET model, and the optimal ET model is saved to complete the training; the test set is then input into the trained ET model to obtain the vegetation optical thickness values, and the results are evaluated. The root mean square error (RMSE), Pearson correlation coefficient (CC), mean absolute error (MAE) and mean absolute percentage error (MAPE) are used as indexes for evaluating the inversion performance of the model, and are calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - Y_i\right)^2}$$
$$\mathrm{CC} = \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2 \sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|X_i - Y_i\right|$$
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{X_i - Y_i}{Y_i}\right| \times 100\%$$
In the above formulas, $n$ is the number of samples, $X_i$ and $Y_i$ are the model-inverted VOD and the SMAP reference VOD values, respectively, and $\bar{X}$ and $\bar{Y}$ are the mean of the inverted VOD and the mean of the reference VOD, respectively.
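The four evaluation indexes can be computed as in the following sketch, which uses the standard definitions of RMSE, Pearson CC, MAE and MAPE. The function name vod_metrics is hypothetical, and the MAPE term assumes that no reference VOD value is zero.

```python
import numpy as np


def vod_metrics(inverted, reference):
    """RMSE, Pearson CC, MAE and MAPE (%) between inverted and reference VOD."""
    x = np.asarray(inverted, dtype=float)
    y = np.asarray(reference, dtype=float)
    diff = x - y
    xc, yc = x - x.mean(), y - y.mean()
    return {
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        "CC": float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))),
        "MAE": float(np.mean(np.abs(diff))),
        "MAPE": float(np.mean(np.abs(diff / y)) * 100.0),  # assumes y has no zeros
    }
```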
The trained ET model is used to carry out global VOD inversion, and the experimental results of the invention are compared with those of the DT, GBDT, AdaBoost and SVM models. The final results are shown in Table 2, which gives the inversion accuracy statistics of the different inversion methods. The following conclusions can be drawn from the table:
the ET model method achieves the best correlation coefficient, and its performance is clearly superior to that of the AdaBoost, GBDT and SVM model methods. Compared with these three models, the ET model improves the RMSE by 178.12%, 67.87% and 85.26%, the MAE by 87.50%, 80.00% and 92.31%, and the MAPE by 87.50%, 80.00% and 92.31%, respectively.
Table 2 Accuracy of the different models in inverting vegetation optical thickness on the test data set
In addition, fig. 3 compares the correlation between the vegetation optical thickness inverted in this embodiment and the SMAP vegetation optical thickness product. The VOD inverted by the ET model of this embodiment correlates better with SMAP VOD than that of the DT, AdaBoost, SVM and GBDT model methods: with the ET model method, most points are distributed around the line y = x and relatively few points are scattered, which indicates that the vegetation optical thickness inverted by ET has the best correlation with SMAP VOD. In contrast, the VOD inverted by the AdaBoost and SVM models has the worst correlation with SMAP VOD, which indicates that these two models need more auxiliary parameters to obtain better results. The ET model provided by this embodiment fuses the surface reflectivity, the surface auxiliary parameters and the CYGNSS satellite-borne GNSS-R variable parameters, which improves the robustness, stability and generality of the model.
As shown in fig. 4, the ET model provided in this embodiment achieves higher accuracy in inverting the global vegetation optical thickness and is more consistent with the SMAP global VOD. The RMSE values of AdaBoost, DT, ET, SVM and GBDT are 0.098, 0.027, 0.021, 0.067 and 0.145, respectively; the ET model has the highest accuracy, improving on AdaBoost, DT, SVM and GBDT by 78.57%, 22.22%, 68.66% and 85.52%, respectively.
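The percentages quoted for fig. 4 are consistent with a relative RMSE reduction, (RMSE_other − RMSE_ET) / RMSE_other × 100. The short check below reproduces them from the RMSE values stated above; the function name is hypothetical.

```python
rmse = {"AdaBoost": 0.098, "DT": 0.027, "ET": 0.021, "SVM": 0.067, "GBDT": 0.145}


def rmse_reduction_percent(baseline, improved):
    """Relative reduction of RMSE with respect to a baseline model, in percent."""
    return (baseline - improved) / baseline * 100.0


improvement = {
    name: round(rmse_reduction_percent(value, rmse["ET"]), 2)
    for name, value in rmse.items()
    if name != "ET"
}
print(improvement)
# {'AdaBoost': 78.57, 'DT': 22.22, 'SVM': 68.66, 'GBDT': 85.52}
```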
Fig. 5 gives the distribution histogram of the deviations between the VOD inverted by the ET model and SMAP VOD. The figure reports the mean deviation (μ), the standard deviation (σ), the mean absolute error (MAE) and the 80% quantile of the deviation (Qua); the blue bars represent the error distribution, the red dashed line represents the fitted probability density function of the error, and the green dashed line marks where the deviation between the VOD inverted by the ET model and SMAP VOD is 0. As can be seen from the figure, the deviations between the VOD inverted by the ET model and SMAP VOD are highly concentrated (80% of the deviations are less than 0.08) and are mostly distributed around 0. Compared with the AdaBoost and SVM models, the extremely randomized trees ensemble machine learning method provided by the invention has remarkable advantages in inverting the global vegetation optical thickness.
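The statistics annotated in fig. 5 can be obtained as in the sketch below. It assumes that the deviation is defined as inverted VOD minus SMAP VOD, that σ is the population standard deviation, and that the 80% quantile is taken over the absolute deviation; the text does not spell out these conventions, and the function name is hypothetical.

```python
import numpy as np


def deviation_summary(inverted, reference, q=0.8):
    """Mean, standard deviation, MAE and q-quantile of |deviation|."""
    dev = np.asarray(inverted, dtype=float) - np.asarray(reference, dtype=float)
    return {
        "mu": float(dev.mean()),
        "sigma": float(dev.std()),            # population standard deviation
        "MAE": float(np.abs(dev).mean()),
        "Qua": float(np.quantile(np.abs(dev), q)),
    }
```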
The foregoing description is only illustrative of the preferred embodiment of the present invention and is not to be construed as limiting the invention. Any simple modifications, equivalent variations and adaptations of the embodiments described above that are made by those skilled in the art without departing from the scope of the technical solution of the invention remain within the scope of the invention.

Claims (4)

1. A method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm, characterized by comprising the following steps:
Step S1, acquiring CYGNSS satellite-borne GNSS-R data, surface auxiliary parameter SMAP data and MODIS data, respectively; wherein,
the satellite-borne GNSS-R data comprise a quality flag, the power DDM, the distance from the GNSS-R receiver to the specular reflection point, the distance from the GNSS transmitter to the specular reflection point, the receiving antenna gain of the GNSS-R receiver, the latitude of the specular reflection point, the longitude of the specular reflection point, the incidence angle at the specular reflection point, the equivalent isotropic radiated power of the GPS transmitter, the DDM signal-to-noise ratio, the bistatic radar cross section and the effective scattering area;
the surface auxiliary parameter SMAP data comprise soil moisture, surface temperature, vegetation water content, roughness coefficient, vegetation optical thickness and the quality flags of the ascending and descending orbits;
the MODIS data include 9 vegetated land cover types and 7 non-vegetated land cover types; the 9 vegetated land cover types include: evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, mixed forest, shrubland, woody savanna, open shrubland and savanna; the 7 non-vegetated land cover types include water, permanent wetland, cropland, urban and built-up land, cropland/natural vegetation mosaic, permanent snow and ice, and barren land;
Step S2, preprocessing and quality control are carried out on all acquired data;
Step S3, carrying out space-time matching and data division on the data, specifically comprising the following steps:
Step S3.1, taking the longitude and latitude of the 36 km CYGNSS grid center as the reference, a 10 km × 10 km grid is formed by extending outward; the soil moisture, surface temperature, roughness coefficient, vegetation optical thickness and vegetation water content data are each interpolated with the nearest neighbor method, and the specular reflection points retained within the grid are selected, finally obtaining spatially matched CYGNSS satellite-borne GNSS-R data and SMAP data;
Step S3.2, taking the longitude and latitude of the 36 km CYGNSS grid center as the reference, a 6 km × 6 km grid is formed by extending outward; the MODIS land cover type within the grid is retained with the nearest neighbor method, and the nearest value is selected as the optimal value;
Step S3.3, the spatially matched data are randomly divided into a training set, a verification set and a test set, which account for 60%, 10% and 30% of the filtered data, respectively;
Step S4, extracting the CYGNSS reflection point data over vegetation by using the MODIS land cover type data, retaining the reflection point data over vegetated land cover types, and eliminating the reflection point data over non-vegetated land cover types;
Step S5, removing the CYGNSS reflection point data over inland open water by using the MODIS inland open water data;
Step S6, calculating the CYGNSS surface reflectivity;
Step S7, constructing a satellite-borne GNSS-R vegetation optical thickness inversion model by using the reflected signal images observed by the CYGNSS satellites, the observation variables and the surface auxiliary variable parameters;
Step S8, inputting the test data set into the trained ET model to obtain the inverted vegetation optical thickness values, and comparing and evaluating the result against the results obtained by inputting the test data set into a decision tree model, a gradient boosting decision tree model, an adaptive boosting model and a support vector machine model, wherein the root mean square error, Pearson correlation coefficient, mean absolute error and mean absolute percentage error are used as indexes for evaluating the inversion performance of the models; the training process of the ET model is specifically implemented as follows:
Step S8.1, taking the divided training set as the input of the ET model;
Step S8.2, inputting the images, the surface reflectivity, the GNSS-R observation parameters and the surface auxiliary variable parameters;
Step S8.3, setting the hyperparameters of the ET model, including: n_estimators=10, max_features=None, max_depth=None, min_samples_split=2, bootstrap=False, oob_score=True;
Step S8.4, inputting the SMAP VOD into the ET model for training.
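The training procedure of steps S8.1 to S8.4 can be illustrated with scikit-learn's ExtraTreesRegressor. The sketch below is not the patented implementation: the data are synthetic stand-ins for the feature matrix and the SMAP VOD target, and the variable names are hypothetical. One deliberate deviation is that scikit-learn only accepts oob_score=True together with bootstrap=True, so the sketch keeps bootstrap=False and leaves oob_score at False.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins: 6 hypothetical features and a VOD-like target.
X = rng.normal(size=(500, 6))
y = 0.3 + 0.1 * X[:, 0] - 0.05 * X[:, 1] + 0.01 * rng.normal(size=500)

# 60% training, 10% verification (unused here), 30% test.
order = rng.permutation(len(X))
train, test = order[:300], order[350:]

# Zero-mean, unit-variance scaling fitted on the training set only.
scaler = StandardScaler().fit(X[train])

model = ExtraTreesRegressor(
    n_estimators=10,
    max_features=None,
    max_depth=None,
    min_samples_split=2,
    bootstrap=False,
    oob_score=False,  # assumption: disabled because bootstrap=False
    random_state=0,
)
model.fit(scaler.transform(X[train]), y[train])

pred = model.predict(scaler.transform(X[test]))
rmse = float(np.sqrt(np.mean((pred - y[test]) ** 2)))
print(f"test RMSE: {rmse:.4f}")
```

Fitting the scaler on the training set alone avoids leaking test statistics into the model; whether the embodiment normalizes before or after splitting is not stated.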
2. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein the preprocessing and quality control of all the acquired data in step S2 comprises:
Step S2.1, deleting all observed values containing NaN values;
Step S2.2, discarding all observed values smaller than 0;
Step S2.3, the RCG value must be greater than 3;
Step S2.4, discarding sampling points for which the gain of the receiving antenna in the direction of the reflection point is less than 0 dBi;
Step S2.5, discarding sampling points for which the uncertainty of the BRCS is greater than 1;
Step S2.6, the sampling point of the incidence angle needs to be larger than 65 degrees;
Step S2.7, discarding sampling points with a signal-to-noise ratio less than or equal to 0 dB; in order to ensure that the errors of specular reflection points caused by terrain are within a reasonable range, sampling points whose DDM peak power lies in a delay row outside [7, 10] are removed; in addition, sampling points with low precision are removed according to the quality flags provided in the data.
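Several of the screening rules of claim 2 can be combined into a single boolean mask, as in the sketch below. The dictionary keys are hypothetical column names, the non-negativity test of step S2.2 is applied to one stand-in observable only, and the incidence-angle rule of step S2.6 and the quality-flag screening are left out of this illustration.

```python
import numpy as np

KEYS = ("reflectivity", "rcg", "rx_gain_dbi", "brcs_uncert", "snr_db", "peak_delay_row")


def quality_mask(obs):
    """Boolean mask of samples that pass the screening rules sketched here."""
    cols = {k: np.asarray(obs[k], dtype=float) for k in KEYS}
    keep = ~np.isnan(np.vstack(list(cols.values()))).any(axis=0)  # S2.1
    with np.errstate(invalid="ignore"):
        keep &= cols["reflectivity"] >= 0        # S2.2 (stand-in observable)
        keep &= cols["rcg"] > 3                  # S2.3
        keep &= cols["rx_gain_dbi"] >= 0         # S2.4
        keep &= cols["brcs_uncert"] <= 1         # S2.5
        keep &= cols["snr_db"] > 0               # S2.7, signal-to-noise ratio
        keep &= (cols["peak_delay_row"] >= 7) & (cols["peak_delay_row"] <= 10)
    return keep
```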
3. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein the surface reflectivity in step S6 is calculated as follows:
$$\Gamma = \frac{(4\pi)^2 \,(P_r - N)\,(R_r + R_t)^2}{\lambda^2 \, G_r \, G_t \, P_t}$$
wherein $P_r$ is the peak power of the DDM; $N$ is the DDM noise floor; $P_t G_t$ is the equivalent isotropic radiated power of the transmitter; $P_t$ is the transmit power of the right-hand circularly polarized navigation signal of the GNSS satellite; $G_t$ is the transmit antenna gain; $G_r$ is the receiver antenna gain; $\lambda$ is the carrier wavelength of the transmitted signal; and $R_t$ and $R_r$ are the distances from the transmitter and from the receiver to the surface reflection point, respectively.
4. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein the reflected signal images observed by the CYGNSS satellites in step S7 include the BRCS and the effective scattering area; the observation variables include the GNSS-R observation parameters, the surface reflectivity and the surface auxiliary variable parameters.
CN202310918757.8A 2023-07-25 2023-07-25 Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm Active CN116881721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310918757.8A CN116881721B (en) 2023-07-25 2023-07-25 Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm

Publications (2)

Publication Number Publication Date
CN116881721A CN116881721A (en) 2023-10-13
CN116881721B true CN116881721B (en) 2024-01-02

Family

ID=88269672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310918757.8A Active CN116881721B (en) 2023-07-25 2023-07-25 Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm

Country Status (1)

Country Link
CN (1) CN116881721B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103149220A (en) * 2013-01-30 2013-06-12 中国科学院对地观测与数字地球科学中心 Soil moisture inversion method of mono-frequency microwave radiometer
CN103810387A (en) * 2014-02-13 2014-05-21 中国科学院地理科学与资源研究所 Earth face evapotranspiration remote sensing inversion method and system based on MODIS data
CN105842707A (en) * 2015-01-15 2016-08-10 兰州大学 Grassland above-ground biomass measuring method and grassland above-ground biomass measuring device based on remote sensing image acquired by unmanned aerial vehicle
CN110186823A (en) * 2019-06-26 2019-08-30 中国科学院遥感与数字地球研究所 A kind of aerosol optical depth inversion method
CN111766577A (en) * 2020-07-27 2020-10-13 云南电网有限责任公司昆明供电局 Power transmission line channel tree height inversion method based on three-stage algorithm P wave band
CN114120140A (en) * 2021-11-18 2022-03-01 浙江大学德清先进技术与产业研究院 Method for automatically extracting building height based on satellite image
CN114241331A (en) * 2021-12-16 2022-03-25 中国科学院南京地理与湖泊研究所 Wetland reed aboveground biomass remote sensing modeling method taking UAV as ground and Sentinel-2 intermediary
CN114371182A (en) * 2022-03-22 2022-04-19 中国科学院地理科学与资源研究所 Satellite-borne GNSS-R high-precision soil moisture estimation method based on CYGNSS data
CN116127327A (en) * 2023-04-07 2023-05-16 广东省科学院广州地理研究所 Forest ground biomass inversion method, device, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8306942B2 (en) * 2008-05-06 2012-11-06 Lawrence Livermore National Security, Llc Discriminant forest classification method and system
US10491879B2 (en) * 2016-01-15 2019-11-26 Blue River Technology Inc. Plant feature detection using captured images
US10125464B2 (en) * 2016-03-14 2018-11-13 The United States Of America As Represented By The Secretary Of The Army Photogrammetric soil density system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GNSS多星定位数据的质量分析;布金伟 等;《昆明理工大学学报(自然科学版)》;第42卷(第6期);第24页-第36页 *
The quality analysis of GNSS satellite positioning data;Xiaoqing Zuo 等;《Cluster Computing》;第22卷(第3期);第56693页-第56708页 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant