CN116881721A - Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm - Google Patents

Publication number
CN116881721A
Authority
CN
China
Prior art keywords
data
vegetation
optical thickness
gnss
cygnss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310918757.8A
Other languages
Chinese (zh)
Other versions
CN116881721B (en)
Inventor
布金伟
张永凤
左小清
朱大明
李勇发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CN202310918757.8A
Publication of CN116881721A
Application granted
Publication of CN116881721B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G06F18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/188 Vegetation

Abstract

The application discloses a method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated (ensemble) machine learning algorithm, which comprises the following steps: acquiring the satellite-borne GNSS-R data of the CYGNSS satellites, SMAP data and MODIS data; preprocessing and quality-controlling all acquired data sets; spatio-temporal matching and division of the data; extracting the CYGNSS reflection point data over vegetation by using MODIS land cover type data; removing the CYGNSS reflection point data over inland open water by using MODIS inland open water data; calculating the CYGNSS surface reflectivity; constructing a satellite-borne GNSS-R vegetation optical thickness inversion model; and inputting the test data set into the trained ET model to obtain the inverted vegetation optical thickness values and evaluating the results. The application uses the integrated machine learning algorithm to fuse the bistatic radar cross section, the effective scattering area, the CYGNSS variable parameters and the surface auxiliary parameters into an extremely randomized trees (ET) model, and can realize vegetation optical thickness inversion with high precision and high spatio-temporal resolution over global land.

Description

Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm
Technical Field
The application belongs to the technical field of GNSS vegetation optical thickness inversion, and particularly relates to a method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm.
Background
Vegetation optical thickness (Vegetation Optical Depth, VOD) is an important parameter for characterizing vegetation in microwave remote sensing inversion. As a microwave-based vegetation index sensitive to vegetation water content and biomass, it has been increasingly applied to study the influence of global climate and environmental change on vegetation. Because VOD is sensitive to, and positively correlated with, vegetation growth, its trend is commonly used in practice to reflect the growth condition of vegetation. Developing a feasible VOD inversion algorithm therefore has important practical significance for agricultural monitoring, vegetation biomass estimation, soil moisture inversion and the like. Remote sensing, as a comprehensive, macroscopic, rapid and real-time means of observation, is one of the effective tools for monitoring surface parameters. However, conventional remote sensing techniques are susceptible to weather conditions (e.g., high cloud cover) and to surface disturbances (e.g., soil conditions and terrain), which affects the accuracy of vegetation optical thickness inversion.
Global navigation satellite system reflectometry (Global Navigation Satellite System-Reflectometry, GNSS-R) is an emerging technology, developed rapidly since the 1990s, that detects earth surface information from the reflection characteristics of GNSS signals. A receiver carried on a satellite receives the direct signals from the GNSS satellites and the echo signals reflected by the reflecting surface. Processing yields a delay-Doppler map (DDM), which corresponds to the two-dimensional correlation power of the reflected signal, and physical parameters of the scattering surface can then be obtained by a suitable inversion method. The technology is currently one of the research hotspots in remote sensing at home and abroad. Besides advantages such as abundant signal sources, low cost and high spatio-temporal resolution, satellite-borne GNSS-R uses L-band signals, which easily penetrate clouds and the atmosphere and therefore allow all-day, all-weather remote sensing monitoring. GNSS-R is therefore of great significance for all-weather, all-day and wide-coverage vegetation optical thickness inversion.
At present, research at home and abroad on vegetation optical thickness inversion is mainly based on microwave remote sensing soil moisture algorithms, which mainly include the following types: (1) The SMOS inversion algorithm, which retrieves vegetation optical thickness and soil moisture with a multi-angle, dual-polarization method and yields the new SMOS-IC product. (2) The single-channel algorithm of SMAP, which can only invert soil moisture and whose inversion depends on vegetation optical thickness, surface roughness, surface temperature and other auxiliary information. (3) The UMT algorithm of AMSR2, which uses brightness temperature data of a single frequency band (18.7 GHz) and inverts vegetation optical thickness using the calculated surface temperature, atmospheric water vapor and other auxiliary information in a four-channel model. However, most GNSS-R studies currently focus on soil moisture inversion, and few have built an integrated machine learning model that fuses CYGNSS, SMAP and MODIS data to invert vegetation optical thickness.
The present application has been made in view of this.
Disclosure of Invention
The technical problem to be solved by the application is to overcome the defects of the prior art, and provide a method for inverting vegetation optical thickness (Vegetation Optical Depth, VOD) by combining satellite-borne GNSS-R data and an integrated machine learning algorithm.
In order to solve the technical problems, the application adopts the basic conception of the technical scheme that:
A method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm comprises the following steps:
Step S1, acquiring the satellite-borne GNSS-R data of the CYGNSS satellites and the surface auxiliary parameter data from SMAP and MODIS;
Step S2, preprocessing and quality-controlling all acquired data sets;
Step S3, spatio-temporal matching and division of the data;
Step S4, extracting the CYGNSS reflection point data over vegetation by using MODIS land cover type data;
Step S5, removing the CYGNSS reflection point data over inland open water by using MODIS inland open water data;
Step S6, calculating the CYGNSS surface reflectivity;
Step S7, constructing a satellite-borne GNSS-R vegetation optical thickness inversion model by using the reflected signal images observed by the CYGNSS satellites, the observation variables and the surface auxiliary variable parameters;
Step S8, inputting the test data set into the trained ET model to obtain the inverted vegetation optical thickness values, and comparing and evaluating the results against the Decision Tree (DT), Gradient Boosted Decision Trees (GBDT), Adaptive Boosting (AdaBoost) and Support Vector Machine (SVM) models.
Preferably, the CYGNSS satellite-borne GNSS-R data used include the quality flags (quality_flags), the power DDM (power_analog), the distance from the GNSS-R receiver to the specular reflection point (rx_to_sp_range), the distance from the GNSS transmitter to the specular reflection point (tx_to_sp_range), the receiving antenna gain of the GNSS-R receiver (sp_rx_gain), the specular point latitude (sp_lat), the specular point longitude (sp_lon), the specular point incidence angle (sp_inc_angle), the equivalent isotropic radiated power of the GPS transmitter (gps_eirp), the DDM signal-to-noise ratio (ddm_snr), the bistatic radar cross section (BRCS), the effective scattering area (eff_scatter), and the like. The surface auxiliary parameters include six parameters: soil moisture (soil_moisture), surface temperature (surface_temperature), vegetation water content (vegetation_water_content), roughness coefficient (roughness_coefficient), vegetation optical thickness (vegetation_opacity_dca), and the quality flag (retrieval_qual_flag). The MODIS data comprise 16 land cover types, specifically: 9 vegetation cover types, such as evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, mixed forest, closed shrubland, woody savanna, open shrubland and savanna; and 7 non-vegetation land cover types, such as water, permanent wetland, cropland, urban and built-up land, cropland/natural vegetation mosaic, and permanent snow and ice.
Preferably, the quality control of the data set in step S2 includes:
first, deleting all observations containing NaN values;
second, discarding all observations less than 0;
third, requiring the range-corrected gain (RCG) value to be greater than 3;
fourth, discarding samples whose receiving antenna gain in the direction of the reflection point (sp_rx_gain) is less than 0 dBi;
fifth, discarding samples whose BRCS uncertainty (ddm_brcs_uncert) is greater than 1;
sixth, discarding samples whose incidence angle is greater than 65°;
seventh, discarding samples whose signal-to-noise ratio is less than or equal to 0 dB.
In addition, to keep the specular point errors caused by terrain within a reasonable range, samples whose DDM peak power lies outside delay rows [7, 10] are removed, and low-accuracy samples are removed according to the quality flags (quality_flags) provided in the data. The variable quality_flags provides a total of 31 quality flags, such as: "overall poor quality data", "whether the S-band transmitter is powered on", "angular range of roll or pitch or yaw of the spacecraft", "whether the DDMI was reconfigured during the DDM sampling period", "the CRC of the DDM transmitted from the DDMI to the spacecraft computer is invalid", "whether the DDM is a test pattern generated by the DDMI", "whether the reflection channel tracks a PRN", "whether the point is on land", and so forth. Each flag is set to 1 when its condition is met and to 0 otherwise; the 31 bits are arranged in a fixed order to form a 31-bit binary number, which is converted to a decimal number and stored in the variable.
During data filtering, individual QC bits are applied rather than the overall QC bit ("overall poor quality", the least significant bit, bit 0). The flag sp_over_land (bit 11) is retained so that ocean data are rejected and land data are kept. Apart from this, samples are deleted where the DDM is invalid or abnormal (bits 4, 7, 8, 9, 10, 15, 17 and 18); where the instrument has data transmission or calibration problems (bits 5, 6, 23, 26); where the spacecraft attitude error is large or the attitude is abnormal (bit 2); or where the flight software made errors in computing the variables fsw_comp_delay_shift and fsw_comp_dopp_shift (bit 27).
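The bitwise filtering described above can be sketched as follows. This is a minimal illustration only: it assumes that "bit k" in the text corresponds to the integer mask `1 << k` and that `quality_flags` is available as an integer array. The function names are hypothetical and do not belong to any CYGNSS library.

```python
import numpy as np

# Bit positions are the ones listed in the text. Mapping "bit k" to the mask
# (1 << k) is an assumption of this sketch.
REJECT_BITS = (2, 4, 5, 6, 7, 8, 9, 10, 15, 17, 18, 23, 26, 27)
LAND_BIT = 11  # sp_over_land: must be set for a sample to be kept


def bits_to_mask(bits):
    """Combine a sequence of bit positions into one integer mask."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask


def keep_by_quality_flags(quality_flags):
    """Return a boolean array that is True where a sample passes the bitwise QC."""
    flags = np.asarray(quality_flags, dtype=np.int64)
    over_land = (flags & (1 << LAND_BIT)) != 0
    no_reject_bit = (flags & bits_to_mask(REJECT_BITS)) == 0
    return over_land & no_reject_bit


example_flags = [1 << 11, (1 << 11) | (1 << 4), 0, (1 << 11) | 1]
example_keep = keep_by_quality_flags(example_flags)
```

Because the overall QC bit (bit 0) is deliberately not part of the reject mask, the last example sample is still kept, which mirrors the choice of individual bits over the overall flag.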
Furthermore, step S3 comprises the following sub-steps:
Step S3.1, taking the longitude and latitude of the 36 km CYGNSS grid center as the reference, extending outwards to form a 10 km × 10 km grid, interpolating the soil moisture, surface temperature, roughness coefficient, vegetation optical thickness and vegetation water content data with the nearest neighbor method, and keeping the specular reflection points that fall within the grid, finally obtaining spatially matched CYGNSS satellite-borne GNSS-R and SMAP data.
Step S3.2, taking the longitude and latitude of the 36 km CYGNSS grid center as the reference, extending outwards to form a 6 km × 6 km grid, and retaining the MODIS land cover type within the grid with the nearest neighbor method, the nearest value being selected as the optimal value.
Step S3.3, randomly dividing the spatially matched data set into a training set, a validation set and a test set, which account for 60%, 10% and 30% of the filtered data set respectively.
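The random 60% / 10% / 30% division of step S3.3 can be illustrated with a short sketch. The function name, the fixed seed and the rounding rule are assumptions made here for reproducibility; the source only specifies the proportions and that the selection is random.

```python
import numpy as np


def split_indices(n_samples, fractions=(0.6, 0.1, 0.3), seed=0):
    """Randomly partition sample indices into training, validation and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # random order of all indices
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]              # the remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
```

Assigning the remainder to the test set guarantees that every sample lands in exactly one subset even when the fractions do not divide the sample count evenly.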
In step S4, the CYGNSS reflection point data over vegetation are extracted by using the MODIS land cover type data: reflection points over vegetation are retained and reflection points over non-vegetation types are eliminated.
Preferably, in step S5 the CYGNSS reflection points near inland water bodies are removed by using the MODIS inland open water data, to eliminate the interference of water bodies with the reflected signal.
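Steps S4 and S5 amount to two boolean masks applied to the matched reflection points. The sketch below assumes that the land cover labels follow the IGBP numbering in which codes 1 to 9 are the forest, shrubland and savanna classes, and that a per-point flag for nearby open water has already been derived from the MODIS water data. Both the class codes and the variable names are assumptions of this example, not values stated in the source.

```python
import numpy as np

# Assumed IGBP-style codes for the 9 vegetation cover types (hypothetical choice).
VEGETATION_CLASSES = tuple(range(1, 10))


def vegetation_land_mask(land_cover, near_open_water):
    """True where a reflection point lies on vegetation and is not near open water."""
    land_cover = np.asarray(land_cover)
    near_open_water = np.asarray(near_open_water, dtype=bool)
    on_vegetation = np.isin(land_cover, VEGETATION_CLASSES)   # step S4
    return on_vegetation & ~near_open_water                   # step S5


example_mask = vegetation_land_mask(
    land_cover=[2, 12, 9, 17, 5],
    near_open_water=[False, False, True, True, False],
)
```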
Preferably, the surface reflectivity in step S6 is calculated using the following formula:
$$\Gamma = \frac{(4\pi)^2\,(P_r - N)\,(R_r + R_t)^2}{\lambda^2\,P_t G_t\,G_r}$$
where $\Gamma$ is the surface reflectivity; $P_r$ is the DDM peak power; $P_t G_t$ is the equivalent isotropic radiated power of the transmitter; $P_t$ is the transmit power of the right-hand circularly polarized navigation signal of the GNSS satellite; $G_t$ is the gain of the transmit antenna; $G_r$ is the gain of the receiver antenna; $\lambda$ is the carrier wavelength of the transmitted signal; $R_t$ and $R_r$ are the distances from the transmitter and from the receiver to the surface reflection point, respectively. The parameters used in the calibration equation can be obtained from the CYGNSS product file, where $P_t G_t$ is taken from the equivalent isotropic radiated power (Effective Isotropic Radiated Power, EIRP) provided in the CYGNSS product. $N$ is the DDM noise floor, defined as the mean power over a specified noise region and calculated as follows:
$$N = \frac{1}{M}\sum_{\tau=\tau_1}^{\tau_i}\;\sum_{f=f_1}^{f_i} \mathrm{DDM}(\tau, f)$$
where $\tau_1$ and $\tau_i$ are the delay bin boundaries of the specified noise region; $f_1$ and $f_i$ are the Doppler frequency boundaries of the specified noise region; $M$ is the number of pixels in the noise region; and $\mathrm{DDM}(\tau, f)$ is the DDM power value at the specified position.
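A numerical sketch of step S6 is given below. It assumes the commonly used coherent-reflection form, in which reflectivity equals (4π)² (P_r − N) (R_r + R_t)² divided by λ² · EIRP · G_r, with all quantities in linear units (watts, metres, linear gain rather than dBi). The function names and the choice of the GPS L1 wavelength as default are assumptions of this example.

```python
import numpy as np

GPS_L1_WAVELENGTH_M = 299_792_458.0 / 1_575.42e6  # about 0.19 m


def noise_floor(ddm, delay_rows, doppler_cols):
    """Mean DDM power over the noise region selected by two slices."""
    region = np.asarray(ddm, dtype=float)[delay_rows, doppler_cols]
    return region.sum() / region.size           # (1 / M) * sum of the M pixels


def surface_reflectivity(p_r, noise, eirp, g_r, r_t, r_r,
                         wavelength=GPS_L1_WAVELENGTH_M):
    """Reflectivity from peak power, noise floor, EIRP, receiver gain and ranges."""
    numerator = (4.0 * np.pi) ** 2 * (p_r - noise) * (r_t + r_r) ** 2
    return numerator / (wavelength ** 2 * eirp * g_r)


# Synthetic 17 x 11 DDM: flat background of 2.0 with a single peak of 10.0.
ddm = np.full((17, 11), 2.0)
ddm[8, 5] = 10.0
n_floor = noise_floor(ddm, slice(0, 4), slice(None))
```

If a gain is supplied in dBi, as `sp_rx_gain` is described in the quality control step, it would first have to be converted to a linear value with `10 ** (gain_dbi / 10)`.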
The filtered data set is normalized feature by feature to zero mean and unit variance; all acquired data sets are preprocessed, and the resulting filtered data set is divided into a training set, a validation set and a test set.
The training set is used to train the model and the test set is used to evaluate its performance. The training set is trained with the ET model. In the extremely randomized trees regressor, a threshold is drawn at random for each candidate feature, and the best of these randomly generated thresholds is selected as the splitting rule. SMAP VOD data are input as the reference values when training the ET model, and the optimal ET model is saved for VOD inversion.
After training is completed, the test data set is input into the trained ET model to obtain the inverted vegetation optical thickness values, and the results are evaluated.
According to the application, the ET model based on the integrated machine learning algorithm is adopted to invert global vegetation optical thickness, and hyperparameters must be set when using the ET model in order to find the optimal model for VOD inversion. The parameters that mainly need to be adjusted are n_estimators and max_features. n_estimators is the number of trees in the forest; in general, the larger it is, the better the result. However, once the number of trees exceeds a critical value the results do not improve further, so based on the experiments of the present application n_estimators is set to 10. max_features is the size of the random subset of features considered when splitting a node; the lower the value, the greater the reduction in variance. For the regression problem, ExtraTreesRegressor is configured with max_features=None, max_depth=None, min_samples_split=2, bootstrap=False and oob_score=True, and the optimal parameter values are obtained through cross-validation. The optimal ET model obtained after training achieves higher VOD inversion accuracy.
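A minimal training sketch with scikit-learn is shown below, using the hyperparameter values listed in the text and synthetic stand-in data instead of the CYGNSS, SMAP and MODIS features. One caveat is worth stating: in scikit-learn, out-of-bag scoring is only available when `bootstrap=True`, so the combination `bootstrap=False` with `oob_score=True` raises an error. This sketch therefore leaves `oob_score` at its default, which is a deviation from the listed settings. The helper name `build_et_model` and the synthetic target are hypothetical.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_et_model(n_estimators=10, random_state=0):
    """Standardize features to zero mean / unit variance, then fit extra trees."""
    et = ExtraTreesRegressor(
        n_estimators=n_estimators,   # 10 trees, as chosen in the text
        max_features=None,           # consider all features at each split
        max_depth=None,
        min_samples_split=2,
        bootstrap=False,             # oob_score is left at its default (False)
        random_state=random_state,
    )
    return make_pipeline(StandardScaler(), et)


# Synthetic stand-in: 6 features and a target playing the role of SMAP VOD.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 0.5 + 0.2 * X[:, 0] - 0.1 * X[:, 1] + 0.01 * rng.normal(size=500)

model = build_et_model().fit(X[:400], y[:400])
pred = model.predict(X[400:])
rmse = float(np.sqrt(np.mean((pred - y[400:]) ** 2)))
```

Wrapping the scaler and the regressor in one pipeline keeps the normalization statistics tied to the training data, so the test set is transformed with the same mean and variance.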
By adopting the technical scheme, compared with the prior art, the application has the following beneficial effects.
The application uses an integrated machine learning algorithm to fuse the bistatic radar cross section (Bistatic Radar Cross Section, BRCS), the effective scattering area (Effective Scattering Area), the CYGNSS variable parameters and the surface auxiliary parameters into an extremely randomized trees (Extremely Randomized Trees, ET) integrated machine learning model. The ET regressor draws a threshold at random for each candidate feature and selects the best of these randomly generated thresholds as the splitting rule. With this technical scheme, vegetation optical thickness can be inverted with high precision and high spatio-temporal resolution over global land.
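The splitting rule mentioned above, one random threshold per candidate feature with the best of them kept, can be illustrated for a single tree node. This is a simplified sketch and not the scikit-learn implementation: it scores candidates by variance reduction and draws each threshold uniformly between the feature's minimum and maximum, both of which are assumptions of this example.

```python
import numpy as np


def variance_reduction(y, left_mask):
    """Decrease in target variance obtained by splitting y with left_mask."""
    n = y.size
    n_left = int(left_mask.sum())
    n_right = n - n_left
    if n_left == 0 or n_right == 0:
        return 0.0
    return float(np.var(y)
                 - (n_left / n) * np.var(y[left_mask])
                 - (n_right / n) * np.var(y[~left_mask]))


def extra_trees_split(X, y, rng):
    """Draw one random threshold per feature and return the best (feature, threshold, score)."""
    best = (None, None, -np.inf)
    for j in range(X.shape[1]):
        lo, hi = X[:, j].min(), X[:, j].max()
        if lo == hi:                      # constant feature: nothing to split on
            continue
        threshold = rng.uniform(lo, hi)   # random cut point, not an optimized one
        score = variance_reduction(y, X[:, j] <= threshold)
        if score > best[2]:
            best = (j, threshold, score)
    return best


X_demo = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
y_demo = np.array([0.0, 0.0, 1.0, 1.0])
feature, threshold, score = extra_trees_split(X_demo, y_demo, np.random.default_rng(0))
```

Compared with a classical decision tree, which searches every possible cut point, the random draw makes each tree cheaper to build and decorrelates the trees in the ensemble.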
The following describes the embodiments of the present application in further detail with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. The drawings in the following description are only examples, from which a person skilled in the art can obtain other drawings without inventive effort. In the drawings:
FIG. 1 is a flow chart of the method of the present application.
FIG. 2 is a block diagram of an ET model of the present application.
FIG. 3 is a graph of VOD scatter density for five model inversions in an embodiment of the present application.
FIG. 4 is a graph comparing inversion VOD and SMAP VOD of different machine learning models in an embodiment of the present application.
FIG. 5 is a histogram of the distribution of the deviations between the five model inversion VOD and SMAP VOD in an embodiment of the application.
It should be noted that these drawings and the written description are not intended to limit the scope of the inventive concept in any way, but to illustrate the inventive concept to those skilled in the art by referring to the specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions in the embodiments will be clearly and completely described with reference to the accompanying drawings in the embodiments of the present application, and the following embodiments are used to illustrate the present application, but are not intended to limit the scope of the present application.
Example 1
To verify the advantages of the proposed method, the following are obtained from the CYGNSS satellite-borne GNSS-R data: the quality flags (quality_flags), the power DDM (power_analog), the distance from the GNSS-R receiver to the specular reflection point (rx_to_sp_range), the distance from the GNSS transmitter to the specular reflection point (tx_to_sp_range), the receiving antenna gain of the GNSS-R receiver (sp_rx_gain), the specular point latitude (sp_lat), the specular point longitude (sp_lon), the specular point incidence angle (sp_inc_angle), the equivalent isotropic radiated power of the GPS transmitter (gps_eirp), the DDM signal-to-noise ratio (ddm_snr), the bistatic radar cross section (BRCS), the effective scattering area (eff_scatter), and the like. The surface auxiliary parameters include six parameters: soil moisture (soil_moisture), surface temperature (surface_temperature), vegetation water content (vegetation_water_content), roughness coefficient (roughness_coefficient), vegetation optical thickness (vegetation_opacity_dca), and the quality flag (retrieval_qual_flag). The MODIS data comprise 16 land cover types, specifically: 9 vegetation cover types, such as evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, mixed forest, closed shrubland, woody savanna, open shrubland and savanna; and 7 non-vegetation land cover types, such as water, permanent wetland, cropland, urban and built-up land, cropland/natural vegetation mosaic, and permanent snow and ice. The experimental results of the application are compared with the vegetation optical thickness inverted by the DT, GBDT, AdaBoost and SVM models. The basic configuration of the experimental platform of this embodiment is shown in Table 1:
TABLE 1 configuration of experiment platform
The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm is shown in FIG. 1. The implementation flow of the technical scheme comprises the following steps:
Step S1, acquiring the satellite-borne GNSS-R data of the CYGNSS satellites, SMAP data and MODIS data;
Step S2, preprocessing and quality-controlling all acquired data sets;
Step S3, spatio-temporal matching and division of the data;
Step S4, extracting the CYGNSS reflection point data over vegetation by using MODIS land cover type data;
Step S5, removing the CYGNSS reflection point data over inland open water by using MODIS inland open water data;
Step S6, calculating the CYGNSS surface reflectivity;
Step S7, constructing a satellite-borne GNSS-R vegetation optical thickness inversion model by using the reflected signal images observed by the CYGNSS satellites, the observation variables and the surface auxiliary variable parameters;
Step S8, inputting the test data set into the trained ET model to obtain the inverted vegetation optical thickness values, and comparing and evaluating the results against the Decision Tree (DT), Gradient Boosted Decision Trees (GBDT), Adaptive Boosting (AdaBoost) and Support Vector Machine (SVM) models.
As an implementation of this embodiment, in step S1 the CYGNSS satellite-borne GNSS-R data include the quality flags (quality_flags), the power DDM (power_analog), the distance from the GNSS-R receiver to the specular reflection point (rx_to_sp_range), the distance from the GPS transmitter to the specular reflection point (tx_to_sp_range), the receiving antenna gain of the GNSS-R receiver (sp_rx_gain), the specular point latitude (sp_lat), the specular point longitude (sp_lon), the specular point incidence angle (sp_inc_angle), the equivalent isotropic radiated power of the GPS transmitter (gps_eirp), the DDM signal-to-noise ratio (ddm_snr), the bistatic radar cross section (BRCS), the effective scattering area (eff_scatter), and the like. The surface auxiliary parameters include six parameters: soil moisture (soil_moisture), surface temperature (surface_temperature), vegetation water content (vegetation_water_content), roughness coefficient (roughness_coefficient), vegetation optical thickness (vegetation_opacity_dca), and the quality flag (retrieval_qual_flag). The MODIS data comprise 16 land cover types, specifically: 9 vegetation cover types, such as evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, mixed forest, closed shrubland, woody savanna, open shrubland and savanna; and 7 non-vegetation land cover types, such as water, permanent wetland, cropland, urban and built-up land, cropland/natural vegetation mosaic, and permanent snow and ice.
As an implementation of this embodiment, preferably, the quality control of the data set in step S2 includes: first, deleting all observations containing NaN values; second, discarding all observations less than 0; third, requiring the range-corrected gain (RCG) value to be greater than 3; fourth, discarding samples whose receiving antenna gain in the direction of the reflection point (sp_rx_gain) is less than 0 dBi; fifth, discarding samples whose BRCS uncertainty (ddm_brcs_uncert) is greater than 1; sixth, discarding samples whose incidence angle is greater than 65°; seventh, discarding samples whose signal-to-noise ratio is less than or equal to 0 dB. In addition, to keep the specular point errors caused by terrain within a reasonable range, samples whose DDM peak power lies outside delay rows [7, 10] are removed, and low-accuracy samples are removed according to the quality flags (quality_flags) provided in the data. The variable quality_flags provides a total of 31 quality flags, such as: "overall poor quality data", "whether the S-band transmitter is powered on", "angular range of roll or pitch or yaw of the spacecraft", "whether the DDMI was reconfigured during the DDM sampling period", "the CRC of the DDM transmitted from the DDMI to the spacecraft computer is invalid", "whether the DDM is a test pattern generated by the DDMI", "whether the reflection channel tracks a PRN", "whether the point is on land", and so forth. Each flag is set to 1 when its condition is met and to 0 otherwise; the 31 bits are arranged in a fixed order to form a 31-bit binary number, which is converted to a decimal number and stored in the variable.
During data filtering, individual QC bits are applied rather than the overall QC bit ("overall poor quality", the least significant bit, bit 0). The flag sp_over_land (bit 11) is retained so that ocean data are rejected and land data are kept. Apart from this, samples are deleted where the DDM is invalid or abnormal (bits 4, 7, 8, 9, 10, 15, 17 and 18); where the instrument has data transmission or calibration problems (bits 5, 6, 23, 26); where the spacecraft attitude error is large or the attitude is abnormal (bit 2); or where the flight software made errors in computing the variables fsw_comp_delay_shift and fsw_comp_dopp_shift (bit 27).
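The threshold-based part of the quality control can be sketched as a single boolean mask. The example assumes that samples with an incidence angle above 65° are the ones discarded, that the "less than 0" rule is applied to the BRCS observable, and that the inputs arrive as a dictionary of equally sized arrays. The dictionary keys `rcg` and `peak_delay_row` are hypothetical names introduced for this illustration.

```python
import numpy as np


def threshold_qc(obs):
    """Return a boolean keep-mask for a dict of equally sized 1-D observation arrays."""
    cols = {k: np.asarray(v, dtype=float) for k, v in obs.items()}
    stacked = np.vstack(list(cols.values()))
    keep = ~np.isnan(stacked).any(axis=0)            # 1) no NaN in any variable
    with np.errstate(invalid="ignore"):
        keep &= cols["brcs"] >= 0                    # 2) no negative observations
        keep &= cols["rcg"] > 3                      # 3) RCG above 3
        keep &= cols["sp_rx_gain"] >= 0              # 4) antenna gain not below 0 dBi
        keep &= cols["ddm_brcs_uncert"] <= 1         # 5) BRCS uncertainty at most 1
        keep &= cols["sp_inc_angle"] <= 65           # 6) incidence angle at most 65 deg
        keep &= cols["ddm_snr"] > 0                  # 7) SNR above 0 dB
        row = cols["peak_delay_row"]
        keep &= (row >= 7) & (row <= 10)             # peak within delay rows [7, 10]
    return keep


sample = {
    "brcs": [1.0e9, 2.0e9, 3.0e9],
    "rcg": [5.0, 6.0, 7.0],
    "sp_rx_gain": [3.0, 4.0, 5.0],
    "ddm_brcs_uncert": [0.5, 0.4, 0.3],
    "sp_inc_angle": [30.0, 40.0, 70.0],
    "ddm_snr": [2.0, np.nan, 3.0],
    "peak_delay_row": [8, 8, 9],
}
sample_keep = threshold_qc(sample)
```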
As an implementation of this embodiment, step S3 comprises the following sub-steps:
Step S3.1, taking the longitude and latitude of the 36 km CYGNSS grid center as the reference, extending outwards to form a 10 km × 10 km grid, interpolating the soil moisture, surface temperature, roughness coefficient, vegetation optical thickness and vegetation water content data with the nearest neighbor method, and keeping the specular reflection points that fall within the grid, finally obtaining spatially matched CYGNSS satellite-borne GNSS-R and SMAP data.
Step S3.2, taking the longitude and latitude of the 36 km CYGNSS grid center as the reference, extending outwards to form a 6 km × 6 km grid, and retaining the MODIS land cover type within the grid with the nearest neighbor method, the nearest value being selected as the optimal value.
Step S3.3, randomly dividing the spatially matched data set into a training set, a validation set and a test set, which account for 60%, 10% and 30% of the filtered data set respectively.
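The nearest-neighbor matching of steps S3.1 and S3.2 can be sketched as follows. The example assigns to each specular point the value of the nearest grid center and keeps the point only if it falls inside a square window around that center (half-width 5 km for the 10 km × 10 km case). It uses a brute-force search with a flat-earth distance approximation and ignores longitude wrap-around, all of which are simplifying assumptions. The function and argument names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def match_nearest(sp_lat, sp_lon, grid_lat, grid_lon, grid_val, half_width_km=5.0):
    """Nearest-neighbor value for each specular point, NaN if outside the window."""
    sp_lat, sp_lon = np.asarray(sp_lat, float), np.asarray(sp_lon, float)
    grid_lat, grid_lon = np.asarray(grid_lat, float), np.asarray(grid_lon, float)
    grid_val = np.asarray(grid_val, float)

    # North-south and east-west offsets in km, shape (n_points, n_cells).
    dy = EARTH_RADIUS_KM * np.radians(sp_lat[:, None] - grid_lat[None, :])
    dx = (EARTH_RADIUS_KM * np.radians(sp_lon[:, None] - grid_lon[None, :])
          * np.cos(np.radians(grid_lat))[None, :])

    nearest = np.argmin(dx ** 2 + dy ** 2, axis=1)   # index of the closest cell
    rows = np.arange(sp_lat.size)
    inside = ((np.abs(dx[rows, nearest]) <= half_width_km)
              & (np.abs(dy[rows, nearest]) <= half_width_km))
    matched = np.where(inside, grid_val[nearest], np.nan)
    return matched, inside


matched_vod, inside_window = match_nearest(
    sp_lat=[0.01, 0.0, 0.0], sp_lon=[0.01, 0.98, 0.5],
    grid_lat=[0.0, 0.0], grid_lon=[0.0, 1.0], grid_val=[0.2, 0.5],
)
```

For global data volumes a spatial index such as a k-d tree would replace the brute-force distance matrix, but the selection logic stays the same.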
As an implementation of this embodiment, in step S4 the CYGNSS reflection point data over vegetation are extracted by using the MODIS land cover type data: reflection points over vegetation are retained and reflection points over non-vegetation types are eliminated.
Preferably, in step S5 the CYGNSS reflection points near inland water bodies are removed by using the MODIS inland open water data, to eliminate the interference of water bodies with the reflected signal.
As an implementation of this embodiment, the surface reflectivity described in step S6 is calculated with a calibration equation in which $P_r$ is the DDM peak power; $P_t G_t$ is the equivalent isotropic radiated power of the transmitter; $P_t$ is the transmit power of the right-hand circularly polarized navigation signal of the GNSS satellite; $G_t$ is the gain of the transmitting antenna; $G_r$ is the gain of the receiver antenna; $\lambda$ is the carrier wavelength of the transmitted signal; and $R_t$ and $R_r$ are the distances from the transmitter and from the receiver to the specular reflection point, respectively. The parameters used in the calibration equation can be obtained from the CYGNSS product files, where $P_t G_t$ takes the equivalent isotropic radiated power (EIRP) provided in the CYGNSS product. $N$ is the DDM noise floor, defined as the average power over a specified noise region,
where $\tau_1$ and $\tau_i$ are the delay bin boundaries of the specified noise region; $f_1$ and $f_i$ are the Doppler frequency boundaries of the specified noise region; $M$ is the number of pixels in the noise region; and $\mathrm{DDM}(\tau, f)$ is the DDM power value at the specified position.
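The two equations themselves are not reproduced in this text. A form consistent with the symbols defined above, assuming the coherent-reflection model commonly applied to CYGNSS land observations, might read as follows. The symbol $\Gamma$ for the surface reflectivity and the exact summation limits are assumptions made here, not notation confirmed by the source.

```latex
\Gamma = \frac{(4\pi)^2 \,(P_r - N)\,(R_r + R_t)^2}{\lambda^2 \, G_r \, G_t \, P_t},
\qquad
N = \frac{1}{M} \sum_{\tau=\tau_1}^{\tau_i} \sum_{f=f_1}^{f_i} \mathrm{DDM}(\tau, f)
```

In this form the product $G_t P_t$ in the denominator is the EIRP read from the CYGNSS product, and subtracting $N$ removes the noise floor from the peak power before calibration.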
In step S7, the reflected signal images observed by the CYGNSS satellites include the BRCS and the effective scattering area. The observation variables include the GNSS-R observation parameters, the surface reflectivity and the surface auxiliary variable parameters.
As shown in Fig. 2, the training process for inverting the global vegetation optical thickness with the ET model method described in step S8 is specifically as follows:
All acquired data sets are preprocessed, each feature of the filtered data set is normalized to zero mean and unit variance, and the resulting filtered data set is divided into a training set, a validation set and a test set;
the training set is used to train the ET model, and the test set is used to evaluate its performance. Through an extremely randomized trees (extra-trees) regressor, the ET model randomly draws a threshold for each candidate feature and selects the best of these randomly generated thresholds as the splitting rule. SMAP VOD data are input to train the ET model, and the optimal ET model is saved to complete training; the test set is then input into the trained ET model to obtain the vegetation optical thickness values, and the results are evaluated. The root mean square error (RMSE), Pearson correlation coefficient (CC), mean absolute error (MAE) and mean absolute percentage error (MAPE) are used as indices for evaluating the inversion performance of the model, and the calculation formulas are as follows:
In the above formulas, $n$ is the number of samples; $X_i$ and $Y_i$ are the model-inverted VOD and the SMAP reference VOD values, respectively; and $\bar{X}$ and $\bar{Y}$ are the mean of the inverted VOD and the mean of the reference VOD, respectively.
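The four evaluation indices can be sketched with their standard textbook definitions, using the symbols above (x for the inverted VOD, y for the SMAP reference VOD). The function name `vod_metrics` is hypothetical, and taking the reference value as the MAPE denominator is an assumption, since the formula images are not reproduced here.

```python
import math


def vod_metrics(x, y):
    """Return RMSE, CC, MAE and MAPE for inverted VOD x against reference VOD y."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    errors = [xi - yi for xi, yi in zip(x, y)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Assumption: MAPE is normalized by the reference value, which must be non-zero.
    mape = 100.0 * sum(abs(e) / abs(yi) for e, yi in zip(errors, y)) / n
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    cov = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    var_x = sum((xi - x_mean) ** 2 for xi in x)
    var_y = sum((yi - y_mean) ** 2 for yi in y)
    cc = cov / math.sqrt(var_x * var_y)  # Pearson correlation coefficient
    return {"RMSE": rmse, "CC": cc, "MAE": mae, "MAPE": mape}


# Synthetic values for illustration only, not data from the experiments.
example = vod_metrics([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The correlation coefficient is undefined when either series is constant, so a practical implementation would need to guard against a zero variance.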
The trained ET model is used to perform global VOD inversion, and the experimental results of the application are compared with those of the DT, GBDT, AdaBoost and SVM models. The final results are shown in Table 2, which gives the inversion accuracy statistics of the different inversion methods. The following conclusions can be drawn from the table:
The ET model achieves the best correlation coefficient, and its performance is clearly superior to that of the AdaBoost, GBDT and SVM models. Compared with these three models, the ET model improves the RMSE by 178.12%, 67.87% and 85.26%, the MAE by 87.50%, 80.00% and 92.31%, and the MAPE by 87.50%, 80.00% and 92.31%, respectively.
Table 2. Accuracy of the different models in inverting vegetation optical thickness on the test data set
In addition, Fig. 3 compares how well the inverted vegetation optical thickness correlates with the SMAP vegetation optical thickness product. The VOD inverted by the ET model of this embodiment correlates better with SMAP VOD than the VOD inverted by the DT, AdaBoost, SVM and GBDT models: for the ET model most points are distributed around the line y = x with relatively few scattered points, which indicates that the correlation between the ET-inverted vegetation optical thickness and SMAP VOD is the best. In contrast, the VOD inverted by the AdaBoost and SVM models correlates worst with SMAP VOD, which suggests that these two models need more auxiliary parameters to obtain better results. The ET model provided by this embodiment fuses the surface reflectivity, the surface auxiliary parameters and the CYGNSS spaceborne GNSS-R variable parameters, which improves the robustness, stability and generality of the model.
As shown in Fig. 4, the ET model provided in this embodiment achieves higher precision in inverting the global vegetation optical thickness and is more consistent with the SMAP global VOD. The RMSE values of AdaBoost, DT, ET, SVM and GBDT are 0.098, 0.027, 0.021, 0.067 and 0.145, respectively. The ET model is the most accurate, improving on AdaBoost, DT, SVM and GBDT by 78.57%, 22.22%, 68.66% and 85.52%, respectively.
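The improvement percentages quoted for Fig. 4 follow from the RMSE values as a relative error reduction, (baseline - ET) / baseline. The short check below reproduces them; the helper name `relative_improvement` is illustrative.

```python
# RMSE values quoted in the text for Fig. 4.
rmse = {"AdaBoost": 0.098, "DT": 0.027, "ET": 0.021, "SVM": 0.067, "GBDT": 0.145}


def relative_improvement(baseline, candidate):
    """Percentage reduction in error of the candidate relative to the baseline."""
    return 100.0 * (baseline - candidate) / baseline


improvements = {
    name: round(relative_improvement(value, rmse["ET"]), 2)
    for name, value in rmse.items()
    if name != "ET"
}
# improvements == {"AdaBoost": 78.57, "DT": 22.22, "SVM": 68.66, "GBDT": 85.52}
```

Under this definition an improvement can never exceed 100%, because the error cannot be reduced by more than its full value.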
Fig. 5 shows the distribution histogram of the deviations between the ET-inverted VOD and SMAP VOD. The figure gives the mean deviation (μ), the standard deviation (σ), the mean absolute error (MAE) and the 80% quantile of the deviation (Qua); the blue bars represent the error distribution, the red dashed line represents the fitted probability density function of the error, and the green dashed line marks zero deviation between the ET-inverted VOD and SMAP VOD. As can be seen from the figure, the deviations between the ET-inverted VOD and SMAP VOD are highly concentrated (80% of the deviations are less than 0.08) and are mostly distributed around 0. Compared with the AdaBoost and SVM models, the extremely randomized trees ensemble machine learning method provided by the application has clear advantages in inverting the global vegetation optical thickness.
The foregoing describes only a preferred embodiment of the present application and does not limit the application in any form. Any simple modification, equivalent variation or adaptation of the above embodiment made by those skilled in the art without departing from the technical solution of the application falls within the scope of the application.

Claims (9)

1. A method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm, characterized by comprising the following steps:
step S1, acquiring the spaceborne GNSS-R data of the CYGNSS satellites, the SMAP surface auxiliary parameters and the MODIS data, respectively;
step S2, preprocessing and quality controlling all acquired data sets;
step S3, data space-time matching and data division;
step S4, extracting the CYGNSS reflection point data over vegetation by using the MODIS land cover type data;
step S5, removing the CYGNSS reflection point data over inland open water by using the MODIS inland open water data;
step S6, calculating the CYGNSS surface reflectivity;
step S7, constructing a satellite-borne GNSS-R vegetation optical thickness inversion model by using the reflected signal images observed by the CYGNSS satellites, the observation variables and the surface auxiliary variable parameters;
and step S8, inputting the test data set into the trained ET model to obtain the inverted vegetation optical thickness values, and comparing and evaluating the results against decision tree, gradient boosting decision tree, adaptive boosting (AdaBoost) and support vector machine models.
2. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein in step S1:
the satellite-borne GNSS-R data comprise the quality flags, the power DDM, the distance from the GNSS-R receiver to the specular reflection point, the distance from the GNSS transmitter to the specular reflection point, the receiving antenna gain of the GNSS-R receiver, the latitude of the specular reflection point, the longitude of the specular reflection point, the incidence angle at the specular reflection point, the equivalent isotropic radiated power of the GPS transmitter, the DDM signal-to-noise ratio, the bistatic radar cross section and the effective scattering area;
the surface auxiliary parameters comprise soil moisture, surface temperature, vegetation water content, roughness coefficient, vegetation optical thickness and the quality flags of the ascending and descending orbits;
the MODIS data includes 9 vegetation cover types, 7 non-vegetation land cover types, and the 9 vegetation cover types include: evergreen conifer, evergreen broadleaf, deciduous conifer, hybrid, shrub, multi-tree grassland, sparse shrub, sparse grassland; the 7 types of non-vegetation land cover include bodies of water, permanent wetlands, cultivated lands, cities and buildings, farmland/vegetation hybrids, permanent ice and snow, and barren lands.
3. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein the preprocessing and quality control of all acquired data sets in step S2 comprises:
step S2.1, deleting all observed values containing NaN values;
step S2.2, discarding all observed values smaller than 0;
step S2.3, the RCG value should be greater than 3;
step S2.4, discarding samples whose receiving antenna gain in the direction of the reflection point (sp_rx_gain) is less than 0 dBi;
step S2.5, discarding samples whose BRCS uncertainty (ddm_brcs_uncert) is greater than 1;
step S2.6, the incidence angle of the sampling point needs to be larger than 65 degrees;
step S2.7, discarding sampling points with a signal-to-noise ratio smaller than or equal to 0 dB; in order to ensure that terrain-induced errors in the specular reflection points remain within a reasonable range, sampling points whose DDM peak power lies outside delay rows [7, 10] are removed; in addition, sampling points with low precision are removed according to the quality flags (quality_flags) provided in the data.
4. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein the data space-time matching and data division of step S3 comprises the following steps:
step S3.1, taking the center longitude and latitude of the 36 km CYGNSS grid as the reference, extending outwards to form a 10 km × 10 km grid, interpolating the soil moisture, surface temperature, roughness coefficient, vegetation optical thickness and vegetation water content data by the nearest-neighbor method, and retaining the specular reflection points falling within the grid, to finally obtain spatially matched CYGNSS spaceborne GNSS-R and SMAP data;
step S3.2, taking the center longitude and latitude of the 36 km CYGNSS grid as the reference, extending outwards to form a 6 km × 6 km grid, retaining the MODIS land use/cover type within the grid by the nearest-neighbor method, and selecting the nearest value as the optimal value;
step S3.3, randomly dividing the spatially matched data set into a training set, a validation set and a test set, which account for 60%, 10% and 30% of the filtered data set, respectively.
5. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein in step S4 the CYGNSS reflection point data over vegetation are extracted by using the MODIS land cover type data, the reflection point data over vegetation are retained, and the reflection point data over non-vegetation types are removed.
6. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein in step S5 the CYGNSS reflection points near inland water bodies are removed by using the MODIS inland water data, so as to eliminate the interference of water bodies with the reflected signal.
7. The method for inverting vegetation optical thickness by combining on-board GNSS-R data and integrated machine learning algorithm according to claim 1, wherein the calculation of the surface reflectivity in step S6 is as follows:
where $P_r$ is the DDM peak power; $N$ is the DDM noise floor; $P_t G_t$ is the equivalent isotropic radiated power of the transmitter; $P_t$ is the transmit power of the right-hand circularly polarized navigation signal of the GNSS satellite; $G_t$ is the gain of the transmitting antenna; $G_r$ is the gain of the receiver antenna; $\lambda$ is the carrier wavelength of the transmitted signal; and $R_t$ and $R_r$ are the distances from the transmitter and from the receiver to the specular reflection point, respectively.
8. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein the reflected signal images observed by the CYGNSS satellites in step S7 include the BRCS and the effective scattering area; the observation variables include the GNSS-R observation parameters, the surface reflectivity and the surface auxiliary variable parameters.
9. The method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and an integrated machine learning algorithm according to claim 1, wherein the training process of the ET model for inverting the global vegetation optical thickness in step S8 is specifically implemented as follows:
step S8.1, taking the divided training set as the input of the ET model;
step S8.2, inputting the reflected signal images, the surface reflectivity, the GNSS-R observation parameters and the surface auxiliary variable parameters;
step S8.3, setting the optimal hyperparameters of the ET model: n_estimators=10, criterion="mse", default=2, max_features=None, max_depth=None, min_samples_split=2, bootstrap=False, oob_score=True;
step S8.4, inputting the SMAP VOD into the ET model for training;
and step S8.5, after training is completed, inputting the test data set into the trained ET model to obtain the inverted vegetation optical thickness values, and evaluating the results.
CN202310918757.8A 2023-07-25 2023-07-25 Method for inverting vegetation optical thickness by combining satellite-borne GNSS-R data and integrated machine learning algorithm Active CN116881721B (en)

Publications (2)

Publication Number Publication Date
CN116881721A true CN116881721A (en) 2023-10-13
CN116881721B CN116881721B (en) 2024-01-02

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant