GB2620469A - Spatial prediction and evaluation method of soil organic matter content based on partition algorithm - Google Patents
Spatial prediction and evaluation method of soil organic matter content based on partition algorithm
- Publication number
- GB2620469A (application GB2305398.6A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- soil
- organic matter
- partition
- soil organic
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G01N21/255—Details, e.g. use of specially adapted sources, lighting or optical systems
- G06T7/0002—Inspection of images, e.g. flaw detection
- G01N21/27—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands using photo-electric detection; circuits for computing concentration
- G01N33/245—Earth materials for agricultural purposes
- G06V10/58—Extraction of image or video features relating to hyperspectral data
- G06V10/763—Non-hierarchical techniques, e.g. based on statistics of modelling distributions
- G06V10/766—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
- G06V20/194—Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
- G01N2021/1797—Remote sensing in landscape, e.g. crops
- G06T2207/10032—Satellite or aerial image; Remote sensing
- G06T2207/10036—Multispectral image; Hyperspectral image
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30188—Vegetation; Agriculture
- G06T7/11—Region-based segmentation
Abstract
Disclosed is a spatial prediction and evaluation method of soil organic matter content based on a partition algorithm. First, soil samples are collected and pre-processed to produce soil sample data. Images of the collection area, e.g. satellite images, are then captured and pre-processed and used to construct spectral indexes. A soil organic matter prediction model is then constructed based on a partition algorithm. The method maps soil organic matter content at high spatial resolution from the spectral characteristics of synthetic bare-soil images from different periods, and a random forest regression model of the relationship between the spectral characteristics and soil organic matter content is obtained by training on the measured soil samples.
Description
SPATIAL PREDICTION AND EVALUATION METHOD OF SOIL ORGANIC MATTER CONTENT BASED ON PARTITION ALGORITHM
TECHNICAL FIELD
The application relates to the technical field of soil organic matter content, and in particular to a spatial prediction and evaluation method of soil organic matter content based on a partition algorithm.
BACKGROUND
Soil organic matter is an important indicator of soil fertility and the main source of nutrients for crop growth, and is therefore of great significance to agricultural production and food security. Conventional measurement of soil organic matter depends mainly on manual sampling and a large number of fixed-point observations. These methods are time-consuming and laborious, and usually yield only point data on soil organic matter, which cannot reflect its overall regional distribution. Combining them with spatial information technology to obtain fine-scale spatial distribution information on soil organic matter is therefore the direction of future development.
Because remote sensing images are timely and easy to acquire, they have been widely used to map the spatial distribution of soil organic matter. Soil organic matter content is correlated with spectral reflectance, which provides a theoretical basis for mapping soil organic matter from multispectral images. However, current prediction of soil organic matter content based on reflectance spectroscopy has the following problems: (1) spectral reflectance, the first-order differential of reflectance, the reciprocal logarithm of reflectance, the first-order differential of the reciprocal logarithm of reflectance and other transformations of spectral reflectance are often used as estimation model parameters; (2) neural networks, the competitive adaptive reweighted sampling (CARS) algorithm, the bat algorithm (BA), multiple linear regression and other methods are used to estimate soil organic matter content. These model parameters and methods are relatively complex, with many calculation parameters and long calculation times, so they may be difficult to apply in practice. Moreover, most of these methods estimate the content from the whole spectral image, so there is a risk that the model prediction deviates greatly when the soil in the target area varies widely. There is therefore an urgent need for a new method of estimating soil organic matter content that uses simple parameters, is easy to model, and can partition the target area.
To address the above problems, the present method is designed as an improvement on the existing methods of estimating soil organic matter.
SUMMARY
The application aims to provide a spatial prediction and evaluation method of soil organic matter content based on a partition algorithm, which avoids the time-consuming and laborious nature of conventional laboratory chemical analysis. Compared with other spectral technologies, the partition algorithm may avoid prediction errors caused by insufficient data and regional microenvironment differences, and so improve the accuracy of soil organic matter content prediction. In addition, the partition algorithm speeds up the processing of image spectral information and may greatly reduce the calculation time.
In order to achieve the above objectives, the application provides the following technical scheme: a spatial prediction and evaluation method of soil organic matter content based on partition algorithm, specifically including following steps: step A, collecting and preprocessing soil sample data; step B, collecting and preprocessing images; step C, constructing a spectral index; and step D, constructing a soil organic matter prediction model based on a partition algorithm.
Preferably, collecting and preprocessing soil sample data in the step A is to divide a target agricultural land into a plurality of 30 meters × 30 meters grids, collect 5-6 sub-samples in each grid and mix them into one mixed sample, and record the central position of each grid with a hand-held Global Positioning System (GPS) receiver. The sampling points cover all soil types in the target agricultural land.
By adopting the above technical scheme, each mixed sample is naturally air-dried, cleared of impurities, ground and passed through a 2 millimeter sieve; the total organic carbon (TOC) content of the sample is then determined by a method such as the potassium dichromate method, and the TOC content is multiplied by a conversion factor to obtain the soil organic matter content.
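The final conversion step can be illustrated with a minimal sketch. The text does not state the value of the conversion factor; the value 1.724 used here (the commonly cited Van Bemmelen factor, which assumes organic matter is about 58% carbon) is an assumption, as are the function name, the g/kg units and the sample values.

```python
# Assumed conversion factor (Van Bemmelen); the source text does not give a value.
VAN_BEMMELEN_FACTOR = 1.724


def toc_to_som(toc_g_per_kg, factor=VAN_BEMMELEN_FACTOR):
    """Convert total organic carbon (g/kg) to soil organic matter (g/kg)."""
    if toc_g_per_kg < 0:
        raise ValueError("TOC content cannot be negative")
    return toc_g_per_kg * factor


# Hypothetical TOC measurements for three mixed samples.
toc_samples = [10.0, 12.5, 8.0]
som_samples = [toc_to_som(t) for t in toc_samples]
```

A different factor can be passed through the `factor` argument if the laboratory protocol in use specifies one.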
Preferably, collecting and preprocessing images in the step B includes selecting atmospherically corrected surface reflectance (SR) images of Landsat-8 or Sentinel-2 of the target agricultural land in all bare soil periods as original images through the Google Earth Engine (GEE) platform, synthesizing the multi-period SR images, and obtaining relatively stable soil pixels from the synthesized images to predict the organic matter content.
Preferably, the Landsat-8 images are preprocessed as follows: the Landsat-8 data are stored in the USGS Landsat-8 SR database in the GEE platform and include 5 visible light and VNIR bands, 2 SWIR bands and 1 thermal infrared band; the bands are atmospherically corrected by the LaSRC algorithm, and the spatial resolution is 30 meters; the pixel_QA band of the Landsat-8 SR product is then used as a cloud mask to generate cloud-free Landsat-8 images for the bare soil period, and all the Landsat-8 SR images in the bare soil period are synthesized, so that relatively stable soil pixels may be obtained to predict the organic matter content.
Preferably, the Sentinel-2 images are preprocessed as follows: the Sentinel-2 data are stored in the Sentinel-2 MSI Level-2A database in the GEE platform and include 4 VNIR bands, 2 SWIR bands and 4 red edge bands; the bands are atmospherically corrected by the Sen2Cor algorithm, the spatial resolution of the VNIR bands is 10 meters, and the spatial resolution of the SWIR bands and the red edge bands is 20 meters; the QA60 band of the Sentinel-2 SR product is then used as the cloud mask to generate cloud-free Sentinel-2 images for the bare soil period, and all the Sentinel-2 SR images in the bare soil period are synthesized, so that relatively stable soil pixels may be obtained to predict the organic matter content.
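The masking and synthesis of step B can be sketched per pixel in plain Python, outside the GEE platform. Two assumptions are made here. First, the text does not say how the multi-period images are synthesized, so a median over the cloud-free observations is used as one common choice. Second, the bit positions (bits 3 and 5 of pixel_QA for cloud shadow and cloud, bits 10 and 11 of QA60 for opaque cloud and cirrus) follow the published band descriptions of these products and should be checked against the catalog entry actually used. All function names and values are hypothetical.

```python
from statistics import median

# Bit positions treated as cloud-contaminated in each QA band (assumed, see text).
LANDSAT8_PIXEL_QA_BITS = (3, 5)    # cloud shadow, cloud
SENTINEL2_QA60_BITS = (10, 11)     # opaque cloud, cirrus


def is_clear(qa_value, cloud_bits):
    """Return True when none of the cloud-related bits are set."""
    return all((qa_value >> bit) & 1 == 0 for bit in cloud_bits)


def composite_pixel(reflectances, qa_values, cloud_bits):
    """Median of the cloud-free observations of one pixel in one band.

    Returns None when every observation is masked.
    """
    clear = [r for r, qa in zip(reflectances, qa_values) if is_clear(qa, cloud_bits)]
    return median(clear) if clear else None


# One pixel observed on four bare-soil dates (hypothetical values).
reflectance_series = [0.21, 0.55, 0.19, 0.23]
qa_series = [0, 1 << 5, 0, 0]      # the second date is flagged as cloud
stable_value = composite_pixel(reflectance_series, qa_series, LANDSAT8_PIXEL_QA_BITS)
```

The bright second observation is discarded by the mask, so the composite value reflects only the clear dates.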
Preferably, constructing a spectral index in the step C includes constructing the normalized difference index (NDI), ratio index (RI) and difference index (DI) to predict the soil organic matter content; for multispectral images with only a few spectral bands, NDI, RI and DI may provide more information than other indexes. NDI, RI and DI are calculated by the following formulae:
$$\mathrm{NDI} = \frac{P_i - P_j}{P_i + P_j}, \qquad \mathrm{RI} = \frac{P_i}{P_j}, \qquad \mathrm{DI} = P_i - P_j$$
wherein $P_i$ represents the reflectivity of the i-th band and $P_j$ represents the reflectivity of the j-th band; the required spectral indexes are generated for the Landsat-8 synthetic images or Sentinel-2 synthetic images by using the above formulae.
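The three indexes can be computed for every band pair of a composite pixel as in the following sketch. The band names, the feature-naming scheme and the reflectance values are hypothetical, and the code assumes non-zero reflectances so that the divisions are defined.

```python
from itertools import combinations


def spectral_indices(p_i, p_j):
    """Return (NDI, RI, DI) for the reflectances of band i and band j."""
    ndi = (p_i - p_j) / (p_i + p_j)
    ri = p_i / p_j
    di = p_i - p_j
    return ndi, ri, di


def all_pair_indices(bands):
    """Compute the three indexes for every unordered band pair.

    `bands` maps a band name to the reflectance of one composite pixel.
    """
    features = {}
    for a, b in combinations(sorted(bands), 2):
        ndi, ri, di = spectral_indices(bands[a], bands[b])
        features[f"NDI_{a}_{b}"] = ndi
        features[f"RI_{a}_{b}"] = ri
        features[f"DI_{a}_{b}"] = di
    return features


# Hypothetical reflectances of one pixel in three bands.
pixel = {"B4": 0.25, "B5": 0.75, "B6": 0.5}
features = all_pair_indices(pixel)
```

With three bands there are three pairs and therefore nine index features, which can be appended to the raw band values as model inputs.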
Preferably, a local regression model based on soil type partition is used in the construction of the soil organic matter prediction model based on the partition algorithm in the step D. According to the results of the soil survey in China, the target agricultural land is divided into different soil types, and the organic matter content of each soil type is predicted. By combining the predicted results with the measured soil results from the step A, a local regression model based on soil type partition is obtained.
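The idea of a local model per partition can be sketched as follows. For brevity, an ordinary least-squares line on a single spectral feature stands in for the regression model; this simplification, the soil type labels and the sample values are assumptions for illustration only.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_local_models(samples):
    """Fit one model per partition.

    `samples` is a list of (partition_label, feature, measured_som) tuples.
    """
    grouped = defaultdict(list)
    for label, x, y in samples:
        grouped[label].append((x, y))
    return {label: fit_line([x for x, _ in pts], [y for _, y in pts])
            for label, pts in grouped.items()}


def predict(models, label, x):
    """Predict with the model that belongs to the pixel's partition."""
    intercept, slope = models[label]
    return intercept + slope * x


# Hypothetical samples: (soil type, spectral feature, measured SOM in g/kg).
samples = [
    ("loam", 0.10, 30.0), ("loam", 0.20, 25.0), ("loam", 0.30, 20.0),
    ("sand", 0.10, 12.0), ("sand", 0.20, 11.0), ("sand", 0.30, 10.0),
]
models = fit_local_models(samples)
```

Each soil type gets its own coefficients, so the same spectral value maps to different organic matter contents in different partitions.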
Preferably, a local regression model based on K-means partition is used in the construction of the soil organic matter prediction model based on the partition algorithm in the step D. The cascade simple K-means algorithm built into the GEE platform is used to select the best partition of the data. The synthetic image to be segmented and the measured soil sample data from the step A are input into GEE as training samples, and the soil organic matter content in the different partitions is predicted. The local regression model based on K-means partition is obtained by combining the predicted results with the measured soil results.
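The partition selection can be sketched outside GEE with a plain K-means loop that tries several cluster counts and keeps the one with the highest Calinski-Harabasz score, which is the criterion the cascade simple K-means clusterer is documented to use. This is an illustrative stand-in and not the GEE implementation; the function names, the default range of cluster counts and the sample points are assumptions.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = random.Random(seed).sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster lost all of its members.
            updated.append(mean_point(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def calinski_harabasz(points, labels, centroids):
    """Between-cluster over within-cluster dispersion (higher is better)."""
    n, k = len(points), len(centroids)
    overall = mean_point(points)
    counts = [labels.count(c) for c in range(k)]
    between = sum(counts[c] * sq_dist(centroids[c], overall) for c in range(k))
    within = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def best_partition(points, k_min=2, k_max=5):
    """Try each k in [k_min, k_max] and keep the highest-scoring clustering."""
    best = None
    for k in range(k_min, k_max + 1):
        labels, centroids = kmeans(points, k)
        score = calinski_harabasz(points, labels, centroids)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best[1], best[2]


# Hypothetical two-band pixel values forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
k, labels = best_partition(points, k_min=2, k_max=3)
```

The resulting labels play the role of the partitions within which the local regression models are then trained. `k_max` must stay below the number of points for the score to be defined.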
Preferably, a random forest regression model is used in the construction of the soil organic matter prediction model based on the partition algorithm in the step D, and is calculated by the RF algorithm built into the GEE platform. All bands and spectral indexes of Landsat-8 or Sentinel-2 are taken as independent variables and the soil organic matter content is defined as the dependent variable; bootstrap sampling is used to randomly select a certain number of samples from the original soil data set to produce a new training data set; each tree model is then established, the optimal number of segmentation nodes is determined from the error, and the average of the predictions of all the trees is the final predicted value.
Preferably, the accuracy of the soil organic matter prediction model based on the partition algorithm is verified by taking 75% of the soil samples in the step A as training samples and 25% as verification samples, and evaluating the prediction accuracy by the coefficient of determination (R²) and root mean square error (RMSE) between the predicted and measured values; the greater R² and the smaller RMSE, the higher the prediction accuracy of the model. R² and RMSE are calculated as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
where $n$ is the number of samples, $y_i$ is the measured value of soil organic matter for sample i, $\hat{y}_i$ is the value of soil organic matter predicted by the model for sample i, and $\bar{y}$ is the mean of the measured values.
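The accuracy verification can be sketched as below. The 75%/25% proportion comes from the text; the random shuffling with a fixed seed, the function names and the sample values are assumptions made for illustration.

```python
import math
import random


def train_validation_split(samples, train_fraction=0.75, seed=42):
    """Shuffle a copy of the samples and split it into (training, verification)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / n)


# Hypothetical sample identifiers and verification results (g/kg).
training, verification = train_validation_split(list(range(8)))
measured = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]
score_r2 = r_squared(measured, predicted)
score_rmse = rmse(measured, predicted)
```

For these values the residual sum of squares is 10 and the total sum of squares is 125, so R² is 0.92 and RMSE is the square root of 2.5.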
Compared with the prior art, the application has the following beneficial effects.
First, the GEE platform can synchronize products of different periods and levels produced by the United States Geological Survey (USGS) and the European Space Agency (ESA), and provides multi-temporal spectral image processing functions. Its powerful parallel data processing capability may greatly simplify and accelerate calculation and operation steps that were originally very complicated.
Second, the partition algorithm may avoid the prediction errors caused by insufficient data and regional microenvironment differences, and so improve the prediction accuracy of soil organic matter content. In addition, the partition algorithm speeds up the processing of image spectral information and may greatly reduce the calculation time.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a curve showing the variation of the average value of soil organic matter with spectral bands in different partition algorithms of Landsat-8 of the present application.
FIG. 2 is a curve showing the variation of the average value of soil organic matter with spectral bands in different partition algorithms of Sentinel-2 of the present application.
FIG. 3 is a diagram showing the verification accuracy of the regression model of soil organic matter in different regions of Landsat-8 according to the embodiment of the present application.
FIG. 4 is a diagram showing the verification accuracy of the regression model of soil organic matter in different regions of Sentinel-2 according to the embodiment of the present application.
FIG. 5 is a flowchart of the spatial prediction and evaluation method of soil organic matter content based on partition algorithm of the present application.
DETAILED DESCRIPTION OF THE EMBODIMENTS
In the following, the technical scheme in the embodiments of the application is described clearly and completely with reference to the attached drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments in the present application, without creative labor, fall within the scope of protection of the present application.
Embodiment 1
A spatial prediction and evaluation method of soil organic matter content based on a partition algorithm, specifically including following steps: step A, collecting and preprocessing soil sample data; step B, collecting and preprocessing of Landsat-8 images; step C, constructing a spectral index; and step D, constructing a soil organic matter prediction model based on a partition algorithm.
Collecting and preprocessing soil sample data in the step A is to divide a target agricultural land into a plurality of 30 meters × 30 meters grids, collect 5-6 sub-samples in each grid and mix them into one mixed sample, and record the central position of each grid with a hand-held Global Positioning System (GPS) receiver; the sampling points cover all soil types in the target agricultural land; each mixed sample is naturally air-dried, cleared of impurities, ground and passed through a sieve with a 2 millimeter mesh, and the total organic carbon (TOC) content of the sample is then determined by a method such as the potassium dichromate method.
Collecting and preprocessing images in the step B includes selecting atmospherically corrected surface reflectance (SR) images of Landsat-8 of the target agricultural land in all bare soil periods as original images through the Google Earth Engine (GEE) platform. The multi-period SR images are synthesized to obtain relatively stable soil pixels from the synthesized images to predict the organic matter content. The Landsat-8 data are stored in the USGS Landsat-8 SR database in the GEE platform and include 5 visible light and VNIR bands, 2 SWIR bands and 1 thermal infrared band; the bands are atmospherically corrected by the LaSRC algorithm, and the spatial resolution is 30 meters; the pixel_QA band of the Landsat-8 SR product is then used as a cloud mask to generate cloud-free Landsat-8 images for the bare soil period, and all the Landsat-8 SR images in the bare soil period are synthesized, so that relatively stable soil pixels may be obtained to predict the organic matter content.
Constructing a spectral index in the step C includes: constructing the normalized difference index (NDI), ratio index (RI) and difference index (DI) to predict the soil organic matter content; for multispectral images with only a few spectral bands, NDI, RI and DI may provide more information than other indexes.
NDI, RI and DI are calculated by the following formulae:
$$\mathrm{NDI} = \frac{P_i - P_j}{P_i + P_j}, \qquad \mathrm{RI} = \frac{P_i}{P_j}, \qquad \mathrm{DI} = P_i - P_j$$
wherein $P_i$ represents the reflectivity of the i-th band and $P_j$ represents the reflectivity of the j-th band.
Constructing a soil organic matter prediction model based on a partition algorithm in the step D includes:
(1) The partition algorithm is based on a local regression model of soil type partition. According to the results of the soil survey in China, the target agricultural land is divided into different soil types, and the organic matter content of each soil type is predicted. By combining the predicted results with the measured soil results from the step A, a local regression model based on soil type partition is obtained.
(2) The partition algorithm is based on a local regression model of K-means partition, which uses the cascade simple K-means algorithm built into the GEE platform to select the best partition of the data. The synthetic image to be segmented and the measured soil sample data from the step A are input into GEE as training samples, and the soil organic matter content in the different partitions is predicted. The local regression model based on K-means partition is obtained by combining the predicted results with the measured soil results.
(3) The soil organic matter prediction model is a random forest regression model, which is calculated by the RF algorithm built into the GEE platform. All bands and spectral indexes of Landsat-8 are taken as independent variables and the soil organic matter content is defined as the dependent variable; bootstrap sampling is used to randomly select a certain number of samples from the original soil data set to produce a new training data set; each tree model is then established, the optimal number of segmentation nodes is determined from the error, and the average of the predictions of all the trees is the final predicted value.
(4) Checking the model. Taking 75% of the soil samples as training samples and 25% as verification samples, the prediction accuracy of the soil organic matter estimation model is evaluated by the coefficient of determination (R²) and root mean square error (RMSE) between the predicted and measured values. The greater R² and the smaller RMSE, the higher the prediction accuracy of the soil organic matter prediction model. R² and RMSE are calculated as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
where $n$ is the number of samples, $y_i$ is the measured value of soil organic matter for sample i, $\hat{y}_i$ is the value of soil organic matter predicted by the model for sample i, and $\bar{y}$ is the mean of the measured values.
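The bootstrap-and-average procedure of item (3) can be illustrated with a small pure-Python stand-in. It is not the RF algorithm built into GEE: the tree depth limit, the number of trees, the size of the random feature subset and the sample rows are all assumptions chosen to keep the sketch short.

```python
import random


def mean(values):
    return sum(values) / len(values)


def best_split(rows, feature_ids):
    """Find the (feature, threshold) that minimises the summed squared error."""
    best = None
    for f in feature_ids:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            ml, mr = mean(left), mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold)
    return best


def build_tree(rows, rng, max_depth=3, min_rows=2):
    """Grow one regression tree; leaves store the mean target value."""
    targets = [y for _, y in rows]
    if max_depth == 0 or len(rows) < min_rows or len(set(targets)) == 1:
        return mean(targets)
    n_features = len(rows[0][0])
    # Random feature subset per split (about one third, at least one feature).
    subset = rng.sample(range(n_features), max(1, n_features // 3))
    split = best_split(rows, subset)
    if split is None:
        return mean(targets)
    _, f, threshold = split
    left = [r for r in rows if r[0][f] <= threshold]
    right = [r for r in rows if r[0][f] > threshold]
    return (f, threshold,
            build_tree(left, rng, max_depth - 1, min_rows),
            build_tree(right, rng, max_depth - 1, min_rows))


def predict_tree(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if x[f] <= threshold else right
    return tree


def fit_forest(rows, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(rows) for _ in rows]
        forest.append(build_tree(bootstrap, rng))
    return forest


def predict_forest(forest, x):
    """Final prediction is the average over all trees."""
    return mean([predict_tree(tree, x) for tree in forest])


# Hypothetical rows: (feature tuple, measured SOM in g/kg).
rows = [((0.0,), 10.0)] * 5 + [((1.0,), 20.0)] * 5
forest = fit_forest(rows)
low = predict_forest(forest, (0.0,))
high = predict_forest(forest, (1.0,))
```

Because every leaf stores a mean of training targets, the averaged prediction always lies within the range of the measured values.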
Embodiment 2
A spatial prediction and evaluation method of soil organic matter content based on a partition algorithm, specifically including following steps: step A, collecting and preprocessing soil sample data; step B, collecting and preprocessing of Sentinel-2 images; step C, constructing a spectral index; and step D, constructing a soil organic matter prediction model based on a partition algorithm.
Collecting and preprocessing soil sample data in the step A is to divide a target agricultural land into a plurality of 30 meters × 30 meters grids, collect 5-6 sub-samples in each grid and mix them into one mixed sample, and record the central position of each grid with a hand-held GPS receiver; the sampling points cover all soil types in the target agricultural land; each mixed sample is naturally air-dried, cleared of impurities, ground and passed through a sieve with a 2 millimeter mesh, and the total organic carbon (TOC) content of the sample is then determined by a method such as the potassium dichromate method.
Collecting and preprocessing images in the step B includes selecting atmospherically corrected surface reflectance (SR) images of Sentinel-2 of the target agricultural land in all bare soil periods as original images through the Google Earth Engine (GEE) platform. The multi-period SR images are synthesized to obtain relatively stable soil pixels from the synthesized images to predict the organic matter content. The Sentinel-2 data are stored in the Sentinel-2 MSI Level-2A database in the GEE platform and include 4 VNIR bands, 2 SWIR bands and 4 red edge bands; the bands are atmospherically corrected by the Sen2Cor algorithm, the spatial resolution of the VNIR bands is 10 meters, and the spatial resolution of the SWIR bands and the red edge bands is 20 meters; the QA60 band of the Sentinel-2 SR product is then used as a cloud mask to generate cloud-free Sentinel-2 images for the bare soil period, and all the Sentinel-2 SR images in the bare soil period are synthesized, so that relatively stable soil pixels may be obtained to predict the organic matter content.
Constructing a spectral index in the step C includes: constructing spectral indexes such as the normalized difference index (NDI), ratio index (RI) and difference index (DI) to predict the soil organic matter content; for multispectral images with only a few spectral bands, NDI, RI and DI may provide more information than other indexes. NDI, RI and DI are calculated by the following formulae:
$$\mathrm{NDI} = \frac{P_i - P_j}{P_i + P_j}, \qquad \mathrm{RI} = \frac{P_i}{P_j}, \qquad \mathrm{DI} = P_i - P_j$$
where $P_i$ represents the reflectivity of the i-th band and $P_j$ represents the reflectivity of the j-th band; the required spectral indexes are generated for the Sentinel-2 synthetic images by using the above formulae.
Constructing a soil organic matter prediction model based on a partition algorithm in the step D includes:
(1) The partition algorithm is based on a local regression model of soil type partition. According to the results of the soil survey in China, the target agricultural land is divided into different soil types, and the organic matter content of each soil type is predicted. By combining the predicted results with the measured soil results from the step A, a local regression model based on soil type partition is obtained.
(2) The partition algorithm is based on the local regression model of K-means partition, which uses the cascade simple K-means algorithm built in GEE platform to select the best partition data. The synthetic image to be segmented and the measured soil sample data in the step A are input into GEE as training samples, and the soil organic matter content in different partitions is predicted. The local regression model based on K-means partition is obtained by combining the predicted results with the measured soil results.
(3) The soil organic matter prediction model is random forest regression model, which is calculated by RF algorithm built in GEE platform. Taking all bands and spectral indexes of Landsat-8 as independent variables, the content of soil organic matter is defined as dependent variables; using bootstrap to randomly select a certain number of samples from the original soil data set to produce a new training data set; then, establishing each tree model, and determining an optimal number of segmentation nodes by an error,and predicted average values of all the trees are final predicted values.
(4) Check the model. Taking 75% of the soil samples as training samples and 25% as verification samples, the prediction accuracy of the soil organic matter estimation model is evaluated by the coefficient of determination (R²) and the root mean square error (RMSE) between the predicted values and the measured values of the model. The greater R² and the smaller RMSE, the higher the prediction accuracy of the soil organic matter prediction model. R² and RMSE are calculated as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2},$$

where $n$ is the number of samples, $y_i$ is the measured value of soil organic matter of sample i, $\hat{y}_i$ is the value of soil organic matter predicted by the model for sample i, and $\bar{y}$ is the mean of the measured values.
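The accuracy check in (4) can be sketched in Python as follows. This is a minimal example under stated assumptions: the 75%/25% split is done by a seeded random shuffle, which the description does not specify, and the function names and the measured and predicted values are hypothetical.

```python
import math
import random


def train_verification_split(samples, train_fraction=0.75, seed=0):
    """Split samples into training and verification subsets (75%/25%).

    A seeded random shuffle is assumed here; the description does not
    state how the samples are assigned to the two subsets.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    ss_tot = sum((y - mean_measured) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    squared_error = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    return math.sqrt(squared_error / n)


# Hypothetical measured and predicted organic matter contents
# for four verification samples.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
accuracy = {"R2": r_squared(measured, predicted), "RMSE": rmse(measured, predicted)}
```

Both metrics are computed on the verification samples only, so that they reflect how well the model generalizes to locations it was not trained on.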
Although the present application has been described in detail with reference to the foregoing embodiments, it is still possible for a person skilled in the art to modify the technical scheme described in the foregoing embodiments or to replace some technical features by equivalents. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should fall in the protection scope of the present application.
Claims (9)
- 1. A spatial prediction and evaluation method of soil organic matter content based on a partition algorithm, comprising the following steps: step A, collecting and preprocessing soil sample data; step B, collecting and preprocessing images; step C, constructing a spectral index; and step D, constructing a soil organic matter prediction model based on the partition algorithm.
- 2. The spatial prediction and evaluation method of soil organic matter content based on a partition algorithm according to claim 1, wherein collecting and preprocessing soil sample data in the step A is to divide a target agricultural land into a plurality of 30 meter × 30 meter grids, collect 5-6 sub-samples in each grid and mix the sub-samples into a mixed sample, and record a central position of each grid by a hand-held global positioning system (GPS); sampling points comprise all soil types in the target agricultural land; natural air drying and impurity removal are adopted for each mixed sample; and the samples are ground and passed through a screen with a mesh diameter of 2 millimeters, and then an organic matter content in the sample is determined by a potassium dichromate method.
- 3. The method for spatial prediction and evaluation of soil organic matter content based on a partition algorithm according to claim 1, wherein collecting and preprocessing images in the step B comprises selecting atmospherically corrected surface reflectance (SR) maps of Landsat-8 or Sentinel-2 of the target agricultural land in all bare soil periods as original images through the Google Earth Engine (GEE) platform; and synthesizing multi-period SR images, and obtaining relatively stable soil pixels from the synthesized images to predict the organic matter content; wherein (1) the Landsat-8 data are stored in a USGS Landsat-8 SR database in the GEE platform, comprising 5 visible light and VNIR bands, 2 SWIR bands and 1 thermal infrared band; the bands are atmospherically corrected by a LaSRC algorithm, and a spatial resolution is 30 meters; then a pixel QA band of a Landsat-8 SR product is used as a cloud mask to generate a Landsat-8 image without a cloud cover in the bare soil period, and all the Landsat-8 SR images in the bare soil period are synthesized, so that the relatively stable soil pixels may be obtained to predict the organic matter content; and (2) the Sentinel-2 data are stored in a Sentinel-2 MSI Level-2A database in the GEE platform, comprising 4 VNIR bands, 2 SWIR bands and 4 red edge bands; the bands are corrected by a Sen2Cor algorithm, a spatial resolution of the VNIR bands is 10 meters, and the spatial resolutions of the SWIR bands and the red edge bands are all 20 meters; then a QA60 band of a Sentinel-2 SR product is used as the cloud mask to generate Sentinel-2 images with no cloud cover in the bare soil period; and the relatively stable soil pixels may be obtained to predict the organic matter content by synthesizing all the Sentinel-2 SR images in the bare soil period.
- 4. The method for spatial prediction and evaluation of soil organic matter content based on a partition algorithm according to claim 1, wherein constructing a spectral index in the step C comprises: constructing the spectral indexes of normalized difference index (NDI), ratio index (RI) and difference index (DI) to predict the soil organic matter content; NDI, RI and DI may provide more information than other indexes for multispectral images with few spectral bands; NDI, RI and DI are calculated based on the following formulae: $\mathrm{NDI} = (P_i - P_j)/(P_i + P_j)$, $\mathrm{RI} = P_i/P_j$, $\mathrm{DI} = P_i - P_j$; wherein $P_i$ represents a reflectivity of the i-th band and $P_j$ represents a reflectivity of the j-th band, and the required spectral index is generated for Landsat-8 synthetic images or Sentinel-2 synthetic images by using the above formulae.
- 5. The method for spatial prediction and evaluation of soil organic matter content based on a partition algorithm according to claim 1, wherein, in constructing a soil organic matter prediction model based on the partition algorithm in the step D, (1) the partition algorithm may be a local regression model based on soil type partition; (2) the partition algorithm may be a local regression model based on K-means partition; and (3) the soil organic matter prediction model is a random forest regression model.
- 6. A construction of soil organic matter prediction model based on the partition algorithm according to claim 5, wherein the (1) comprises dividing the target agricultural land into different soil types according to results of soil survey in China based on the local regression model of the soil type partition, predicting the organic matter content of different types of soil, and combining predicted results with the measured soil results in the step A to obtain the local regression model based on the soil type partition.
- 7. The construction of soil organic matter prediction model based on the partition algorithm according to claim 5, wherein the (2) comprises using a cascade simple K-means algorithm built in GEE platform to select best partition data; inputting the synthetic images to be segmented and measured soil sample data in the step A into the GEE as training samples, predicting the soil organic matter content in different partitions, combining the predicted results with the measured soil results, and obtaining the local regression model based on the K-means partition.
- 8. The construction of soil organic matter prediction model based on the partition algorithm according to claim 5, wherein the (3) comprises calculating by using RF algorithm built in the GEE platform; taking all bands and spectral indexes of Landsat-8 or Sentinel-2 as independent variables, and defining the content of soil organic matter as dependent variables, using bootstrap to randomly select a certain number of samples from the original soil data set to produce a new training data set; and then establishing each tree model, and determining an optimal number of segmentation nodes by an error, and determining predicted average values of all the trees as final predicted values.
- 9. The construction of the soil organic matter prediction model based on the partition algorithm according to claim 5, wherein an accuracy verification of the soil organic matter prediction model based on the partition algorithm is to evaluate a prediction accuracy of a soil organic matter estimation model by taking 75% of the soil samples in the step A as the training samples and 25% as verification samples, and by a coefficient of determination (R²) and a root mean square error (RMSE) between the predicted values and the measured values of the model; the greater R² and the smaller RMSE, the higher the prediction accuracy of the soil organic matter prediction model; and R² and RMSE are calculated based on the following formulae: $R^2 = 1 - \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 / \sum_{i=1}^{n}(y_i - \bar{y})^2$ and $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$; wherein $n$ is a number of samples, $y_i$ is the measured value of soil organic matter of a sample i, $\hat{y}_i$ is a value of soil organic matter predicted by the model for the sample i, and $\bar{y}$ is the mean of the measured values.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210757738.7A CN115128013A (en) | 2022-06-30 | 2022-06-30 | Soil organic matter content space prediction evaluation method based on partition algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
GB202305398D0 (en) | 2023-05-24 |
GB2620469A (en) | 2024-01-10 |
Family
ID=83382206
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB2305398.6A Pending GB2620469A (en) | 2022-06-30 | 2023-04-12 | Spatial prediction and evaluation method of soil organic matter content based on partition algorithm |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115128013A (en) |
GB (1) | GB2620469A (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117668476A (en) * | 2023-12-07 | 2024-03-08 | 电子科技大学 | Soil carbonate prediction method based on near infrared spectrum and migration learning |
CN117688478B (en) * | 2023-12-11 | 2024-09-24 | 中国科学院地理科学与资源研究所 | Multi-cycle classification-based farmland soil salinity inversion method |
CN117829376B (en) * | 2024-02-29 | 2024-07-19 | 中国科学院地理科学与资源研究所 | Evaluation method and system for sustainable utilization of cultivated land resources in black soil area |
CN117852775B (en) * | 2024-03-05 | 2024-05-28 | 中国科学院地理科学与资源研究所 | Assessment method for karst carbon sink potential and related equipment |
CN118279431B (en) * | 2024-06-04 | 2024-08-23 | 中国农业科学院农业资源与农业区划研究所 | Crop mapping method and system with large area and low sample dependence |
CN118350718B (en) * | 2024-06-13 | 2024-08-27 | 武汉雷特科技有限公司 | Ecological soil quality management method and system based on GIS and automation technology |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2020104068A4 (en) * | 2020-12-14 | 2021-02-25 | Shihezi University | Method for zone-based management of soil nutrients of cultivated land based on geographic information system (gis) and remote sensing (rs) |
US20210209803A1 (en) * | 2020-01-06 | 2021-07-08 | Quantela Inc | Computer-based method and system for geo-spatial analysis |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210209803A1 (en) * | 2020-01-06 | 2021-07-08 | Quantela Inc | Computer-based method and system for geo-spatial analysis |
AU2020104068A4 (en) * | 2020-12-14 | 2021-02-25 | Shihezi University | Method for zone-based management of soil nutrients of cultivated land based on geographic information system (gis) and remote sensing (rs) |
Non-Patent Citations (2)
Title |
---|
"Estimation of soil organic matter content based on CARS algorithm coupled with random forest", Liu J, Dong Z, Xia J, Wang H, Meng T, Zhang R, Han J, Wang N, Xie J, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2021-04-20, ELSEVIER, AMSTERDAM, NL, Vol 258, Article No: 119823 * |
"Spatial prediction of soil organic matter content using multiyear synthetic images and partitioning algorithms", Luo Chong, Wang Yiang, Zhang Xinle, Zhang Wenqi, Liu Huanjun, Catena, 2022-01-17, ELSEVIER, AMSTERDAM, NL, Vol 211, Article No: 106023 * |
Also Published As
Publication number | Publication date |
---|---|
GB202305398D0 (en) | 2023-05-24 |
CN115128013A (en) | 2022-09-30 |