CN115310719B - Farmland soil sampling scheme design method based on three-stage k-means - Google Patents
- Publication number
- CN115310719B CN202211125514.0A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/02—Agriculture; Fishing; Forestry; Mining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/188—Vegetation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention discloses a farmland soil sampling scheme design method based on three-stage k-means, which comprises the following steps: S1, based on DEM data, forming farmland sub-regions with spatial diversity by using the k-means method and the geographical detector method, and allocating the soil sampling number of each farmland sub-region in proportion to its area; S2, based on NDVI data, subdividing the interior of each farmland sub-region obtained by the first-stage k-means by using the local coefficient of variation (CV) and the k-means method to form a series of sub-patches with similar local spatial variation levels, and obtaining the soil sampling number of each sub-patch according to its area and local spatial variation level; and S3, based on remote sensing yield estimation data, determining the spatial positions of the representative soil samples by using the k-means method and variance statistics. The method jointly considers the spatial diversity and the local spatial variation level of the soil, and allows a farmland soil sampling scheme to be designed more scientifically and reasonably so as to obtain soil spatial data of good quality.
Description
Technical Field
The invention belongs to the technical field of geographic information systems, and discloses a farmland soil sampling scheme design method based on three-stage k-means.
Background
China feeds nearly 20% of the world's population with about 7% of the world's cultivated land. In recent years, however, China has encountered many problems in the course of modern agricultural development (high agricultural production costs, unreasonable use of chemical fertilizers and pesticides, a low degree of agricultural mechanization, soil compaction, and the like), so that the conflict between population growth on the one hand and environmental safety and food security on the other has become increasingly prominent. Precision agriculture is a new trend in modern agriculture, a key means of shifting modern agriculture from a resource-input model to a science-and-technology-driven model, and a hallmark of the integration of agricultural informatization, mechanization, and modernization; it helps to resolve this conflict. The state has also paid great attention to the development of precision agriculture. In recent years, as urbanization has accelerated, many farmers have moved to cities for work, leaving rural land idle; intensive management of rural land has therefore become a trend of the new era, which provides a new opportunity for the smooth development and implementation of precision agriculture. Precision agriculture specifically means that agricultural inputs are adjusted reasonably according to farmland soil nutrients, the soil nutrients required for the growth of the planted crops, and the target yield, so that, while meeting the growth needs of the crops, the utilization rate of agricultural inputs and agricultural productivity are improved, resources are saved, and the agricultural environment is protected.
Soil is an important component of the earth's surface system and has spatial diversity. The spatial mismatch between soil nutrient supply and crop growth requirements is a fundamental factor affecting crop yield. A thorough understanding of the spatial diversity of farmland soil nutrients therefore helps to achieve the development goals of precision agriculture. At present, the main way to determine the spatial distribution of soil nutrients is to collect soil samples in the field and determine their nutrient content by physical and chemical analysis in a laboratory. However, large-scale, intensive collection of soil samples consumes a great deal of manpower, material resources, and money. Soil sampling is a scientific method that acquires nutrient information at a small number of key soil sampling points and then estimates the spatial distribution of farmland soil nutrients with a soil prediction model, so that the sampling cost and the expected soil prediction accuracy can be well balanced. Different sampling schemes yield different spatial distribution characteristics of farmland soil nutrients. Studying and selecting a reasonable soil sampling method is therefore a strong guarantee for obtaining a reliable spatial distribution of farmland soil nutrients.
At present, soil sampling methods are mainly divided into design-based sampling methods and model-based sampling methods. In design-based soil sampling, the optimal sample size is determined quantitatively from prior knowledge, and the sample layout is then determined by means such as random sampling, systematic sampling, and stratified sampling. For example, Wang Jinfeng et al. proposed a "sandwich" spatial sampling model that is based on stratified sampling and takes spatial diversity into account. In addition, some studies use environmental variables to assist in designing the sampling scheme, based on the covariation between soil and environmental variables, so as to improve soil sampling efficiency. Model-based soil sampling is based on geostatistical theory and assumes that a realization can be fitted accurately to the population through some calibration. It designs the optimal number and spatial distribution pattern of sampling points by minimizing the model estimation variance. For example, Wang Shahua et al. combined fuzzy set theory with sampling inspection theory and proposed a two-stage sampling model based on spatial data quality inspection. In addition to the soil sampling methods described above, recent research directions include sample supplementation methods based on spatial estimation uncertainty, sampling methods that consider accessibility, and the like.
Compared with design-based soil sampling methods, model-based soil sampling methods can fully exploit the structural information of the soil in the study area and thereby capture more soil information at key points. However, most model-based soil sampling methods are based on geostatistical theory, and designing a sampling scheme with such methods depends on a variogram, which can be known only after sampling or must be modeled from earlier soil data of the study area. Because of strong interference from human activities, the spatial distribution of soil in agricultural areas may differ greatly between years. In this case, the variogram cannot be modeled from earlier soil data of the agricultural area. Design-based soil sampling methods may therefore be more suitable for precision agriculture. Different crop types absorb different soil nutrients, and the complex planting structure and artificial fertilization management of agricultural areas give the soil in the area a certain local spatial diversity. A new design-based soil sampling method is therefore urgently needed that takes the local spatial diversity of the soil in agricultural areas into account to identify and obtain soil information at key points, and thereby improves the accuracy of digital soil mapping for precision agriculture.
Disclosure of Invention
In view of the above technical problems, the invention provides a farmland soil sampling scheme design method based on three-stage k-means. The method mines environmental information closely related to the soil and uses a three-stage k-means method to infer the local spatial diversity of the soil, determine the spatial positions of representative samples, and form a farmland soil sampling scheme. The first-stage k-means processing constructs farmland sub-regions with spatial diversity and reasonably allocates the soil sampling number; the second-stage k-means processing forms sub-patches with similar local spatial variation levels of the soil and obtains the sampling number of each sub-patch; the third-stage k-means processing determines the spatial positions of the representative samples.
A farmland soil sampling scheme design method based on three-stage k-means comprises the following steps:
S1, based on DEM data, forming farmland sub-regions with spatial diversity by using the k-means method and the geographical detector method, and allocating the soil sampling number of each farmland sub-region in proportion to its area, thereby realizing the first-stage k-means processing;
S1 specifically comprises the following substeps:
S11, constructing farmland sub-regions with spatial diversity by using the k-means method:
dividing the farmland based on DEM data by using the k-means method to form farmland sub-regions whose terrain conditions are similar within each region and different between regions, that is, sub-regions having spatial diversity;
S12, determining the optimal number of farmland sub-regions by using the geographical detector:
based on DEM data, detecting the spatial differentiation of the farmland sub-regions for different numbers of sub-regions by using the Q value of the geographical detector to obtain a curve of the Q value, and selecting the number corresponding to the inflection point as the optimal number of farmland sub-regions;
S13, allocating the soil sampling number of each farmland sub-region:
the design of a soil sampling scheme first determines the sample size and then the sample positions; the area of each farmland sub-region is calculated from the division result, and the soil sampling number of each farmland sub-region is allocated with the area as the weight.
S2, based on NDVI data, subdividing the interior of each farmland sub-region obtained by the first-stage k-means by using the local coefficient of variation (CV) and the k-means method to form a series of sub-patches with similar local spatial variation levels, and obtaining the soil sampling number of each sub-patch according to its area and local spatial variation level, thereby realizing the second-stage k-means processing;
S2 specifically comprises the following substeps:
S21, inferring the local spatial variation level of the farmland soil:
selecting NDVI data, which are closely related to crop growth, and inferring the local spatial variation level of the farmland soil by using the local coefficient of variation CV, which provides data support for the further allocation of soil sampling points;
S22, generating sub-patches with similar local spatial variation levels of the soil:
based on the calculated local spatial variation level of the farmland soil, forming sub-patches with similar local spatial variation levels of the soil inside each farmland sub-region by using the k-means method; the number of clusters in each farmland sub-region is set to half of the soil sampling number of that sub-region;
S23, obtaining the sampling number of each sub-patch:
based on the sub-patch division result, calculating the area and the local coefficient of variation CV of each sub-patch, and further allocating soil sampling points with the area and the local coefficient of variation CV as weights, thereby obtaining the sampling number of each sub-patch and providing quantitative support for determining the spatial positions of the soil samples.
S3, based on remote sensing yield estimation data, determining the spatial positions of the representative soil samples by using the k-means method and variance statistics, thereby realizing the third-stage k-means processing and completing the design of the farmland soil sampling scheme.
S3 specifically comprises the following substeps:
S31, forming subsets with similar crop yield levels inside the sub-patches:
based on remote sensing yield estimation data of the crops, forming subsets with similar crop yield levels inside each sub-patch by using the k-means method; the number of clusters in each sub-patch equals the sampling number of that sub-patch;
S32, determining the spatial positions of the representative samples:
after the above process, each subset has similar terrain conditions, local soil variation level, and crop yield level; to determine the spatial positions of the representative samples, an expected variance is calculated; then one sampling point is randomly selected in each subset to form a sampling point set, and the sampling point set whose variance is closest to the expected variance is taken as the sampling points, thereby determining the spatial positions of the representative samples and forming the farmland soil sampling scheme.
The method jointly considers the spatial diversity and the local spatial variation level of the soil, so that a farmland soil sampling scheme can be designed more scientifically and reasonably to obtain soil spatial data of good quality. From these data an accurate spatial distribution of farmland soil can be produced, which provides technical and data support for precision agriculture decisions such as variable-rate fertilization and planting structure adjustment.
Drawings
FIG. 1 is the general technical flow chart of the invention;
FIG. 2 shows the study area of the embodiment;
FIG. 3 shows the results of the first-stage k-means processing of the embodiment;
FIG. 4 shows the calculation result of the local coefficient of variation CV and the results of the second-stage k-means processing of the embodiment;
FIG. 5 shows the results of the third-stage k-means processing and the soil sampling scheme of the embodiment;
FIG. 6 is a comparison of different sampling methods: (a) three-stage k-means sampling of the invention, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling;
FIG. 7 shows the soil SOM attribute mapping and error distribution for different sampling methods: (a) three-stage k-means sampling of the invention, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling.
Detailed description of the preferred embodiment
The embodiments of the present invention will be described with reference to the accompanying examples.
As shown in FIG. 1, the embodiment provides a farmland soil sampling scheme design method based on three-stage k-means; the study area of the embodiment is shown in FIG. 2. The method comprises the following steps:
s1, constructing a farmland subregion with spatial diversity, and reasonably distributing the soil sampling quantity (k-means in the first stage)
S11, constructing a farmland subregion with space diversity by using a k-means method
The ultimate purpose of soil sampling is to capture and express the spatial distribution of the soil in the study area with a small number of soil samples. Soil, as a type of geographic data, has spatial diversity. Spatial diversity, also known as spatial stratified heterogeneity, means that an attribute value differs between different types or regions, as in land use maps, climate zones, ecological zones, geographic zones, and so forth. Some scholars divide the study area into sub-areas with spatial diversity by means of environmental variables closely related to the soil, to assist in designing soil sampling schemes. Among these variables, terrain is one of the most commonly used. In view of this, in the present embodiment, the farmland is divided by using the k-means method based on the DEM data to form farmland sub-regions whose terrain conditions are similar within each region and different between regions (that is, sub-regions having spatial diversity).
The k-means method is an iterative clustering algorithm. The number of groups K is fixed in advance, K objects are randomly selected as initial cluster centers, the distance between each object and each cluster center is calculated, and each object is assigned to the nearest cluster center. A cluster center together with the objects assigned to it represents a cluster. Each time a sample is assigned, the center of its cluster is recalculated from the objects currently in the cluster. This process is repeated until a termination condition is met.
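The clustering procedure described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patent's implementation: it uses the common batch (Lloyd) variant, in which centers are recomputed once per pass rather than after every single assignment, and the function name `kmeans`, the `seed` parameter, and the sample elevations are assumptions made for the example.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Batch (Lloyd) k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # k random objects as initial centers
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:                      # keep the old center if a cluster empties
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:             # termination: assignments stopped changing
            break
        labels = new_labels
    return labels, centers

# Hypothetical DEM elevations (metres), one 1-tuple per raster cell.
elevations = [(101.0,), (102.5,), (99.8,), (150.2,), (149.0,), (151.7,)]
labels, centers = kmeans(elevations, k=2)
```

For the first stage, each point would be the elevation of one farmland raster cell; the same routine could be reused on CV values in the second stage and on yield values in the third.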
S12, determining the optimal number of farmland subregions by utilizing a geographic detector
Before the k-means method is used to obtain farmland sub-regions with spatial diversity, the number of clusters must be specified. Different numbers of clusters lead to different degrees of spatial differentiation among the farmland sub-regions. The Q statistic of the geographical detector can be used to identify, test, and attribute spatial differentiation.
The geographical detector is a method for detecting spatial stratified heterogeneity and revealing the driving forces behind it. Its core idea is that if an independent variable has an important influence on a dependent variable, the spatial distribution of the independent variable should be similar to that of the dependent variable; the geographical detector is thus a tool for detecting and exploiting the spatial differentiation of geographic phenomena. The geographical detector includes diversity and factor detection, interaction detection, ecological detection, and risk area detection. The goal of diversity and factor detection is to detect the spatial diversity of the dependent variable and to determine how much of that spatial diversity an independent variable explains. The goal of interaction detection is to identify interactions between different independent variables, that is, to evaluate whether factors acting together strengthen or weaken their explanatory power for the dependent variable.
In this method, the diversity and factor detection (Q statistic) of the geographical detector is selected to analyze the spatial diversity of the farmland sub-regions. The model is as follows:
$$Q = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2}$$
where $h = 1, \dots, L$ indexes the classes or partitions of the farmland (the farmland sub-regions); $N_h$ and $N$ are the areas of class $h$ and of the whole region, respectively; $\sigma_h^2$ and $\sigma^2$ are the variances of the DEM variable within class $h$ and over the whole region, respectively. $Q \in [0, 1]$; the larger the value, the more obvious the spatial differentiation.
In this embodiment, based on DEM data, the spatial differentiation conditions of the field sub-regions in different numbers are detected by using the Q values in the geographic detector, a variation curve of the Q values is drawn, and the number corresponding to the inflection point is selected as the optimal number of the field sub-regions.
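As an illustration of this step, the sketch below computes the Q statistic for one partition and picks an elbow on a Q curve. It assumes equal-area raster cells, so that cell counts stand in for areas, and the elbow rule (largest drop in marginal gain) is only one simple heuristic chosen for the example; the source does not say how the inflection point is located. The function names are hypothetical.

```python
from statistics import pvariance

def q_statistic(values, labels):
    """Geographical detector Q = 1 - sum_h(N_h * var_h) / (N * var)."""
    total_var = pvariance(values)
    if total_var == 0:
        return 0.0                           # no variation to explain
    strata = {}
    for v, lab in zip(values, labels):
        strata.setdefault(lab, []).append(v)
    within = sum(len(vs) * pvariance(vs) for vs in strata.values())
    return 1.0 - within / (len(values) * total_var)

def pick_elbow(ks, qs):
    """Return the k after which the gain in Q falls off most sharply.
    Assumed heuristic; needs at least three (k, Q) pairs."""
    gains = [qs[i] - qs[i - 1] for i in range(1, len(qs))]
    drops = [gains[i - 1] - gains[i] for i in range(1, len(gains))]
    return ks[1 + drops.index(max(drops))]

# Hypothetical Q curve for k = 2..5 sub-regions.
best_k = pick_elbow([2, 3, 4, 5], [0.50, 0.80, 0.85, 0.87])
```

In practice the Q curve would be built by running the first-stage clustering for each candidate number of sub-regions and evaluating `q_statistic` on the DEM values with the resulting labels.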
S13, reasonably distributing the soil sampling quantity of each farmland subregion
Farmland sub-regions with spatial diversity are the fundamental basis for reasonably allocating the soil sampling number. First, the total sample size of the farmland soil sampling scheme is determined according to the required soil mapping accuracy or the budget limit; then, the area of each farmland sub-region is calculated, and the soil sampling number of each farmland sub-region is allocated with the area as the weight. The formula is as follows:
$$SN_h = SN \times \frac{A_h}{\sum_{h} A_h}$$
where $SN$ is the total sample size of the farmland soil sampling scheme, $A_h$ is the area of farmland sub-region $h$, and $SN_h$ is the soil sampling number of farmland sub-region $h$.
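A small sketch of the area-weighted allocation follows. Because proportional shares are rarely whole numbers, some rounding rule is needed; the largest-remainder rule used here is an assumption for illustration, as is the function name `allocate_by_weight`.

```python
def allocate_by_weight(total, weights):
    """Split `total` samples in proportion to `weights`, returning integers
    that sum to `total` (largest-remainder rounding, an assumed rule)."""
    weight_sum = sum(weights)
    quotas = [total * w / weight_sum for w in weights]
    counts = [int(q) for q in quotas]        # floor of each proportional share
    leftover = total - sum(counts)
    # Hand the remaining samples to the largest fractional remainders.
    order = sorted(range(len(weights)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

# Hypothetical sub-region areas (hectares) and a total of 20 samples.
samples_per_region = allocate_by_weight(20, [50.0, 30.0, 20.0])  # -> [10, 6, 4]
```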
S2, forming sub-patches with similar local spatial variation levels of soil, and acquiring the sampling number of the corresponding sub-patches (second stage k-means)
S21 deducing local spatial variation level of farmland soil
More soil sampling points need to be placed in soil areas with strong local spatial variation in order to obtain an accurate soil spatial distribution. Farmland soil in different areas has different local spatial variation levels. Local soil conditions affect the growth of field crops, so the local condition of the soil can be inferred in reverse from the growth of the farmland crops. In this embodiment, NDVI data, which are closely related to crop growth, are selected, and the local spatial variation level of the farmland soil is inferred by using the local coefficient of variation CV, which provides data support for the further allocation of soil sampling points.
The local coefficient of variation CV is an index of variation expressed in relative form. It is obtained by dividing the range, the mean deviation, or the standard deviation by the mean; the standard deviation coefficient is the one commonly used. The coefficient of variation is applied when the levels of the two series being compared differ, in which case the range, mean deviation, or standard deviation cannot be compared directly because they are absolute indicators. The CV formula is as follows:
$$CV = \frac{1}{\bar{x}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}$$
where $x_i$ is one of the $n$ NDVI values within the calculation window and $\bar{x}$ is the mean NDVI within the window. The size of the calculation window needs to be determined according to the size of the study area.
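The moving-window CV can be sketched as below for an NDVI raster held as a list of rows. The population standard deviation, the default 3 x 3 window, and the truncation of the window at the raster edge are assumptions made for the example; the source only says that the window size depends on the study area.

```python
from statistics import mean, pstdev

def local_cv(grid, window=3):
    """Moving-window coefficient of variation (std / mean) of a 2-D grid.
    Edge cells use the part of the window that lies inside the grid."""
    half = window // 2
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [grid[i][j]
                    for i in range(max(0, r - half), min(rows, r + half + 1))
                    for j in range(max(0, c - half), min(cols, c + half + 1))]
            m = mean(vals)
            # CV is undefined for a zero mean; report 0.0 in that case.
            out[r][c] = pstdev(vals) / m if m != 0 else 0.0
    return out

# Hypothetical NDVI patch.
ndvi = [[0.61, 0.63, 0.60],
        [0.62, 0.35, 0.64],
        [0.60, 0.62, 0.61]]
cv_map = local_cv(ndvi, window=3)
```

Since NDVI can be close to zero over bare soil or water, very small window means inflate the CV; masking such cells before the calculation may be worth considering.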
S22 generating sub-patches with similar local spatial variation levels of soil
Within each field sub-area, there will be some differences in the local spatial variation levels of the soil. The soil region with high local spatial variation level needs to be provided with more soil sampling points so as to obtain an accurate soil spatial distribution result. To achieve this, the field sub-area needs to be further divided to generate sub-patches with similar local spatial variation levels of the soil.
In this embodiment, based on the calculation result of the local spatial variation level of the farmland soil, sub-patches with similar local spatial variation levels of the soil are formed inside the farmland sub-region by using the k-means method. In order to ensure that each sub-patch has a sampling point, the cluster number in each farmland sub-region is determined according to half of the soil sampling number of each farmland sub-region.
S23, acquiring sampling number of corresponding sub-patches
The larger the area of a sub-patch and the higher its local soil spatial variation level, the more sampling points it requires. In this embodiment, based on the sub-patch division result, the area and the local coefficient of variation CV of each sub-patch are calculated and used as weights to further allocate soil sampling points, thereby obtaining the sampling number of each sub-patch and providing quantitative support for determining the spatial positions of the soil samples. The sampling number of each sub-patch is determined as follows:
where $A_{hl}$ and $CV_{hl}$ are the area and the CV of sub-patch $l$ inside farmland sub-region $h$.
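The second-stage bookkeeping might look like the sketch below. The source states that the number of clusters is half of the sub-region's sampling number and that area and CV serve as weights, but the exact way the two weights are combined is not recoverable from the text; the product `area * CV`, the floor rounding, and the one-point-per-sub-patch guarantee are assumptions, and both function names are hypothetical.

```python
def subpatch_cluster_count(n_samples_h):
    """Half of the sub-region's sampling number, at least 1 (floor is assumed)."""
    return max(1, n_samples_h // 2)

def allocate_subpatch_samples(n_samples_h, areas, cvs):
    """Distribute a sub-region's samples over its sub-patches.
    Assumed weight: area * CV. Each sub-patch first receives one point."""
    k = len(areas)
    if n_samples_h < k:
        raise ValueError("need at least one sample per sub-patch")
    weights = [a * cv for a, cv in zip(areas, cvs)]
    if sum(weights) == 0:                    # all CVs zero: fall back to equal shares
        weights = [1.0] * k
    spare = n_samples_h - k                  # samples left after one per sub-patch
    quotas = [spare * w / sum(weights) for w in weights]
    counts = [1 + int(q) for q in quotas]
    leftover = n_samples_h - sum(counts)
    order = sorted(range(k), key=lambda i: quotas[i] - int(quotas[i]), reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

# Hypothetical sub-region with 10 samples.
k_sub = subpatch_cluster_count(10)
counts = allocate_subpatch_samples(10, areas=[4, 2, 2], cvs=[0.1, 0.2, 0.1])
```

Halving the sampling number when choosing the cluster count leaves room for every sub-patch to hold at least one point, which matches the rationale given in S22.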
FIG. 4 shows the results of the calculation of the local coefficient of variation CV and the second stage of k-means processing.
S3, determining the space position of the representative sample (the third stage k-means)
S31 creating subsets of similar crop yield levels within the sub-blobs
Within each sub-patch, the local spatial variation level of the soil is similar, but the local spatial distribution of the soil is not uniform. The climate conditions, temperature conditions, and disaster conditions within the same field are similar, so crop yield reflects the soil condition to a large extent.
With the development of remote sensing technology, rapid yield estimation for crops over large areas has become possible. The WOFOST model is the most commonly used method for remote sensing crop yield estimation. WOFOST (WOrld FOod STudies) is a dynamic, explanatory model developed jointly by Wageningen Agricultural University in the Netherlands and the Centre for World Food Studies (CWFS); it simulates the growth of annual crops under specific soil and climatic conditions. The model emphasizes quantitative land evaluation, regional yield forecasting, risk analysis, and the quantification of inter-annual yield variation and the effects of climate change. It is based on crop physiological and ecological processes such as assimilation, respiration, transpiration, and dry matter partitioning, and mainly simulates crop growth under potential growth conditions, water-limited conditions, and nutrient-limited conditions.
In this embodiment, based on GF-1 remote sensing satellite data, the WOFOST model is used for rapid crop yield estimation over a large area; then, subsets with similar crop yield levels are formed inside each sub-patch by using the k-means method based on the remote sensing yield estimation data of the crops. The number of clusters in each sub-patch equals the sampling number of that sub-patch.
S32, determining the spatial position of the representative samples
After the above processing, each subset has similar terrain conditions, local soil variation level, and crop yield level. To allocate soil sampling resources reasonably and determine the spatial positions of the representative samples, the expected variance is calculated first. The formula is as follows:
where the variance term is the variance of the remote sensing yield estimation data of crops within farmland sub-region h. Then, one sampling point is randomly selected from each subset to form a candidate sampling point set, and the candidate set whose variance is closest to the expected variance is sought and taken as the final sampling points, thereby determining the spatial positions of the representative samples and forming the farmland soil sampling scheme. FIG. 5 shows the results of the third-stage k-means processing and the soil sampling scheme of the embodiment.
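The search in S32 can be sketched as a repeated random draw. This is a minimal sketch under assumptions: the expected variance is passed in as a number (its formula is not reproduced here), the variance of a candidate set is taken as the population variance of the yield values of its points, and the number of trials and the name `pick_representative_set` are hypothetical.

```python
import random
import statistics


def pick_representative_set(subsets, expected_variance, trials=1000, seed=0):
    """Draw one point per subset repeatedly; keep the closest-variance set.

    `subsets` is a list of subsets, each a list of (x, y, yield) tuples.
    Returns (best_set, gap), where gap = |variance - expected_variance|.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_set, best_gap = None, float("inf")
    for _ in range(trials):
        # One randomly chosen candidate location from every subset.
        candidate = [rng.choice(subset) for subset in subsets]
        # ASSUMPTION: population variance of the candidates' yield values.
        variance = statistics.pvariance([point[2] for point in candidate])
        gap = abs(variance - expected_variance)
        if gap < best_gap:
            best_set, best_gap = candidate, gap
    return best_set, best_gap


if __name__ == "__main__":
    # Two subsets with two candidate pixels each (coordinates are made up).
    subsets = [
        [(0, 0, 1.0), (0, 1, 2.0)],
        [(1, 0, 3.0), (1, 1, 5.0)],
    ]
    chosen, gap = pick_representative_set(subsets, expected_variance=1.0)
    print(chosen, gap)
```

Because every subset contributes exactly one point, the chosen set always contains as many sampling points as there are subsets; only their positions change between trials.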
Different sampling methods were compared. The quantitative evaluation results of soil SOM mapping for the different sampling methods are shown in Table 1.
TABLE 1 Quantitative evaluation results of soil SOM mapping by different sampling methods
FIG. 6 shows, respectively, (a) three-stage k-means sampling, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling.
FIG. 7 shows the soil SOM attribute mapping and error distribution for the different sampling methods: (a) three-stage k-means sampling, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling.
Claims (1)
1. A farmland soil sampling scheme design method based on three-stage k-means, characterized by comprising the following steps:
S1, based on DEM data, forming farmland sub-regions with spatial differentiation by using the k-means method and the geographical detector method, and allocating the number of soil samples to each farmland sub-region in proportion to its area, thereby implementing the first-stage k-means processing;
S1 specifically comprises the following sub-steps:
S11, constructing farmland sub-regions with spatial differentiation by using the k-means method:
dividing the farmland based on DEM data by using the k-means method to form farmland sub-regions whose terrain conditions are similar within each sub-region and different between sub-regions, i.e. which exhibit spatial differentiation;
S12, determining the optimal number of farmland sub-regions by using the geographical detector:
based on DEM data, using the q value of the geographical detector to measure the spatial differentiation of the farmland sub-regions for different numbers of sub-regions, obtaining the curve of the q value, and selecting the number corresponding to the inflection point as the optimal number of farmland sub-regions;
S13, allocating the number of soil samples to each farmland sub-region:
the design of a soil sampling scheme first determines the sample size and then the sample positions; the area of each farmland sub-region is calculated based on the division result of the farmland sub-regions, and the number of soil samples is allocated to each farmland sub-region with area as the weight; the formula is as follows:
$$SN1_h = SN \times \frac{A_h}{\sum_{h} A_h}$$
where $SN$ is the total sample size of the farmland soil sampling scheme, $A_h$ is the area of farmland sub-region $h$, and $SN1_h$ is the number of soil samples in farmland sub-region $h$;
S2, based on NDVI data, subdividing the interior of each farmland sub-region obtained in the first stage by using the local coefficient of variation CV and the k-means method to form a series of sub-patches with similar local spatial variation levels, and obtaining the number of soil samples for each sub-patch according to its area and local spatial variation level, thereby implementing the second-stage k-means processing;
S2 specifically comprises the following sub-steps:
S21, inferring the local spatial variation level of the farmland soil:
selecting NDVI data, which are related to crop growth, and inferring the local spatial variation level of the farmland soil by using the local coefficient of variation CV, thereby providing data support for the further allocation of soil sampling points;
S22, generating sub-patches with similar local spatial variation levels of the soil:
based on the calculated local spatial variation level of the farmland soil, forming sub-patches with similar local spatial variation levels of the soil inside each farmland sub-region by using the k-means method; the number of clusters in each farmland sub-region is set to half of the number of soil samples allocated to that sub-region;
S23, obtaining the number of samples of each sub-patch:
calculating the area and the local coefficient of variation CV of each sub-patch based on the sub-patch division result, and further allocating soil sampling points with the area and the local coefficient of variation CV as weights, thereby obtaining the number of samples of each sub-patch and providing quantitative support for determining the spatial positions of the soil samples; the number of samples of each sub-patch is determined as follows:
where $A_{hl}$ and $CV_{hl}$ are the area and the CV of sub-patch $l$ inside farmland sub-region $h$;
S3, based on remote sensing yield estimation data, determining the spatial positions of representative soil samples by using the k-means method and variance statistics, thereby implementing the third-stage k-means processing and completing the design of the farmland soil sampling scheme;
S3 specifically comprises the following sub-steps:
S31, forming subsets with similar crop yield levels inside the sub-patches:
forming subsets with similar crop yield levels inside each sub-patch by using the k-means method based on the remote sensing yield estimation data of crops; the number of clusters in each sub-patch equals the number of samples of that sub-patch;
S32, determining the spatial positions of the representative samples:
after the above processing, each subset has similar terrain conditions, local soil variation level and crop yield level; to determine the spatial positions of the representative samples, an expected variance is calculated; then one sampling point is randomly selected from each subset to form a candidate sampling point set, and the candidate set whose variance is closest to the expected variance is sought and taken as the sampling points, thereby determining the spatial positions of the representative samples and forming the farmland soil sampling scheme.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211125514.0A CN115310719B (en) | 2022-09-16 | 2022-09-16 | Farmland soil sampling scheme design method based on three-stage k-means |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115310719A (en) | 2022-11-08 |
CN115310719B (en) | 2023-04-18 |
Family
ID=83867375
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211125514.0A (Active, CN115310719B) | Farmland soil sampling scheme design method based on three-stage k-means | 2022-09-16 | 2022-09-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115310719B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117271968B (en) * | 2023-11-22 | 2024-02-23 | 中国农业科学院农业环境与可持续发展研究所 | Accounting method and system for carbon sequestration amount of soil |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103196698A (en) * | 2013-03-20 | 2013-07-10 | 浙江大学 | Soil sampling method based on near-earth sensor technology |
CN110658011A (en) * | 2019-11-05 | 2020-01-07 | 新疆农业科学院土壤肥料与农业节水研究所(新疆维吾尔自治区新型肥料研究中心) | County scale orchard soil quality sampling method |
CN111222742A (en) * | 2019-11-14 | 2020-06-02 | 浙江省农业科学院 | Supplementary layout method for newly added soil sampling points based on farmland landscape partition |
CN111275072A (en) * | 2020-01-07 | 2020-06-12 | 浙江大学 | Mountain area soil thickness prediction method based on cluster sampling |
CN112765758A (en) * | 2021-02-04 | 2021-05-07 | 中国科学院南京土壤研究所 | Sample point layout method based on type unit soil attribute variation amplitude effect |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120129706A1 (en) * | 2010-11-22 | 2012-05-24 | Ashvini Chauhan | Method of Assessing Soil Quality and Health |
US20140330519A1 (en) * | 2013-05-01 | 2014-11-06 | Heiko Mueller | Method to identify multivariate anomalies by computing similarity and dissimilarity between entities and considering their spatial interdependency |
US20170042081A1 (en) * | 2015-08-10 | 2017-02-16 | 360 Yield Center, Llc | Systems, methods and apparatuses associated with soil sampling |
RS20200817A1 (en) * | 2020-07-10 | 2022-01-31 | Inst Biosens Istrazivacko Razvojni Inst Za Informacione Tehnologije Biosistema | System and method for intelligent soil sampling |
CN112733310B (en) * | 2021-01-28 | 2024-08-09 | 中国科学院南京土壤研究所 | County soil attribute investigation sample point layout method based on composite type unit |
Also Published As
Publication number | Publication date |
---|---|
CN115310719A (en) | 2022-11-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||