CN115310719B - Farmland soil sampling scheme design method based on three-stage k-means

Publication number
CN115310719B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211125514.0A
Other languages
Chinese (zh)
Other versions
CN115310719A (en
Inventor
齐清文
王永吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Geographic Sciences and Natural Resources of CAS
Original Assignee
Institute of Geographic Sciences and Natural Resources of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02: Agriculture; Fishing; Forestry; Mining
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/13: Satellite images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/188: Vegetation
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Abstract

The invention discloses a farmland soil sampling scheme design method based on three-stage k-means, comprising the following steps: S1, based on DEM data, forming farmland sub-regions with spatial diversity by using the k-means method and the geographical detector method, and allocating the number of soil samples to each farmland sub-region in proportion to its area; S2, based on NDVI data, subdividing each farmland sub-region obtained in the first-stage k-means by using the local coefficient of variation (CV) and the k-means method to form a series of sub-patches with similar local spatial variation levels, and obtaining the number of soil samples for each sub-patch according to its area and local spatial variation level; and S3, based on remote-sensing yield estimation data, determining the spatial positions of representative soil samples by using the k-means method and variance statistics. The method jointly considers the spatial diversity and the local spatial variation level of the soil, so that a farmland soil sampling scheme can be designed on a sounder basis and good-quality soil spatial data can be obtained.

Description

Farmland soil sampling scheme design method based on three-stage k-means
Technical Field
The invention belongs to the technical field of geographic information systems and relates to a farmland soil sampling scheme design method based on three-stage k-means.
Background
China supports nearly 20% of the world's population on about 7% of the world's cultivated land. In recent years, however, China's agricultural modernization has run into many problems (high production costs, improper use of chemical fertilizers and pesticides, a low degree of mechanization, soil compaction, and so on), so that the tension between population growth on the one hand and environmental and food security on the other has become increasingly prominent. Precision agriculture is a new trend in modern agriculture: it is a key means of shifting modern agriculture from a resource-input model to a science-and-technology-driven model, it marks the combination of agricultural informatization, mechanization, and modernization, and it helps to resolve this tension. The state attaches great importance to the development of precision agriculture. With accelerating urbanization, many farmers have moved to cities for work and rural land has been left idle, so intensive management of rural land has become a trend of the new era and offers a new opportunity for developing and implementing precision agriculture. Precision agriculture means adjusting agricultural inputs according to farmland soil nutrients, the nutrients the planted crops need for growth, and the target yield, so as to raise input-use efficiency and agricultural productivity, save resources, and protect the agricultural environment while still meeting the crops' growth needs.
Soil is an important component of the earth's surface system and exhibits spatial diversity. The spatial mismatch between soil nutrient supply and crop growth requirements is a fundamental cause of variation in crop yield. A thorough understanding of the spatial diversity of farmland soil nutrients therefore helps to achieve the development goals of precision agriculture. At present, the main way to determine the spatial distribution of soil nutrients is to collect soil samples in the field and measure their nutrient content by physical and chemical analysis in a laboratory. However, large-scale, dense collection of soil samples consumes a great deal of manpower, material, and money. Soil sampling is a scientific approach in which nutrient information is acquired at a small number of key sampling points and a soil prediction model is then used to estimate the spatial distribution of farmland soil nutrients, so that sampling cost and the expected prediction accuracy can be balanced. Different sampling schemes yield different spatial distribution characteristics of farmland soil nutrients, so studying and selecting a reasonable soil sampling method is a strong guarantee of a reliable result.
At present, soil sampling methods fall into two main groups: design-based and model-based. In design-based soil sampling, the optimal sample size is determined quantitatively from prior knowledge, and the sample layout is then determined by random sampling, systematic sampling, stratified sampling, and the like. For example, Wang Jinfeng et al. proposed a "sandwich" spatial sampling model that is based on stratified sampling and takes spatial diversity into account. In addition, some studies exploit the covariation between soil and environmental variables and use those variables to help design the sampling scheme and improve sampling efficiency. Model-based soil sampling rests on geostatistical theory, in which a sample realization can be fitted accurately to the population after suitable calibration. Such methods design the optimal number and spatial distribution of sampling points by minimizing the model estimation variance. For example, Wang Shahua et al. combined fuzzy set theory with sampling inspection theory and proposed a two-stage sampling model based on spatial data quality inspection. Beyond the methods above, recent research directions include supplementary sampling based on spatial estimation uncertainty and sampling methods that take accessibility into account.
Compared with design-based methods, model-based soil sampling can make fuller use of the structural information of the soil in the study area and therefore captures more soil information at key points. However, most model-based methods rest on geostatistical theory, and designing a sampling scheme with them depends on a variogram (variation function) that is known only after sampling, or that must be modeled from earlier soil data for the study area. In agricultural areas, strong human disturbance can make the spatial distribution of soil differ greatly from year to year. In that case the variogram cannot be modeled from earlier soil data for the area. Design-based soil sampling may therefore be better suited to precision agriculture. Different crop types take up different soil nutrients, and the complex planting structure and artificial fertilization management of an agricultural area give its soil a degree of local spatial diversity. A new design-based soil sampling method is therefore urgently needed, one that takes the local spatial diversity of agricultural soil into account to identify and obtain soil information at key points and thereby improves the accuracy of digital soil mapping for precision agriculture.
Disclosure of Invention
In view of the above technical problems, the invention provides a farmland soil sampling scheme design method based on three-stage k-means. It infers the local spatial diversity of the soil by mining environmental information closely related to the soil, applies k-means in three stages, determines the spatial positions of representative samples, and thereby forms a farmland soil sampling scheme. The first-stage k-means constructs farmland sub-regions with spatial diversity and allocates the number of soil samples among them; the second-stage k-means forms sub-patches with similar local soil spatial variation levels and obtains the number of samples for each sub-patch; the third-stage k-means determines the spatial positions of the representative samples.
A farmland soil sampling scheme design method based on three-stage k-means comprises the following steps:
S1, based on DEM data, forming farmland sub-regions with spatial diversity by using the k-means method and the geographical detector method, and allocating the number of soil samples to the farmland sub-regions in proportion to area, thereby realizing the first-stage k-means processing;
S1 specifically comprises the following substeps:
S11, constructing farmland sub-regions with spatial diversity by using the k-means method:
dividing the farmland based on DEM data by using the k-means method to form farmland sub-regions whose terrain conditions are similar within each region and different between regions, that is, sub-regions having spatial diversity;
S12, determining the optimal number of farmland sub-regions by using the geographical detector:
based on DEM data, measuring the spatial differentiation of the farmland sub-regions for different numbers of sub-regions by using the Q value of the geographical detector to obtain a curve of the Q value, and selecting the number corresponding to the inflection point as the optimal number of farmland sub-regions;
S13, allocating the number of soil samples to each farmland sub-region:
the design of a soil sampling scheme first determines the sample size and then the sample positions; the area of each farmland sub-region is calculated from the division result, and the number of soil samples for each farmland sub-region is allocated with area as the weight.
S2, based on NDVI data, subdividing each farmland sub-region obtained in the first-stage k-means by using the local coefficient of variation (CV) and the k-means method to form a series of sub-patches with similar local spatial variation levels, and obtaining the number of soil samples for each sub-patch according to its area and local spatial variation level, thereby realizing the second-stage k-means processing;
S21, inferring the local spatial variation level of farmland soil:
selecting NDVI data, which are closely related to crop growth, and inferring the local spatial variation level of the farmland soil by using the local coefficient of variation (CV), thereby providing data support for the further allocation of soil sampling points;
S22, generating sub-patches with similar local soil spatial variation levels:
based on the calculated local spatial variation level of the farmland soil, forming sub-patches with similar local soil spatial variation levels inside each farmland sub-region by using the k-means method; the number of clusters in each farmland sub-region is set to half of the number of soil samples allocated to that sub-region;
S23, obtaining the number of samples for each sub-patch:
based on the sub-patch division result, calculating the area and the local coefficient of variation (CV) of each sub-patch, and further allocating soil sampling points with area and CV as weights, thereby obtaining the number of samples for each sub-patch and providing quantitative support for determining the spatial positions of the soil samples.
S3, based on remote-sensing yield estimation data, determining the spatial positions of representative soil samples by using the k-means method and variance statistics, thereby realizing the third-stage k-means processing and completing the design of the farmland soil sampling scheme.
S3 specifically comprises the following substeps:
S31, forming subsets with similar crop yield levels inside the sub-patches:
based on remote-sensing yield estimation data for the crops, forming subsets with similar crop yield levels inside each sub-patch by using the k-means method; the number of clusters in each sub-patch equals the number of samples allocated to that sub-patch;
S32, determining the spatial positions of the representative samples:
after the above processing, each subset has similar terrain conditions, local soil variation level, and crop yield level; to determine the spatial positions of the representative samples, an expected variance is first calculated; one sampling point is then selected at random in each subset to form a sampling point set, and the sampling point set whose variance is closest to the expected variance is searched for and adopted, thereby determining the spatial positions of the representative samples and forming the farmland soil sampling scheme.
The method jointly considers the spatial diversity and the local spatial variation level of the soil, so that a farmland soil sampling scheme can be designed on a sounder basis, good-quality soil spatial data can be obtained, and an accurate map of farmland soil spatial distribution can be produced, providing technical and data support for precision-agriculture decisions such as variable-rate fertilization and adjustment of the planting structure.
Drawings
FIG. 1 is the overall technical flow chart of the invention;
FIG. 2 shows the study area of the embodiment;
FIG. 3 shows the result of the first-stage k-means processing of the embodiment;
FIG. 4 shows the calculated local coefficient of variation (CV) and the result of the second-stage k-means processing of the embodiment;
FIG. 5 shows the result of the third-stage k-means processing and the soil sampling scheme of the embodiment;
FIG. 6 compares different sampling methods: (a) the three-stage k-means sampling of the invention, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling;
FIG. 7 shows soil SOM attribute mapping and error distribution for different sampling methods: (a) the three-stage k-means sampling of the invention, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling.
Detailed description of the preferred embodiment
The embodiments of the invention are described below with reference to the accompanying examples.
FIG. 1 shows the flow of the farmland soil sampling scheme design method based on three-stage k-means, and FIG. 2 shows the study area of this embodiment. The method comprises the following steps:
S1, constructing farmland sub-regions with spatial diversity and allocating the number of soil samples (first-stage k-means)
S11, constructing farmland sub-regions with spatial diversity by using the k-means method
The ultimate purpose of soil sampling is to capture and express the spatial distribution of the soil in the study area with a small number of soil samples. Soil, as a kind of geographic data, exhibits spatial diversity. Spatial diversity, also known as spatial stratified heterogeneity, means that the value of an attribute differs between types or regions, as in land-use maps, climate zones, ecological zones, and geographic zones. Some scholars divide the study area into sub-regions with spatial diversity by means of environmental variables closely related to the soil in order to assist the design of soil sampling schemes. Terrain is one of the most commonly used of these variables. Accordingly, in this embodiment the farmland is divided by the k-means method based on DEM data to form farmland sub-regions whose terrain conditions are similar within each region and different between regions (that is, sub-regions having spatial diversity).
The k-means method is an iterative clustering algorithm. The data are to be divided into K groups: K objects are selected at random as initial cluster centers, the distance between each object and each cluster center is calculated, and each object is assigned to its nearest cluster center. A cluster center together with the objects assigned to it represents a cluster. Each time a sample is assigned, the center of its cluster is recalculated from the objects currently in the cluster. This process is repeated until a termination condition is met.
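The clustering loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it clusters scalar elevation values only, it uses the batch (Lloyd) variant that recomputes all centers after every full assignment pass rather than after each individual assignment, and the function name `kmeans_1d`, the `seed` parameter, and the sample elevations are assumptions made for the example.

```python
import random


def kmeans_1d(values, k, max_iter=100, seed=0):
    """Batch k-means on scalar values (e.g. one DEM elevation per farmland cell)."""
    rng = random.Random(seed)
    # Pick k distinct data values as the initial cluster centers.
    centers = rng.sample(sorted(set(values)), k)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:  # termination: centers stopped moving
            break
        centers = new_centers
    return labels, centers


# Hypothetical elevations (m) for six farmland cells: a low group and a high group.
elevations = [10.0, 10.5, 11.0, 50.0, 51.0, 52.0]
labels, centers = kmeans_1d(elevations, k=2)
```

In practice the input would be the DEM raster cells that fall inside the farmland mask, possibly together with derived terrain attributes; a library implementation would normally replace this loop.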
S12, determining the optimal number of farmland subregions by utilizing a geographic detector
Before the k-means method is used to obtain farmland sub-regions with spatial diversity, the number of clusters must be specified. Different numbers of clusters give the farmland sub-regions different degrees of spatial differentiation. Spatial differentiation can be identified, tested, and attributed with the Q statistic of the geographical detector.
The geographical detector is a method for detecting spatial stratified heterogeneity and revealing the driving forces behind it. Its core idea is that if an independent variable has an important influence on a dependent variable, the spatial distributions of the two should be similar; it is a tool for detecting and exploiting the spatial differentiation of geographic phenomena. The geographical detector comprises four detectors: the differentiation and factor detector, the interaction detector, the ecological detector, and the risk-zone detector. The differentiation and factor detector measures the spatial diversity of the dependent variable and the extent to which an independent variable explains it. The interaction detector identifies interactions between different independent variables, that is, it evaluates whether two factors acting together strengthen or weaken the explanation of the dependent variable.
In the method, the differentiation and factor detector (Q statistic) of the geographical detector is selected to analyze the spatial diversity of the farmland sub-regions; the model is as follows:
$$Q = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2} \tag{1}$$
where $h = 1, \dots, L$ indexes the classes or divisions of the farmland (the farmland sub-regions); $N_h$ and $N$ are the areas of farmland class $h$ and of the whole region, respectively; and $\sigma_h^2$ and $\sigma^2$ are the variances of the DEM variable within farmland class $h$ and over the whole region, respectively. $Q \in [0, 1]$; the larger the value of $Q$, the more pronounced the spatial differentiation.
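As a rough illustration of the Q statistic, the sketch below computes it for a list of DEM values and the sub-region label of each value. It assumes equal-area raster cells, so that cell counts stand in for the areas in the formula, and it uses population variances; the function name `q_statistic` and the toy data are hypothetical.

```python
from statistics import pvariance


def q_statistic(values, strata):
    """Q = 1 - sum_h(N_h * var_h) / (N * var), with cell counts used as areas."""
    total_var = pvariance(values)
    if total_var == 0:
        raise ValueError("values have zero variance; Q is undefined")
    # Group the values by the sub-region (stratum) they belong to.
    groups = {}
    for v, s in zip(values, strata):
        groups.setdefault(s, []).append(v)
    # Within-stratum sum of squares, expressed as N_h * var_h per stratum.
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1.0 - within / (len(values) * total_var)


# Four cells: a partition that separates low from high elevations gives Q = 1,
# while a partition that mixes them gives Q = 0.
dem = [1.0, 1.0, 5.0, 5.0]
q_good = q_statistic(dem, [0, 0, 1, 1])
q_poor = q_statistic(dem, [0, 1, 0, 1])
```

A sub-region partition that explains all of the elevation variance scores 1, and one that explains none scores 0, which is why the statistic can be used to compare candidate numbers of clusters.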
In this embodiment, based on DEM data, the Q value of the geographical detector is used to measure the spatial differentiation of the farmland sub-regions for different numbers of sub-regions, the curve of Q against the number of sub-regions is drawn, and the number corresponding to the inflection point is selected as the optimal number of farmland sub-regions.
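The text does not say how the inflection point is located on the Q curve. One simple possibility, shown here purely as an assumption, is to pick the cluster count at which the gain in Q drops most sharply (the largest negative second difference). The function name `elbow_k` and the Q values are hypothetical, and consecutive cluster counts are assumed.

```python
def elbow_k(q_by_k):
    """Return the k at which the gain in Q drops most sharply (simple elbow heuristic)."""
    ks = sorted(q_by_k)
    if len(ks) < 3:
        raise ValueError("need Q values for at least three cluster counts")
    best_k, best_drop = None, float("-inf")
    for prev, cur, nxt in zip(ks, ks[1:], ks[2:]):
        # Gain before this k minus gain after it: large when the curve flattens here.
        drop = (q_by_k[cur] - q_by_k[prev]) - (q_by_k[nxt] - q_by_k[cur])
        if drop > best_drop:
            best_k, best_drop = cur, drop
    return best_k


# Hypothetical Q curve: Q rises quickly up to k = 4 and flattens afterwards.
q_curve = {2: 0.30, 3: 0.55, 4: 0.80, 5: 0.83, 6: 0.85}
optimal_k = elbow_k(q_curve)
```

Visual inspection of the plotted curve, as the embodiment describes, remains a reasonable alternative when the curve is noisy.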
S13, allocating the number of soil samples to each farmland sub-region
The farmland sub-regions with spatial diversity are the basis for allocating the number of soil samples. First, the total sample size of the farmland soil sampling scheme is determined from the required soil mapping accuracy or the budget limit; then the area of each farmland sub-region is calculated, and the number of soil samples for each sub-region is allocated with area as the weight. The formula is as follows:
$$SN_h = SN \times \frac{A_h}{\sum_{i} A_i} \tag{2}$$
where $SN$ is the total sample size of the farmland soil sampling scheme, $A_h$ is the area of farmland sub-region $h$ (the sum runs over all farmland sub-regions), and $SN_h$ is the number of soil samples for farmland sub-region $h$.
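A short sketch of area-proportional allocation follows. Proportional shares are rarely whole numbers and the text does not state a rounding rule, so the largest-remainder rounding used here is an assumption chosen so that the counts always add up to the total; the function name `allocate_by_weight` and the example areas are hypothetical.

```python
def allocate_by_weight(total, weights):
    """Split `total` samples across units in proportion to `weights`."""
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    exact = [total * w / weight_sum for w in weights]
    counts = [int(x) for x in exact]  # floor, since shares are non-negative
    leftover = total - sum(counts)
    # Hand the remaining samples to the units with the largest fractional parts.
    order = sorted(range(len(weights)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical sub-region areas (ha) and a total budget of 10 soil samples.
samples_per_region = allocate_by_weight(10, [50.0, 30.0, 20.0])
```

With these areas the shares are exact, so the three sub-regions receive 5, 3, and 2 samples.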
S2, forming sub-patches with similar local soil spatial variation levels and obtaining the number of samples for each sub-patch (second-stage k-means)
S21, inferring the local spatial variation level of farmland soil
Soil areas with strong local spatial variation need more sampling points if an accurate soil spatial distribution is to be obtained, and farmland soil in different areas has different local spatial variation levels. Local soil conditions affect the growth of farmland crops, so the local condition of the soil can be inferred in reverse from crop growth. In this embodiment, NDVI data, which are closely related to crop growth, are selected, and the local spatial variation level of the farmland soil is inferred by using the local coefficient of variation (CV), providing data support for the further allocation of soil sampling points.
The coefficient of variation is a relative measure of dispersion. It is obtained by dividing a measure of spread (the range, the mean deviation, or the standard deviation) by the mean; the standard-deviation form is the most common. It is used when the series being compared have different levels, because the range, the mean deviation, and the standard deviation are absolute measures and cannot then be compared directly. The CV formula is as follows:
$$CV = \frac{1}{\bar{x}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} \tag{3}$$
where $x_i$ is one of the NDVI values within the calculation window, $\bar{x}$ is the mean NDVI within the window, and $n$ is the number of values in the window. The size of the calculation window needs to be determined according to the size of the study area.
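The local CV can be illustrated with a moving window over a small NDVI grid. The sketch below makes several assumptions that the text leaves open: a square window, clipping of the window at the raster edge, and the population standard deviation. The function name `local_cv` and the sample grid are hypothetical.

```python
from statistics import mean, pstdev


def local_cv(grid, window=3):
    """Local coefficient of variation (std / mean) in a square moving window."""
    half = window // 2
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the window around (r, c), clipped at the raster edge.
            vals = [grid[i][j]
                    for i in range(max(0, r - half), min(rows, r + half + 1))
                    for j in range(max(0, c - half), min(cols, c + half + 1))]
            m = mean(vals)
            # CV is undefined when the window mean is zero.
            out[r][c] = pstdev(vals) / m if m != 0 else float("nan")
    return out


# Hypothetical 3 x 3 NDVI patch with one low-vigour cell in the corner.
ndvi = [[0.6, 0.6, 0.6],
        [0.6, 0.6, 0.6],
        [0.6, 0.6, 0.2]]
cv_map = local_cv(ndvi, window=3)
```

Cells whose window contains the low-vigour corner get a higher CV than cells surrounded by uniform NDVI, which is the signal the second-stage clustering works on.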
S22 generating sub-patches with similar local spatial variation levels of soil
Within each field sub-area, there will be some differences in the local spatial variation levels of the soil. The soil region with high local spatial variation level needs to be provided with more soil sampling points so as to obtain an accurate soil spatial distribution result. To achieve this, the field sub-area needs to be further divided to generate sub-patches with similar local spatial variation levels of the soil.
In this embodiment, based on the calculated local spatial variation level of the farmland soil, sub-patches with similar local soil spatial variation levels are formed inside each farmland sub-region by using the k-means method. To ensure that every sub-patch receives at least one sampling point, the number of clusters in each farmland sub-region is set to half of the number of soil samples allocated to that sub-region.
S23, acquiring sampling number of corresponding sub-patches
The larger the area of a sub-patch and the higher its local soil spatial variation level, the more sampling points it requires. In this embodiment, the area and the local coefficient of variation (CV) of each sub-patch are calculated from the sub-patch division result and used as weights to further allocate soil sampling points, giving the number of samples for each sub-patch and providing quantitative support for determining the spatial positions of the soil samples. The number of samples for each sub-patch is determined as follows:
$$SN_{hl} = SN_h \times \frac{A_{hl} \times CV_{hl}}{\sum_{l} A_{hl} \times CV_{hl}} \tag{4}$$
where $A_{hl}$ and $CV_{hl}$ are the area and the CV of sub-patch $l$ inside farmland sub-region $h$, $SN_h$ is the number of soil samples for farmland sub-region $h$, and $SN_{hl}$ is the number of samples for sub-patch $l$.
FIG. 4 shows the results of the calculation of the local coefficient of variation CV and the second stage of k-means processing.
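The second-stage allocation can be sketched as a weighted split of one sub-region's sample count across its sub-patches. The text says only that area and CV serve as weights; combining them as the product area × CV, and rounding by largest remainder, are assumptions of this sketch, as are the function name `sub_patch_samples` and the example numbers.

```python
def sub_patch_samples(n_region, areas, cvs):
    """Split a sub-region's sample count across sub-patches, weighted by area * CV."""
    weights = [a * cv for a, cv in zip(areas, cvs)]
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("area * CV weights must sum to a positive value")
    exact = [n_region * w / weight_sum for w in weights]
    counts = [int(x) for x in exact]  # floor of each proportional share
    # Give the samples lost to flooring to the largest fractional parts.
    order = sorted(range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[:n_region - sum(counts)]:
        counts[i] += 1
    return counts


# Hypothetical sub-region with 6 samples and three sub-patches.
areas = [8.0, 8.0, 16.0]      # sub-patch areas (ha)
cvs = [0.25, 0.5, 0.125]      # mean local CV of each sub-patch
counts = sub_patch_samples(6, areas, cvs)
```

Here the second sub-patch is no larger than the first but varies twice as much, so it receives the largest share; the large but uniform third sub-patch receives the fewest samples.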
S3, determining the space position of the representative sample (the third stage k-means)
S31, forming subsets with similar crop yield levels inside the sub-patches
Within each sub-patch the local spatial variation level of the soil is similar, but the local spatial distribution of the soil is not uniform. Within the same farmland, climate, temperature, and disaster conditions are similar, so crop yield can explain soil conditions to a large extent.
With the development of remote sensing technology, rapid crop yield estimation over large areas has become possible. The WOFOST model is the most commonly used method for remote-sensing crop yield estimation. WOFOST (World Food Studies) is a dynamic, explanatory model developed jointly by Wageningen Agricultural University in the Netherlands and the Centre for World Food Studies (CWFS); it simulates the growth of annual crops under specific soil and climatic conditions. The model is aimed at quantitative land evaluation, regional yield forecasting, risk analysis, and quantifying interannual yield variation and the effects of climate change. It is based on crop physiological and ecological processes such as assimilation, respiration, transpiration, and dry matter partitioning, and mainly simulates crop growth under potential, water-limited, and nutrient-limited conditions.
In this embodiment, crop yield is rapidly estimated over a large area with the WOFOST model based on GF-1 remote sensing satellite data; subsets with similar crop yield levels are then formed inside each sub-patch by the k-means method based on the remote-sensing yield estimation data. The number of clusters in each sub-patch equals the number of samples allocated to that sub-patch.
S32 determining the spatial position of the representative sample
After the above processing, each subset has similar terrain conditions, local soil variation level, and crop yield level. To allocate soil sampling resources properly and determine the spatial positions of the representative samples, the expected variance is first calculated. The formula is as follows:
(Formula (5), the expected variance, is given as an image in the original document.)
(5)
where $\sigma_h^2$ is the variance of the remote-sensing crop yield estimation data in farmland sub-region $h$. One sampling point is then selected at random in each subset to form a sampling point set, and the sampling point set whose variance is closest to the expected variance is searched for and adopted as the final set of sampling points, thereby determining the spatial positions of the representative samples and forming the farmland soil sampling scheme. FIG. 5 shows the result of the third-stage k-means processing and the soil sampling scheme of the embodiment.
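The final selection step can be read as a random search: draw one candidate point from every subset, measure the variance of the drawn yield values, and keep the draw that comes closest to the expected variance. The sketch below follows that reading under stated assumptions: the expected variance is passed in as a number because its formula is not available here, the population variance is used, and the function name `pick_representative_set`, the `trials` and `seed` parameters, and the toy data are all hypothetical.

```python
import random
from statistics import pvariance


def pick_representative_set(subsets, expected_variance, trials=1000, seed=0):
    """Random search for one point per subset whose yield variance is closest to the target.

    subsets: list of lists of (point_id, yield_value) pairs, one list per subset.
    Returns (chosen point ids, variance of their yield values).
    """
    rng = random.Random(seed)
    best_ids, best_var, best_gap = None, None, float("inf")
    for _ in range(trials):
        # Draw one candidate sampling point from every subset.
        picks = [rng.choice(s) for s in subsets]
        var = pvariance([y for _, y in picks])
        gap = abs(var - expected_variance)
        if gap < best_gap:
            best_ids, best_var, best_gap = [p for p, _ in picks], var, gap
    return best_ids, best_var


# Two hypothetical subsets with candidate points and estimated yields (t/ha).
subsets = [[("a", 1.0), ("b", 2.0)],
           [("c", 5.0), ("d", 9.0)]]
ids, var = pick_representative_set(subsets, expected_variance=4.0)
```

With only four possible combinations the search quickly finds the pair whose variance equals the target; for real sub-patches the number of trials trades run time against how close the chosen set gets.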
Different sampling methods were used for comparison. The quantitative evaluation results of soil SOM mapping for the different sampling methods are shown in Table 1.
TABLE 1 quantitative evaluation results of SOM mapping of soil by different sampling methods
[Table 1: image not reproduced in the source text]
FIG. 6 shows, respectively: (a) three-stage k-means sampling, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling.
FIG. 7 shows the soil SOM attribute mapping and error distribution for the different sampling methods: (a) three-stage k-means sampling, (b) stratified random sampling, (c) k-means sampling, and (d) regular grid sampling.

Claims (1)

1. A farmland soil sampling scheme design method based on three-stage k-means is characterized by comprising the following steps:
S1, forming farmland sub-regions with spatial diversity by using the k-means method and the geographical detector method based on DEM data, and allocating the soil sampling number of each farmland sub-region in proportion to its area, thereby realizing the first-stage k-means processing;
S1 specifically comprises the following sub-steps:
S11, constructing farmland sub-regions with spatial diversity by using the k-means method:
dividing the farmland based on DEM data by using the k-means method to form farmland sub-regions whose terrain conditions are similar within each sub-region and different between sub-regions, that is, sub-regions having spatial diversity;
S12, determining the optimal number of farmland sub-regions by using the geographical detector:
based on DEM data, detecting the spatial differentiation of the farmland sub-regions for different numbers of sub-regions by using the Q value of the geographical detector to obtain a curve of the Q value, and selecting the number corresponding to the inflection point as the optimal number of farmland sub-regions;
S13, allocating the soil sampling number of each farmland sub-region:
the design of the soil sampling scheme comprises first determining the sample size and then determining the sample positions; calculating the area of each farmland sub-region based on the division result of the farmland sub-regions, and allocating the soil sampling number of each farmland sub-region by taking the area as the weight; the formula is as follows:
$$SN1_h = SN \times \frac{A_h}{\sum_{h} A_h}$$
wherein $SN$ is the total sample size of the farmland soil sampling scheme, $A_h$ is the area of farmland sub-region $h$, and $SN1_h$ is the number of soil samples in farmland sub-region $h$;
S2, based on NDVI data, subdividing the interior of each farmland sub-region obtained by the first-stage k-means by using the local coefficient of variation CV and the k-means method to form a series of sub-patches with similar local spatial variation levels, and obtaining the soil sampling number of each sub-patch according to its area and local spatial variation level, thereby realizing the second-stage k-means processing;
S2 specifically comprises the following sub-steps:
S21, deducing the local spatial variation level of the farmland soil:
selecting NDVI data related to crop growth, and deducing the local spatial variation level of the farmland soil by using the local coefficient of variation CV, thereby providing data support for the further allocation of soil sampling points;
S22, generating sub-patches with similar local spatial variation levels of the soil:
forming sub-patches with similar local spatial variation levels of the soil inside the farmland sub-regions by using the k-means method based on the calculated local spatial variation levels of the farmland soil; determining the number of clusters in each farmland sub-region according to half of the soil sampling number of that farmland sub-region;
S23, obtaining the sampling number of each sub-patch:
calculating the area and the local coefficient of variation CV of each sub-patch based on the sub-patch division result, and further allocating the soil sampling points by taking the area and the local coefficient of variation CV as weights, thereby obtaining the sampling number of each sub-patch and providing quantitative support for determining the spatial positions of the soil samples; the sampling number of each sub-patch is determined as follows:
[Formula image not reproduced in the source text]
in the formula, $A_{hl}$ and $CV_{hl}$ are the area and the local coefficient of variation CV of sub-patch $l$ inside farmland sub-region $h$;
S3, determining the spatial positions of the representative soil samples by using the k-means method and variance statistics based on remote-sensing yield estimation data, thereby realizing the third-stage k-means processing and completing the design of the farmland soil sampling scheme;
S3 specifically comprises the following sub-steps:
S31, forming subsets with similar crop yield levels inside the sub-patches:
forming subsets with similar crop yield levels inside each sub-patch by using the k-means method based on the remote-sensing yield estimation data of the crops; the number of clusters in each sub-patch is equal to the sampling number of that sub-patch;
S32, determining the spatial positions of the representative samples:
after the above process, each subset has similar terrain conditions, local soil variation levels, and crop yield levels; to determine the spatial positions of the representative samples, an expected variance is calculated; then one sampling point is randomly selected from each subset to form a sampling-point set, and the sampling-point set closest to the expected variance is taken as the sampling points, thereby determining the spatial positions of the representative samples and forming the farmland soil sampling scheme.
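The two allocation steps in the claim (S13 by area, S23 by area and local CV) can be illustrated with one proportional-allocation helper. This is a minimal sketch under assumptions: the function name `allocate`, the largest-remainder rounding to whole samples, the use of the product of area and CV as the second-stage weight, and all numbers are hypothetical, since the claim only states which quantities serve as weights.

```python
def allocate(total, weights):
    """Split `total` samples across units in proportion to `weights`.

    Fractional shares are rounded with the largest-remainder rule so that
    the counts always sum to `total` (the rounding rule is an assumption).
    """
    weight_sum = sum(weights)
    exact = [total * w / weight_sum for w in weights]
    counts = [int(x) for x in exact]  # floor of each share
    leftover = total - sum(counts)
    # Hand the remaining samples to the largest fractional parts.
    order = sorted(range(len(weights)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# First stage (S13): sub-region areas as weights, made-up values.
areas = [50.0, 30.0, 20.0]
per_region = allocate(20, areas)

# Second stage (S23), inside the first sub-region: area times local CV
# is used as the weight here, which is an assumed combination.
patch_areas = [20.0, 20.0, 10.0]
patch_cv = [0.1, 0.3, 0.2]
patch_weights = [a * cv for a, cv in zip(patch_areas, patch_cv)]
per_patch = allocate(per_region[0], patch_weights)
```

With these weights, a sub-patch that is larger or more variable receives more of its sub-region's samples, which matches the stated intent of weighting by area and local variation.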
CN202211125514.0A 2022-09-16 2022-09-16 Farmland soil sampling scheme design method based on three-stage k-means Active CN115310719B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211125514.0A CN115310719B (en) 2022-09-16 2022-09-16 Farmland soil sampling scheme design method based on three-stage k-means

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211125514.0A CN115310719B (en) 2022-09-16 2022-09-16 Farmland soil sampling scheme design method based on three-stage k-means

Publications (2)

Publication Number Publication Date
CN115310719A CN115310719A (en) 2022-11-08
CN115310719B true CN115310719B (en) 2023-04-18

Family

ID=83867375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211125514.0A Active CN115310719B (en) 2022-09-16 2022-09-16 Farmland soil sampling scheme design method based on three-stage k-means

Country Status (1)

Country Link
CN (1) CN115310719B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117271968B (en) * 2023-11-22 2024-02-23 中国农业科学院农业环境与可持续发展研究所 Accounting method and system for carbon sequestration amount of soil

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103196698A (en) * 2013-03-20 2013-07-10 浙江大学 Soil sampling method based on near-earth sensor technology
CN110658011A (en) * 2019-11-05 2020-01-07 新疆农业科学院土壤肥料与农业节水研究所(新疆维吾尔自治区新型肥料研究中心) County scale orchard soil quality sampling method
CN111222742A (en) * 2019-11-14 2020-06-02 浙江省农业科学院 Supplementary layout method for newly added soil sampling points based on farmland landscape partition
CN111275072A (en) * 2020-01-07 2020-06-12 浙江大学 Mountain area soil thickness prediction method based on cluster sampling
CN112765758A (en) * 2021-02-04 2021-05-07 中国科学院南京土壤研究所 Sample point layout method based on type unit soil attribute variation amplitude effect

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120129706A1 (en) * 2010-11-22 2012-05-24 Ashvini Chauhan Method of Assessing Soil Quality and Health
US20140330519A1 (en) * 2013-05-01 2014-11-06 Heiko Mueller Method to identify multivariate anomalies by computing similarity and dissimilarity between entities and considering their spatial interdependency
US20170042081A1 (en) * 2015-08-10 2017-02-16 360 Yield Center, Llc Systems, methods and apparatuses associated with soil sampling
RS20200817A1 (en) * 2020-07-10 2022-01-31 Inst Biosens Istrazivacko Razvojni Inst Za Informacione Tehnologije Biosistema System and method for intelligent soil sampling
CN112733310B (en) * 2021-01-28 2024-08-09 中国科学院南京土壤研究所 County soil attribute investigation sample point layout method based on composite type unit

Also Published As

Publication number Publication date
CN115310719A (en) 2022-11-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant