CN114880373B - Soil sampling method, system, storage medium and electronic equipment - Google Patents

Soil sampling method, system, storage medium and electronic equipment

Publication number
CN114880373B
Authority
CN
China
Prior art keywords
sampling
objective function
soil
grid
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210373152.0A
Other languages
Chinese (zh)
Other versions
CN114880373A (en
Inventor
杨柯
高秉博
郝国杰
邢卫国
孙岐发
杨华本
段明新
冯嘉
赵喜东
姜楠
于俊博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Natural Resources Comprehensive Survey Center Of China Geological Survey
Original Assignee
Harbin Natural Resources Comprehensive Survey Center Of China Geological Survey
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Natural Resources Comprehensive Survey Center Of China Geological Survey filed Critical Harbin Natural Resources Comprehensive Survey Center Of China Geological Survey
Priority to CN202210373152.0A priority Critical patent/CN114880373B/en
Publication of CN114880373A publication Critical patent/CN114880373A/en
Application granted granted Critical
Publication of CN114880373B publication Critical patent/CN114880373B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00Details relating to CAD techniques
    • G06F2111/06Multi-objective optimisation, e.g. Pareto optimisation using simulated annealing [SA], ant colony algorithms or genetic algorithms [GA]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00Details relating to CAD techniques
    • G06F2111/08Probabilistic or stochastic CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Algebra (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Sampling And Sample Adjustment (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

The invention relates to the technical field of soil sampling, and in particular to a soil sampling method, a system, a storage medium and electronic equipment. The method comprises the following steps: constructing a characteristic space objective function and a geographic space objective function corresponding to the soil of a region to be sampled; and determining the distribution of sampling points by using a multi-objective optimization algorithm based on the characteristic space objective function and the geographic space objective function. Constructing the two objective functions takes into account the distribution of the sampling points in both the characteristic space and the geographic space, and the multi-objective optimization algorithm then determines the final distribution of the sampling points. A scientific soil sampling method is thereby realized, which facilitates accurate analysis of soil organic matter and provides data support for improving soil fertility, soil quality and crop yield.

Description

Soil sampling method, system, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of soil sampling, in particular to a soil sampling method, a soil sampling system, a storage medium and electronic equipment.
Background
Designing the spatial sampling of soil is the first problem to be faced when quantitatively expressing the spatial distribution of soil organic matter content. The spatial prediction method used to extend soil organic matter content data from points to a surface is also one of the important factors affecting the accuracy of predictive mapping. Establishing a scientific method for laying out soil sampling points and for spatial prediction is therefore of great significance for reducing the cost of soil sampling and improving the accuracy of predictive mapping. Soil organic matter is a key factor in maintaining soil fertility and improving soil quality and crop yield. Revealing the spatial variation characteristics and spatial distribution patterns of soil organic matter content has important theoretical and practical significance for the efficient, scientific management and sustainable use of soil resources.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a soil sampling method, a soil sampling system, a storage medium and electronic equipment.
The technical scheme of the soil sampling method is as follows:
constructing a characteristic space objective function and a geographic space objective function corresponding to soil of a region to be sampled;
and determining the distribution of the sampling points by utilizing a multi-objective optimization algorithm based on the characteristic space objective function and the geographic space objective function.
The soil sampling method has the beneficial effects that:
by constructing a characteristic space objective function and a geospatial objective function, the distribution of the sampling points in both the characteristic space and the geographic space is taken into account; a multi-objective optimization algorithm is then used to determine the final sampling point distribution. A scientific soil sampling method is thereby realized, which facilitates accurate analysis of soil organic matter and provides data support for improving soil fertility, soil quality and crop yield.
The technical scheme of the soil sampling system is as follows:
the system comprises a construction module and a determination module;
the construction module is used for: constructing a characteristic space objective function and a geographic space objective function corresponding to soil of a region to be sampled;
the determining module is used for: determining the distribution of the sampling points by utilizing a multi-objective optimization algorithm based on the characteristic space objective function and the geographic space objective function.
The soil sampling system has the following beneficial effects:
by constructing a characteristic space objective function and a geospatial objective function, the distribution of the sampling points in both the characteristic space and the geographic space is taken into account; a multi-objective optimization algorithm is then used to determine the final sampling point distribution. A scientific soil sampling method is thereby realized, which facilitates accurate analysis of soil organic matter and provides data support for improving soil fertility, soil quality and crop yield.
A storage medium of the present invention has stored therein instructions that, when read by a computer, cause the computer to execute a soil sampling method according to any one of the above.
An electronic device of the present invention includes a processor and the storage medium described above, where the processor executes instructions in the storage medium.
Drawings
FIG. 1 is a schematic flow chart of a soil sampling method according to an embodiment of the present invention;
FIG. 2 is a Latin hypercube of two variables;
FIG. 3 is a schematic diagram of a grid sampling target;
FIG. 4 is a schematic diagram of an estimation accuracy-sample size variation curve;
FIG. 5 is a schematic flow chart of the simulated space multi-path annealing method;
FIG. 6 is a schematic diagram of NDVI data;
FIG. 7 is a schematic diagram of gradient (slope) data;
FIG. 8 is a schematic diagram of DEM data;
FIG. 9 is a schematic representation of terrain wetness index data;
FIG. 10 is a schematic diagram of land utilization data;
FIG. 11 is a schematic representation of soil type data;
FIG. 12 is a schematic diagram of annual average temperature data;
FIG. 13 is a schematic diagram of annual average precipitation data;
FIG. 14 is a schematic diagram of an error curve;
FIG. 15 is a diagram showing a distribution of 30 sampling points;
FIG. 16 is a diagram showing a distribution of 200 sampling points;
FIG. 17 is a diagram showing a distribution of 400 sampling points;
FIG. 18 is a graph showing the results of interpolation prediction and soil organic matter content;
FIG. 19 is a graph showing the results of interpolation of organic matter at 30 sample points;
FIG. 20 is a diagram of an interpolation result of organic matter at 200 sampling points;
FIG. 21 is a graph showing the result of interpolation of organic matter at 300 sample points;
FIG. 22 is a schematic diagram of a distribution of 10 sampling points;
FIG. 23 is a schematic diagram of a distribution of 20 sampling points;
FIG. 24 is a schematic diagram of a distribution of 30 sampling points;
FIG. 25 is a schematic diagram of a distribution of 40 sampling points;
FIG. 26 is a schematic diagram of a distribution of 50 sampling points;
FIG. 27 is a schematic diagram of a distribution of 60 sampling points;
FIG. 28 is a schematic diagram of a distribution of 70 sampling points;
FIG. 29 is a schematic diagram of a distribution of 80 sampling points;
FIG. 30 is a schematic diagram of a distribution of 90 sampling points;
FIG. 31 is a schematic diagram of a distribution of 100 sampling points;
FIG. 32 is a schematic diagram of a distribution of 200 sampling points;
FIG. 33 is a schematic diagram of a distribution of 300 sampling points;
FIG. 34 is a schematic diagram of a distribution of 400 sampling points;
FIG. 35 is a schematic diagram of a distribution of 500 sampling points;
FIG. 36 is a schematic view of a soil sampling system according to an embodiment of the present invention.
Detailed Description
As shown in fig. 1, a soil sampling method according to an embodiment of the present invention includes the following steps:
S1, constructing a characteristic space objective function and a geographic space objective function corresponding to soil of a region to be sampled;
S2, determining sampling point distribution by utilizing a multi-objective optimization algorithm based on the characteristic space objective function and the geographic space objective function.
In S1, the process of constructing the feature space objective function includes:
S10, carrying out multidimensional feature space layering on the soil of the region to be sampled by adopting a conditional Latin hypercube method to obtain a multidimensional feature space layering result, and constructing the feature space objective function according to the multidimensional feature space layering result.
The Latin hypercube sampling method covers the feature space of the variables uniformly in probability, based on the cumulative probability function of each variable. It comprises the following three steps:
1) For K variables $X_1, X_2, \ldots, X_K$, dividing the cumulative probability function of each variable into a number of equal-probability intervals, and randomly selecting a cumulative probability value within each interval;
2) Calculating the variable value corresponding to each selected probability value by using the inverse of the cumulative probability function;
3) Pairing the values obtained for the variables, either randomly or according to a given rule, to form samples that uniformly cover the feature space. Specifically:
Latin hypercube samples for two variables are shown in FIG. 2:
first, the cumulative probability functions of $X_1$ and $X_2$ are layered uniformly;
then, samples are drawn in the uniform grid formed by the cumulative-probability layers of $X_1$ and $X_2$ so that each probability interval of each dimension contains exactly one sample, and the feature values of $X_1$ and $X_2$ corresponding to the samples in probability space are calculated from the inverses of the cumulative probability functions, as shown in FIGS. 2a and 2d; the pairs of feature values then form the samples in the auxiliary-variable feature space, as shown in FIGS. 2b and 2c.
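The three steps above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `latin_hypercube`, the use of normal distributions, and the example means and standard deviations are all assumptions chosen only so that an inverse cumulative probability function is available.

```python
import random
from statistics import NormalDist

def latin_hypercube(n, inverse_cdfs, rng):
    """Draw n Latin hypercube samples; one inverse CDF per variable."""
    columns = []
    for inv_cdf in inverse_cdfs:
        # Step 1: one random cumulative probability inside each of n equal-probability intervals.
        probs = [(k + rng.random()) / n for k in range(n)]
        # Step 2: map each probability to a variable value through the inverse CDF
        # (clamped away from 0 and 1, where the inverse CDF is undefined).
        values = [inv_cdf(min(max(p, 1e-12), 1 - 1e-12)) for p in probs]
        # Step 3: shuffle so that intervals of different variables are paired at random.
        rng.shuffle(values)
        columns.append(values)
    return list(zip(*columns))

rng = random.Random(0)
x1 = NormalDist(mu=240.0, sigma=30.0)   # hypothetical elevation distribution, m
x2 = NormalDist(mu=550.0, sigma=25.0)   # hypothetical annual precipitation distribution, mm
samples = latin_hypercube(5, [x1.inv_cdf, x2.inv_cdf], rng)
```

Each of the five equal-probability intervals of each variable then contains exactly one of the five samples, which is the property the conditional method described next tries to preserve on real data.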
However, in a surface matrix survey the observed values of the various attributes of the study object are fixed; they are one realization of a joint conditional probability distribution. The auxiliary-variable feature vectors determined by the Latin hypercube therefore do not necessarily exist in the multidimensional auxiliary-variable feature space of the target area. To determine sample points from the auxiliary variables, combinations of auxiliary variables that actually exist must be found, so that sampling points that really exist can be determined. The conditional Latin hypercube method therefore reverses the sampling process, and comprises the following three steps:
1) First, determining a layering scheme for the cumulative probability distribution; specifically:
environment variables of the region to be sampled, such as soil elevation, annual precipitation and annual average temperature, are used; the cumulative probability distribution of each environment variable is divided, one variable at a time, into equal-probability intervals, and a probability value is randomly selected within each interval;
2) Determining the layering scheme of the feature values according to the inverse of the cumulative probability function; specifically:
the actual value of each environment variable is obtained by using the inverse of the cumulative probability function;
3) Solving iteratively with a heuristic optimization algorithm on the actual multidimensional auxiliary-variable data, so that every layer is covered uniformly. Specifically:
the environment variables are combined randomly and the combination is iterated until a grouping scheme of the environment variables is obtained in which every layer is effectively covered.
The characteristic space objective function constructed by the method is as follows: $O_F = w_{co} O_{co} + w_{ca} O_{ca} + w_{cor} O_{cor}$
where $O_{co}$ is the optimization objective function for continuous variables, $O_{ca}$ is the optimization objective function for category variables, $O_{cor}$ is the optimization objective function for correlation, and $w_{co}$, $w_{ca}$ and $w_{cor}$ are the weight coefficients corresponding to $O_{co}$, $O_{ca}$ and $O_{cor}$ respectively. Continuous variables are, for example, annual average temperature, elevation or annual precipitation; category variables are, for example, soil type or land use type.
The calculation formula of $O_{co}$ is as follows:
$$O_{co} = \sum_{v=1}^{V_1} \sum_{i=1}^{n} \left| \eta\left(q_v^{i} \le x_v < q_v^{i+1}\right) - 1 \right|$$
where $i$ indexes the percentiles set according to the cumulative probability for the sample size, $n$ is the number of samples, $V_1$ is the number of continuous variables, $q_v^{i}$ is the $i$-th percentile of the $v$-th continuous variable, $q_v^{i+1}$ is the $(i+1)$-th percentile of the $v$-th continuous variable, and $\eta(\cdot)$ is a function giving the number of sampling points whose value of the $v$-th variable falls in the interval between these two percentiles.
For an ideal Latin hypercube, the number of sampling points falling within each equal-probability interval of a single variable is 1, so $O_{co}$ is zero.
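As a rough illustration of the continuous-variable objective, the sketch below counts sampled points per equal-probability interval and sums the deviations from 1. The function name `o_co`, the row-based data layout, and the use of empirical quantiles of the candidate data as interval edges are assumptions made for this example, not details given in the patent.

```python
def o_co(sample, population):
    """Sum over variables and intervals of |points in interval - 1|.

    sample, population: lists of rows; each row holds the continuous variables.
    """
    n = len(sample)
    total = 0
    for v in range(len(population[0])):
        ordered = sorted(row[v] for row in population)
        # Lower edges of n equal-probability intervals (empirical quantiles);
        # the last interval is left open so the maximum value is included.
        edges = [ordered[(i * len(ordered)) // n] for i in range(n)]
        edges.append(float("inf"))
        for i in range(n):
            count = sum(1 for row in sample if edges[i] <= row[v] < edges[i + 1])
            total += abs(count - 1)
    return total

# Hypothetical data: one continuous variable with 100 candidate values 0..99.
population = [[float(k)] for k in range(100)]
even_sample = [[10.0], [30.0], [60.0], [90.0]]      # one point per quartile
clumped_sample = [[1.0], [2.0], [3.0], [90.0]]      # three points in the first quartile
```

With these made-up inputs the evenly spread sample scores 0, matching the ideal Latin hypercube case, while the clumped sample is penalized.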
The calculation formula of $O_{ca}$ is as follows:
$$O_{ca} = \sum_{v'=1}^{V_2} \sum_{c} \left| \frac{\eta(x_{v'} = c)}{n} - \kappa_{v'c} \right|$$
where $V_2$ is the number of category variables, $c$ indexes the categories of a category variable, $\eta(x_{v'} = c)$ is a function giving the number of samples that fall in category $c$ of the $v'$-th category variable, $n$ is the number of samples, and $\kappa_{v'c}$ is the proportion of units belonging to category $c$ of the $v'$-th category variable.
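The category-variable objective can be sketched in the same way: for each category, the share of sampled points is compared with the share of that category in the whole area. The name `o_ca`, the row layout and the example soil-type counts are illustrative assumptions.

```python
from collections import Counter

def o_ca(sample, population):
    """Sum over category variables and categories of |sample share - population share|."""
    n = len(sample)
    total = 0.0
    for v in range(len(population[0])):
        pop_counts = Counter(row[v] for row in population)
        smp_counts = Counter(row[v] for row in sample)
        for category, units in pop_counts.items():
            kappa = units / len(population)            # share of this category in the area
            total += abs(smp_counts[category] / n - kappa)
    return total

# Hypothetical area: 60% black soil, 40% meadow soil.
population = [["black soil"]] * 6 + [["meadow soil"]] * 4
proportional = [["black soil"]] * 3 + [["meadow soil"]] * 2
one_sided = [["black soil"]] * 5
```

A sample that reproduces the 60/40 split scores 0; a sample drawn from a single soil type scores 0.8 on these made-up data.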
The calculation formula of $O_{cor}$ is as follows:
$$O_{cor} = \sum_{v''=1}^{V_3} \sum_{u=1}^{U} \left| T_{v''u} - c_{v''u} \right|$$
where $V_3$ and $U$ sum to the number of continuous variables, i.e. $V_3 + U = V_1$; the split of $V_1$ between $V_3$ and $U$ can be chosen freely without affecting $O_{cor}$. $T_{v''u}$ is the correlation coefficient between the $v''$-th and $u$-th variables in the overall data, i.e. over all initially formulated sample points, and $c_{v''u}$ is the correlation coefficient between the $v''$-th and $u$-th variables in the sampled data, i.e. over the final sample points calculated using the method of the present application.
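A small sketch of the correlation objective follows. It assumes Pearson correlation and sums over every pair of distinct continuous variables; both choices, and the names `pearson` and `o_cor`, are assumptions for illustration, since the patent text does not specify the correlation coefficient or fix the pairing.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def o_cor(sample, population):
    """Sum of |overall correlation - sample correlation| over variable pairs."""
    n_vars = len(population[0])
    total = 0.0
    for a in range(n_vars):
        for b in range(a + 1, n_vars):
            t_ab = pearson([r[a] for r in population], [r[b] for r in population])
            c_ab = pearson([r[a] for r in sample], [r[b] for r in sample])
            total += abs(t_ab - c_ab)
    return total

# Hypothetical two-variable data whose overall correlation is 0.8.
population = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
similar = [(0.0, 0.0), (3.0, 3.0)]     # sample correlation +1
opposite = [(1.0, 2.0), (2.0, 1.0)]    # sample correlation -1
```

The sample whose correlation has the same sign as the overall data is penalized far less than the one that reverses it.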
In S1, the process of constructing the geospatial objective function includes:
S11, carrying out grid division on the geometric model corresponding to the region to be sampled, ensuring that at least one sampling point falls into each grid, and constructing the geospatial objective function according to the distribution of the sampling points in the grids. Specifically:
optimization methods based on minimum distance, such as those using the minimum distance between a sampling point and the existing sampling points, or minimizing the average shortest distance between sampling points and non-sampled points, can ensure that the sampling points are evenly distributed in geographic space. However, these methods constrain the sample distribution rather tightly and are therefore difficult to combine with the feature space. The geospatial optimization objective function is therefore designed on the basis of grid sampling, which is more flexible: the study area is divided uniformly into regular grids according to the sample size, and each grid is guaranteed to contain one sampling point, as shown in FIG. 3.
In one embodiment, when the geometric model corresponding to the region to be sampled is divided into grids, the geometric model domain is divided uniformly into regular grids, so that all grids are of equal size and the number of sampling points falling into each grid is calculated in proportion. On this basis, the geospatial objective function is calculated as follows:
$$O_G = \sum_{i'=1}^{Row} \sum_{j=1}^{Col} \left| \eta\left(x_{i'} \le x < x_{i'+1},\; y_j \le y < y_{j+1}\right) - \frac{n}{Row \times Col} \right|$$
where $Row$ is the number of rows of the grid, $Col$ is the number of columns of the grid, $\eta(\cdot)$ is the number of samples falling in the grid defined by the lower-left corner $(x_{i'}, y_j)$ and the upper-right corner $(x_{i'+1}, y_{j+1})$, $n$ is the number of samples, and $Row \times Col$ is the number of grids.
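The uniform-grid objective can be sketched as below for a rectangular study area. The function name `o_geo`, the rectangular bounds and the example coordinates are assumptions; a real study area would also need cells outside the boundary to be excluded.

```python
def o_geo(points, bounds, rows, cols):
    """Sum over grid cells of |points in cell - n / (rows * cols)|."""
    xmin, ymin, xmax, ymax = bounds
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Locate the cell; points on the upper edges are assigned to the last cell.
        j = min(cols - 1, int((x - xmin) / (xmax - xmin) * cols))
        i = min(rows - 1, int((y - ymin) / (ymax - ymin) * rows))
        counts[i][j] += 1
    expected = len(points) / (rows * cols)
    return sum(abs(c - expected) for row in counts for c in row)

# Hypothetical 2 x 2 grid over a 2 km x 2 km square.
bounds = (0.0, 0.0, 2.0, 2.0)
spread = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
clustered = [(0.5, 0.5)] * 4
```

One point per cell gives 0, while four points in a single cell give 6 on this toy grid.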
In another embodiment, the geometric model domain corresponding to the region to be sampled is divided into grids with different specifications and sizes in a non-uniform manner, and based on the grids, corresponding geospatial objective functions are calculated.
The number n of sampling points can be obtained by:
an area for which a detailed and reliable map is available is selected and sampled with different sample sizes, and a curve of accuracy against sample size is constructed; the sample size is then determined from the curve by either of the following two methods:
1) According to the accuracy requirement, the error is fixed on the vertical axis and the corresponding sample size is read off the horizontal axis of the accuracy curve, which gives the number n of sampling points; that is, the sample size, i.e. the number n of sampling points, is determined from the required estimation accuracy;
2) According to the error change curve, a suitable number n of sampling points is determined manually.
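Method 1) amounts to reading the smallest sample size whose error meets the requirement off the curve. A minimal sketch, in which the function name `smallest_sample_size` and every value in `curve` are hypothetical and not taken from Table 1:

```python
def smallest_sample_size(curve, max_error):
    """Return the smallest sample size whose error is within max_error, or None.

    curve: list of (sample_size, error) pairs read from an accuracy curve.
    """
    for n, err in sorted(curve):
        if err <= max_error:
            return n
    return None

# Hypothetical error curve; the numbers are made up for illustration only.
curve = [(10, 9.0), (30, 6.5), (100, 5.8), (200, 4.9), (400, 4.5), (500, 4.45)]
n_points = smallest_sample_size(curve, max_error=5.0)
```

If no tested sample size meets the requirement the function returns `None`, which signals that larger sample sizes need to be tried.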
A scientific sampling design requires fewer sampling points, whereas an unreasonable sampling design requires more. The more sampling points there are, the more manpower, material resources and time are needed for sampling and laboratory analysis; conversely, too few sampling points may lose important spatial information about soil properties. The number n of sampling points determined from the curve of accuracy against sample size is therefore more reasonable.
Optionally, in the above technical solution, in S2, the determining the sampling point distribution by using a multi-objective optimization algorithm includes:
S20, determining, by using a multi-objective optimization algorithm, the layering position of each sampling point in the multi-dimensional feature space layering result and the spatial position of each sampling point in the grid-divided geometric model, and determining the sampling point distribution according to the layering position and the spatial position of each sampling point.
1) The multi-dimensional feature space layering result is obtained as follows: environment variables of the region to be sampled, such as soil elevation, annual precipitation and annual average temperature, are used; the cumulative probability distribution of each environment variable is divided, one variable at a time, into equal-probability intervals, and a probability value is randomly selected within each interval;
2) The layering position in the multi-dimensional feature space layering result: the actual value of each environment variable is obtained by using the inverse of the cumulative probability function;
3) The spatial position in the grid-divided geometric model: the region to be sampled is divided into n regular areas, generally square or rectangular grids;
4) Determining the sampling point distribution according to the layering position and the spatial position of each sampling point: through an iterative algorithm, the finally calculated sample points are distributed uniformly in both the multidimensional feature space and the gridded geometric space.
The multi-objective optimization algorithm is described below by taking the simulated space multi-path annealing method as an example, specifically:
the simulated space multi-path annealing (SSMA) method is an upgraded version of simulated space annealing (SSA). SSA is mainly used to solve single-objective optimization problems, or multi-objective problems reduced to a single objective, and cannot optimize multiple objectives simultaneously. The present application expands the single annealing path of SSA into multiple annealing paths to obtain the simulated space multi-path annealing method for solving multi-objective sampling optimization problems. The method takes each optimization objective $f_i(s)$ of the multi-objective optimization function $F(s)$ as a fitness function and sets a separate cooling path for each fitness function. The multi-objective optimization solution by simulated space multi-path annealing is implemented in the following steps, as shown in FIG. 5:
S100, setting the optimization objective functions and preparing data: the optimization objectives are set and expressed as functions of the sample points, $f_1(s), f_2(s), \ldots, f_m(s)$. The goal of the optimization is to minimize the values of these $m$ functions (a maximization objective can be converted into a minimization problem by taking the reciprocal or the negative). Sampling field data are generated according to the constraints.
S101, initializing the algorithm: an initial temperature, an end temperature and a cooling rate $\alpha_1, \alpha_2, \ldots, \alpha_m$ are set for each optimization objective, together with any other iteration termination conditions. The temperature variables $T_1, T_2, \ldots, T_m$ corresponding to the objectives are set to their initial temperatures. An initial sample s1 is generated randomly or in a spatially uniform manner, s = s1 is set, and the values $f_1(s), f_2(s), \ldots, f_m(s)$ are calculated.
S102, perturbing s to generate a new solution s2: a sample position in s is selected at random and moved by a random angle and distance to generate a new sample; the new sample replaces the original one to give a new sampling scheme s2, and the objective function values $f_1(s2), f_2(s2), \ldots, f_m(s2)$ corresponding to s2 are calculated.
S103, determining whether to accept the new solution according to the Metropolis criterion: the probability $p_i$ that each objective accepts s2 is calculated in turn using equation (6):
$$p_i = \begin{cases} 1, & f_i(s2) \le f_i(s) \\ \exp\left(-\dfrac{f_i(s2) - f_i(s)}{T_i}\right), & f_i(s2) > f_i(s) \end{cases} \tag{6}$$
A random number rand between 0 and 1 is generated; if rand < $p_i$, objective $f_i$ votes to accept, otherwise objective $f_i$ votes to reject. After the judgment has been completed for every objective, it is checked whether any objective $f_i$ has voted to reject s2: if so, s2 is rejected; otherwise s2 is accepted, and s = s2, $f_i(s) = f_i(s2)$ are set. Whether or not s2 is rejected, cooling is performed according to the rule in equation (7):
$$T_i \leftarrow \begin{cases} \alpha_i T_i, & \text{if objective } f_i \text{ voted to accept } s2 \\ T_i, & \text{otherwise} \end{cases} \tag{7}$$
S104, convergence judgment: whether to end the iteration is decided according to the objective function values, the annealing temperatures and the other termination conditions. If the convergence condition is not satisfied, execution returns to the third step (S102); otherwise the iteration ends and s is output.
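Steps S100 to S104 can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the patented implementation: the function name `ssma`, the rectangular bounds, the `max_step` parameter, the use of one random draw per path, the floor of each temperature at its end temperature, and the two toy objectives (including the covariate taken to be the x coordinate) are all hypothetical.

```python
import math
import random

def ssma(objectives, initial, bounds, t_start, t_end, alphas, max_iter, rng, max_step=0.2):
    """Annealing sketch with one cooling path per objective (all objectives minimised)."""
    xmin, ymin, xmax, ymax = bounds
    s = list(initial)
    f = [obj(s) for obj in objectives]          # S101: f_1(s), ..., f_m(s)
    temps = list(t_start)                       # S101: each T_i starts at its initial temperature
    for _ in range(max_iter):
        # S104: stop once every path has reached its end temperature.
        if all(t <= te for t, te in zip(temps, t_end)):
            break
        # S102: move one randomly chosen point by a random angle and distance.
        k = rng.randrange(len(s))
        angle = rng.uniform(0.0, 2.0 * math.pi)
        dist = rng.uniform(0.0, max_step)
        x = min(xmax, max(xmin, s[k][0] + dist * math.cos(angle)))
        y = min(ymax, max(ymin, s[k][1] + dist * math.sin(angle)))
        s2 = s[:k] + [(x, y)] + s[k + 1:]
        f2 = [obj(s2) for obj in objectives]
        # S103: every path votes with the Metropolis rule at its own temperature.
        votes = []
        for f_old, f_new, t in zip(f, f2, temps):
            delta = f_new - f_old
            p = 1.0 if delta <= 0 else math.exp(-delta / t)
            votes.append(rng.random() < p)      # assumption: one random draw per path
        if all(votes):                          # a single rejecting path vetoes s2
            s, f = s2, f2
        # Cooling: only paths that voted to accept are cooled (floored at the end temperature).
        temps = [max(te, t * a) if vote else t
                 for t, te, a, vote in zip(temps, t_end, alphas, votes)]
    return s, f

def grid_objective(sample, rows=2, cols=2):
    """Toy geospatial objective: deviation from an even spread over a grid on the unit square."""
    counts = [0] * (rows * cols)
    for x, y in sample:
        counts[min(rows - 1, int(y * rows)) * cols + min(cols - 1, int(x * cols))] += 1
    expected = len(sample) / (rows * cols)
    return sum(abs(c - expected) for c in counts)

def strata_objective(sample):
    """Toy feature objective: one point per equal-width interval of a hypothetical covariate z = x."""
    n = len(sample)
    counts = [0] * n
    for x, _ in sample:
        counts[min(n - 1, int(x * n))] += 1
    return sum(abs(c - 1) for c in counts)

rng = random.Random(42)
start = [(rng.random(), rng.random()) for _ in range(8)]
best, values = ssma([grid_objective, strata_objective], start, (0.0, 0.0, 1.0, 1.0),
                    t_start=[1.0, 1.0], t_end=[0.01, 0.01], alphas=[0.95, 0.90],
                    max_iter=5000, rng=rng)
```

In a real application the two toy objectives would be replaced by the feature space and geospatial objective functions built in S1, and candidate positions would be restricted to locations for which auxiliary-variable data exist.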
In the simulated space multi-path annealing multi-objective sampling optimization method, each annealing path decides independently whether to accept a new solution. The probability of accepting a new solution depends on how much the new solution improves the objective and on the current temperature of the path. The more the new solution improves the objective, the greater the probability that it is accepted; when the new solution does not improve the objective, the higher the temperature of the path, the greater the probability of acceptance, and as the temperature falls the probability of accepting a non-improving solution keeps decreasing. Because whether the new solution is finally accepted is decided jointly by all paths under a one-vote veto, and each path seeks the greatest improvement of its own objective, accepted solutions necessarily move toward an overall improvement of all objectives during optimization.
The trade-off between different objectives is achieved through the different temperatures of the paths. The lower the temperature of a path, the lower its probability of accepting a non-improving solution, i.e. the less willing it is to sacrifice improvement of its own objective for the sake of the other objectives. According to the cooling rule, when a new solution is accepted all paths are cooled together; when the new solution is rejected, only the paths that voted to accept are cooled. As the iteration proceeds, the temperatures of the paths therefore diverge. When a path votes to reject a new non-improving solution, this indicates that its objective has already been improved to a certain degree, so its temperature is not lowered for the time being; its probability of accepting non-improving solutions does not fall, which raises the chance that it will sacrifice itself to accommodate improvement of the other objectives. Conversely, when a path votes to accept a new non-improving solution, this indicates that it is still actively searching for the optimum and has yet to be optimized, so its temperature is lowered and it focuses more on improving its own objective. Through this cooling mechanism, the simulated space multi-path annealing multi-objective sampling optimization method automatically balances the different objectives. When some of the objectives are to be emphasized, this can be achieved by setting different cooling rates: an important objective is given a faster cooling rate, i.e. a smaller $\alpha_i$ value. During iterative optimization, because its temperature falls faster, its probability of accepting a non-improving solution is lower and it obtains more improvement than the other objectives.
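The effect of the cooling rate can be made concrete with a small calculation. Assuming multiplicative cooling, in which a path's temperature is multiplied by its rate each time it is cooled (an assumption consistent with a smaller value meaning faster cooling), the number of cooling steps needed to go from an initial to an end temperature is the ceiling of ln(T_end / T_start) / ln(alpha). The function name and the temperatures below are hypothetical.

```python
import math

def cooling_steps(t_start, t_end, alpha):
    """Number of multiplicative cooling steps needed to bring t_start down to t_end."""
    return math.ceil(math.log(t_end / t_start) / math.log(alpha))

fast = cooling_steps(1.0, 0.01, 0.90)   # emphasised objective, smaller alpha
slow = cooling_steps(1.0, 0.01, 0.99)   # less emphasised objective, larger alpha
```

With these values the path with the smaller rate reaches its end temperature in 44 cooling steps, against 459 for the other, so it stops tolerating non-improving moves roughly ten times sooner.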
Taking Hailun City as the area to be sampled as an example, the technical effects of the sampling method of the present application are explained as follows:
1) Determining the outline of the area to be sampled, namely Hailun City:
Hailun City is located in the middle of Heilongjiang Province, in the north of Suihua City, at the northeastern end of the Songnen Plain and at the foot of the Xiaoxing'an (Lesser Khingan) Mountains, between 48°58′ and 47°52′ north latitude and between 126°14′ and 127°45′ east longitude. The terrain is high in the northeast and low in the southwest, with an average elevation of 239 m. The total area is 4 667 km², of which 2 940 km² is cultivated land, accounting for 63% of the total land area. Hailun has a cold temperate continental monsoon climate, with annual precipitation of 500-600 mm and an annual average temperature of 1-2 °C. The soil types are mainly black soil and meadow soil, with small areas of dark brown soil, swamp soil, white slurry soil and paddy soil.
2) Collecting data:
in 2010, a systematic spatial sampling layout was adopted; the geographic coordinates of each sampling point were located by GPS, the sampling points were numbered, and the sampling time, sampling location and land use type were recorded. At each sampling point, plough-layer soil (0-20 cm) was collected, with one sampling point per 1 km² and one mixed sample per 4 km² sent for laboratory analysis. The sample at each sampling point was a uniform mixture of 5 soil samples taken within 50 m of the sampling point, and the measurement followed the soil organic matter determination method (GB 9834-88).
Various auxiliary data covering Hailun City were collected, including DEM data, gradient data, landform data, soil type data, vegetation type data, land use data, population data, GDP data, TM remote sensing NDVI data, annual average temperature data, annual precipitation data and topographic wetness index data, all at a scale of 1 km, as shown in FIGS. 6-13.
3) Results and analysis:
(1) interpolation results of organic matter content of surface soil:
Using the above sampling scheme, samples were drawn from the 1170 sample points with sample sizes of 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400 and 500 in turn; the organic matter content of the remaining sample points was predicted with a random forest algorithm, and the accuracy of the result was evaluated from the error between the true and predicted values.
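The evaluation procedure can be sketched as a hold-out test: train on the selected sampling points, predict the remaining points, and summarize the errors. In the sketch below a nearest-neighbour predictor stands in for the random forest so that the example depends only on the standard library; the names `holdout_errors` and `nearest_neighbour_predict` are hypothetical, and the choice of MAE and RMSE as error measures is an assumption, since the text only refers to the error between true and predicted values.

```python
import math

def nearest_neighbour_predict(train, query_xy):
    """Stand-in predictor (not a random forest): value of the closest training point."""
    return min(train, key=lambda p: math.dist(p[0], query_xy))[1]

def holdout_errors(points, sample_idx, predict=nearest_neighbour_predict):
    """points is a list of ((x, y), value) pairs.

    The points indexed by sample_idx are used for training; all other points
    are predicted. Returns (MAE, RMSE) over the predicted points.
    """
    chosen = set(sample_idx)
    train = [points[i] for i in chosen]
    rest = [p for i, p in enumerate(points) if i not in chosen]
    errs = [predict(train, xy) - value for xy, value in rest]
    mae = sum(abs(e) for e in errs) / len(errs)
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    return mae, rmse
```

Repeating the call for each sample size (10, 20, ..., 500) with the corresponding optimized `sample_idx` would give one error value per sample size, which is the kind of curve the text goes on to discuss.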
Using the above method, the interpolation errors for different numbers of samples are shown in table 1 and fig. 14.
Table 1:
Figure BDA0003589367060000091
As can be seen from fig. 14, the error reaches an inflection point when the number of samples is 30. When the number of samples is 200, the error drops considerably further. When the number of samples is 400, the curve reaches an inflection point again, and further increases in sample size bring only limited improvement in error. Therefore 30, 200 and 400 are selected as the recommended sample sizes, and the corresponding sampling-point distributions are shown in figs. 15-17. That is, with the method of the present application, 200 to 400 sampling points and their distribution can be determined that achieve the investigation and monitoring effect of the original 1170 sampling points, improving efficiency by 3 to 5 times.
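As a rough check of the stated gain, and assuming that efficiency is measured as the ratio of the original number of sampling points to the number retained (an interpretation that the text does not state explicitly), the two recommended sample sizes give:

```latex
\frac{1170}{400} = 2.925, \qquad \frac{1170}{200} = 5.85
```

Under this reading the reduction factor lies between about 2.9 and 5.9, which brackets the quoted figure of 3 to 5 times.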
(2) Interpolation results for surface soil organic matter:
Based on the three sets of optimal sampling schemes obtained, the interpolation predictions of soil organic matter in Hailun City and the true values of soil organic matter are shown in fig. 18. Specifically, fig. 18a is the organic matter interpolation result for 30 sample points, fig. 18b is the result for 200 sample points, fig. 18c is the result for 400 sample points, and fig. 18d shows the true values of soil organic matter content.
Based on the obtained optimal sampling schemes, the interpolation predictions of soil organic matter in Hailun City at a 1 km scale are shown in figs. 19 to 21.
(3) Other sampling results:
When the number of samples, i.e., the number of sampling points, is 10, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 22 below.
When the number of samples, i.e., the number of sampling points, is 20, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 23 below.
When the number of samples, i.e., the number of sampling points, is 30, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 24 below.
When the number of samples, i.e., the number of sampling points, is 40, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 25 below.
When the number of samples, i.e., the number of sampling points, is 50, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 26 below.
When the number of samples, i.e., the number of sampling points, is 60, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 27 below.
When the number of samples, i.e., the number of sampling points, is 70, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 28 below.
When the number of samples, i.e., the number of sampling points, is 80, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 29 below.
When the number of samples, i.e., the number of sampling points, is 90, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 30 below.
When the number of samples, i.e., the number of sampling points, is 100, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 31 below.
When the number of samples, i.e., the number of sampling points, is 200, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 32 below.
When the number of samples, i.e., the number of sampling points, is 300, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 33 below.
When the number of samples, i.e., the number of sampling points, is 400, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 34 below.
When the number of samples, i.e., the number of sampling points, is 500, the distribution of sampling results obtained using the above-described sampling method is shown in fig. 35 below.
Spatial sampling and statistical inference based on spatial statistics are scientific methods with a rigorous mathematical foundation and have been widely applied to investigation and monitoring in fields such as socio-economics, resources and environment, land use, and public health. Spatial sampling methods fall into three classes: random sampling, convenience sampling, and purposive sampling:
(1) random sampling assigns each candidate unit in the sampling space a fixed sampling probability and selects samples according to that probability; it includes simple random sampling and various derived versions with added constraints, such as stratified random sampling, systematic sampling and cluster sampling;
(2) convenience sampling gives priority to sampling cost and places samples where they are easy to obtain, for example along roads;
(3) purposive sampling is carried out under the guidance of a specific, explicit objective, and the objective function method is the most prominent approach. According to the theory of spatial sampling and statistical inference, the objective function for spatial sampling optimization must be designed specifically for the statistical inference target and the properties of the population. A soil organic matter sampling design has to consider not only the location of the study area but also geographic environment information such as soil type and annual mean rainfall. Local spatial prediction is therefore the statistical inference target of the soil organic matter sampling design. Purposive sampling is more efficient for local spatial prediction, so it is the suitable choice for the soil organic matter sampling design. Commonly used objective functions for spatial sampling optimization currently address geographic space distribution, feature space distribution, and prediction error estimation. In the absence of prior knowledge and auxiliary data, a spatially uniform distribution of samples is a common optimization objective, for example minimization of the mean of the shortest distances (MMSD) and the weighted mean of the shortest distances (WMSD). When several auxiliary variables are related to the target variable being observed, the feature space distribution may be used as the optimization objective, for example Latin hypercube sampling (LHS) and equal-range stratification (ER). There are also spatial sampling methods that optimize the geographic space and feature space distributions simultaneously, such as spatial conditioned Latin hypercube sampling (scLHS).
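To make the geographic-space criterion concrete, the following minimal sketch computes the mean of the shortest distances that MMSD seeks to minimize. The function name `mmsd` and the use of a set of evaluation locations covering the study area are illustrative assumptions; the text only names the criterion.

```python
import math

def mmsd(sample_points, evaluation_points):
    """Mean, over the evaluation locations, of the distance to the nearest sample point.

    A smaller value means the sample points cover the study area more evenly.
    """
    total = 0.0
    for location in evaluation_points:
        total += min(math.dist(location, s) for s in sample_points)
    return total / len(evaluation_points)
```

For three evaluation locations on a line, `(0, 0)`, `(2, 0)` and `(4, 0)`, a single sample at `(0, 0)` gives a larger value than two samples at `(0, 0)` and `(4, 0)`, which is the behaviour an optimizer would exploit when spreading points out.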
When the spatial variation function (variogram) of the target variable is known in advance of the sampling design, the prediction error estimated by the statistical inference method is generally used as the optimization objective function, for example minimization of the average kriging variance (MAKV). In the spatial sampling design for soil organic matter in Hailun City, the choice of statistical inference method should take full account of the properties of the population. The distribution of soil organic matter shows marked spatial stratified heterogeneity at large scales and, at the same time, high spatial autocorrelation at small scales. Therefore, an unbiased estimation framework for stratified heterogeneous surfaces is adopted for the spatial sampling layout design of soil organic matter in Hailun City.
To meet the need to optimize the layout of soil organic matter sampling points in Hailun City, the present application, based on the "trinity" theory of spatial sampling and statistical inference, explores a theoretical method for arranging the spatial sampling points of soil organic matter in Hailun City and attempts to answer two questions: how many sampling points are needed, and where they should be placed. The results are used to guide the actual layout of soil organic matter sampling points in Hailun City and to serve the investigation and scientific management of the country's black soil.
A feature space objective function is constructed from the representativeness of the sampling points in the feature space formed by the auxiliary variables, a geospatial objective function is constructed using spatial grid constraints, the distributions in both geographic space and feature space are taken into account, and the objective functions are optimized to obtain the sampling design. Specifically:
According to the spatial distribution of the soil organic matter data and its relationship with the other auxiliary data, a spatial Latin hypercube method is adopted so that the sample points represent the variation of the auxiliary variables well and are also representative in geographic space. The method constructs the feature space objective function from the representativeness of the sample points with respect to the distribution of the auxiliary variables, constructs the geospatial objective function using spatial grid constraints, takes both geographic space and feature space into account, and uses a multi-objective optimization algorithm to generate the sampling design.
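A feature-space objective of the Latin hypercube kind can be sketched as follows. This is an assumption-laden illustration rather than the objective function of the present application: it uses the commonly described criterion in which each auxiliary variable is divided into as many equal-probability strata as there are sampling points, and the objective sums, over variables and strata, the absolute deviation of the stratum count from 1. The names `quantile_edges`, `stratum_index` and `feature_objective` are hypothetical.

```python
def quantile_edges(values, n_strata):
    """Edges of n_strata roughly equal-probability strata taken from the population values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[(k * last) // n_strata] for k in range(n_strata + 1)]

def stratum_index(value, edges):
    """Index of the stratum containing value; the top edge belongs to the last stratum."""
    n_strata = len(edges) - 1
    for k in range(n_strata):
        if value < edges[k + 1]:
            return k
    return n_strata - 1

def feature_objective(population, sample_idx):
    """population holds one row of auxiliary-variable values per candidate cell.

    Returns the sum over variables and strata of |count - 1|;
    a value of 0 means every stratum of every variable holds exactly one sample.
    """
    n = len(sample_idx)
    total = 0
    for j in range(len(population[0])):
        column = [row[j] for row in population]
        edges = quantile_edges(column, n)
        counts = [0] * n
        for i in sample_idx:
            counts[stratum_index(column[i], edges)] += 1
        total += sum(abs(c - 1) for c in counts)
    return total
```

A multi-objective optimizer would evaluate such a function alongside the geospatial objective for each candidate set of sampling points.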
In the above embodiments, although steps S1, S2, etc. are numbered, only specific embodiments are given herein, and those skilled in the art may adjust the execution sequence of S1, S2, etc. according to the actual situation, which is also within the scope of the present invention, and it is understood that some embodiments may include some or all of the above embodiments.
As shown in fig. 36, a soil sampling system 200 according to an embodiment of the present invention includes a construction module 210 and a determination module 220;
the construction module 210 is configured to: constructing a characteristic space objective function and a geographic space objective function corresponding to soil of a region to be sampled;
the determining module 220 is configured to: and determining the distribution of the sampling points by utilizing a multi-target optimization algorithm based on the characteristic space objective function and the geographic space objective function.
Optionally, in the foregoing technical solution, the construction module 210 includes a first construction module, where the first construction module is configured to:
carry out multidimensional feature space layering on the soil of the region to be sampled by adopting a conditional Latin hypercube method to obtain a multidimensional feature space layering result, and construct the feature space objective function according to the multidimensional feature space layering result.
Optionally, in the foregoing technical solution, the construction module 210 further includes a second construction module, where the second construction module is configured to:
carry out grid division on the geometric model corresponding to the region to be sampled, ensure that at least one sampling point falls into each grid, and construct the geospatial objective function according to the distribution of the sampling points in the grids.
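The grid-based geospatial objective can be illustrated with a short sketch. The formula itself is rendered as an image in this document and is not reproduced here, so the form used below is an assumption: the sum, over all grid cells, of the absolute difference between the number of sampling points in the cell and the proportional count n / (Row × Col). The function name `geo_objective` and the `bounds` argument are hypothetical.

```python
def geo_objective(points, bounds, rows, cols):
    """Assumed form: sum over grid cells of |points in cell - n / (rows * cols)|.

    points is a list of (x, y) pairs; bounds is (xmin, ymin, xmax, ymax)
    of the regular grid covering the region to be sampled.
    """
    xmin, ymin, xmax, ymax = bounds
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Clamp so that points on the upper or right boundary fall in the last cell.
        c = min(int((x - xmin) / (xmax - xmin) * cols), cols - 1)
        r = min(int((y - ymin) / (ymax - ymin) * rows), rows - 1)
        counts[r][c] += 1
    expected = len(points) / (rows * cols)
    return sum(abs(k - expected) for row in counts for k in row)
```

Under this assumed form, four points spread one per cell over a 2 × 2 grid score 0, while four points clustered in a single cell score 6, so lower values correspond to a more even geographic spread.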
Optionally, in the foregoing technical solution, the determining module 220 is specifically configured to:
determine the layering position of each sampling point in the multi-dimensional feature space layering result by utilizing a multi-objective optimization algorithm, determine the spatial position of each sampling point in the grid-partitioned geometric model, and determine the sampling point distribution according to the layering position and the spatial position of each sampling point.
Optionally, in the above technical solution, the multi-objective optimization algorithm is a simulated spatial multi-path annealing method.
The above steps for implementing the corresponding functions by the parameters and the unit modules in the soil sampling system 200 according to the present invention may refer to the parameters and the steps in the above embodiments of a soil sampling method, which are not described herein.
The storage medium of the embodiment of the invention stores instructions, when the instructions are read by a computer, the computer is caused to execute a soil sampling method according to any one of the above.
An electronic device according to an embodiment of the present invention includes a processor and the above-described storage medium, where the processor executes instructions in the storage medium. Wherein, the electronic equipment can be selected from a computer, a mobile phone and the like.
Those skilled in the art will appreciate that the present invention may be implemented as a system, method, or computer program product.
Accordingly, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software, referred to herein generally as a "circuit", "module" or "system". Furthermore, in some embodiments, the invention may also be embodied in the form of a computer program product in one or more computer-readable media, which contain computer-readable program code.
Any combination of one or more computer readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer-readable storage medium include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the invention, and that changes, modifications, substitutions and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.

Claims (5)

1. A soil sampling method, comprising:
constructing a characteristic space objective function and a geographic space objective function corresponding to soil of a region to be sampled;
determining sampling point distribution by utilizing a multi-objective optimization algorithm based on the characteristic space objective function and the geographic space objective function;
the process of constructing the feature space objective function comprises the following steps:
carrying out multidimensional feature space layering on the soil of the region to be sampled by adopting a conditional Latin hypercube method to obtain a multidimensional feature space layering result, and constructing the feature space objective function according to the multidimensional feature space layering result;
the process of constructing the geospatial objective function includes:
performing grid division on the geometric model corresponding to the region to be sampled, ensuring that at least one sampling point falls into each grid, and constructing a geospatial objective function according to the distribution of the sampling points in the grids;
when the geometric model corresponding to the region to be sampled is divided into grids, the geometric model domain corresponding to the region to be sampled is uniformly divided into regular grids, the size of each grid in the regular grids is equal, the number of sampling points calculated in proportion falls into each grid, and the calculated geographic space objective function is as follows:
Figure FDA0004164124100000011
where Row is the number of rows of the grid, Col is the number of columns of the grid, η is the number of sampling points falling in the grid cell defined by the lower-left corner (x_{i′}, y_j) and the upper-right corner (x_{i′+1}, y_{j+1}), n is the number of sampling points, and Row×Col is the number of grid cells;
the determining the sampling point distribution by using the multi-objective optimization algorithm comprises the following steps:
and determining the layering position of each sampling point in the multi-dimensional feature space layering result by utilizing a multi-objective optimization algorithm, determining the spatial position of each sampling point in the grid-partitioned geometric model, and determining the sampling point distribution according to the layering position and the spatial position of each sampling point.
2. The soil sampling method of claim 1, wherein the multi-objective optimization algorithm is a simulated spatial multi-path annealing method.
3. A soil sampling system, comprising a construction module and a determination module;
the construction module is used for: constructing a characteristic space objective function and a geographic space objective function corresponding to soil of a region to be sampled;
the determining module is used for: determining sampling point distribution by utilizing a multi-objective optimization algorithm based on the characteristic space objective function and the geographic space objective function;
the construction module includes a first construction module configured to:
carrying out multidimensional feature space layering on the soil of the region to be sampled by adopting a conditional Latin hypercube method to obtain a multidimensional feature space layering result, and constructing the feature space objective function according to the multidimensional feature space layering result;
the construction module further includes a second construction module configured to:
performing grid division on the geometric model corresponding to the region to be sampled, ensuring that at least one sampling point falls into each grid, and constructing a geospatial objective function according to the distribution of the sampling points in the grids;
when the geometric model corresponding to the region to be sampled is divided into grids, the geometric model domain corresponding to the region to be sampled is uniformly divided into regular grids, the size of each grid in the regular grids is equal, the number of sampling points calculated in proportion falls into each grid, and the calculated geographic space objective function is as follows:
Figure FDA0004164124100000021
where Row is the number of rows of the grid, Col is the number of columns of the grid, η is the number of sampling points falling in the grid cell defined by the lower-left corner (x_{i′}, y_j) and the upper-right corner (x_{i′+1}, y_{j+1}), n is the number of sampling points, and Row×Col is the number of grid cells;
the determining module is specifically configured to:
and determining the layering position of each sampling point in the multi-dimensional feature space layering result by utilizing a multi-objective optimization algorithm, determining the spatial position of each sampling point in the grid-partitioned geometric model, and determining the sampling point distribution according to the layering position and the spatial position of each sampling point.
4. A storage medium having instructions stored therein which, when read by a computer, cause the computer to perform a soil sampling method as claimed in claim 1 or 2.
5. An electronic device comprising a processor and the storage medium of claim 4, the processor executing instructions in the storage medium.
CN202210373152.0A 2022-04-11 2022-04-11 Soil sampling method, system, storage medium and electronic equipment Active CN114880373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210373152.0A CN114880373B (en) 2022-04-11 2022-04-11 Soil sampling method, system, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114880373A CN114880373A (en) 2022-08-09
CN114880373B true CN114880373B (en) 2023-05-30

Family

ID=82670367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210373152.0A Active CN114880373B (en) 2022-04-11 2022-04-11 Soil sampling method, system, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114880373B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117540409B (en) * 2024-01-10 2024-04-19 中化现代农业有限公司 Soil sampling sample point encryption method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227965A (en) * 2016-07-29 2016-12-14 武汉大学 A kind of soil organic matter Spatial sampling network design method taking spatial and temporal distributions non-stationary characteristic into account
CN107607692A (en) * 2017-11-07 2018-01-19 中国水利水电科学研究院 Monitoring soil moisture Optimizing method based on soil maximum moisture storage capacity

Also Published As

Publication number Publication date
CN114880373A (en) 2022-08-09

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant