CN115374709A - Land analysis method and system based on deep forest model and FLUS model - Google Patents


Info

Publication number
CN115374709A
Authority
CN
China
Prior art keywords
land
model
data
driving factor
driving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211045930.XA
Other languages
Chinese (zh)
Inventor
刘小平
庄浩铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202211045930.XA priority Critical patent/CN115374709A/en
Publication of CN115374709A publication Critical patent/CN115374709A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00Details relating to CAD techniques
    • G06F2111/08Probabilistic or stochastic CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a land analysis method and system based on a deep forest model and the FLUS model. The method comprises: acquiring driving factor data that drive land use change and multi-period land use distribution data; sampling the multi-period land use distribution data according to a land expansion analysis strategy and collecting the corresponding driving factor data to obtain a training sample set; inputting the training sample set into a deep forest model for training to obtain development suitability data for each land change; inputting the development suitability data, the neighborhood effect, the conversion cost and the inertia coefficient into the FLUS model for simulation to obtain simulated land use distribution data; and processing the driving factor data set with the deep forest model to obtain contribution weight data of each driving factor for each land change. The method effectively improves the land use change simulation accuracy of the FLUS model, is little affected by randomness, is stable, and gives the FLUS model the ability to accurately analyze the driving mechanisms of land use change.

Description

Land analysis method and system based on deep forest model and FLUS model
Technical Field
The invention relates to the technical field of geographic information science, in particular to a land analysis method and system based on a deep forest model and a FLUS model.
Background
Land is an important place where social systems and natural systems interact. Human activities profoundly affect land use patterns, and land changes in turn affect the natural environment and human society. Research on historical and future land change is an important basis for assessing its impacts and formulating effective mitigation measures, so as to achieve the sustainable development of human society. The cellular automata (CA) model can be used to study the interaction between land change and natural environmental and socioeconomic factors, to simulate the complex spatio-temporal dynamics of land, and to study the driving mechanisms of land change, and it is now widely applied in land change simulation research. The FLUS model introduces an adaptive inertia mechanism on the basis of CA, is one of the most widely applied CA models today, and has been successfully applied to land use change simulation at regional and global scales.
Modeling the complex nonlinear relationships between the various spatial and non-spatial driving factors and land change, and thereby mining accurate land change rules, is an important component of a CA model. Early research mined CA land change rules with traditional data mining methods, including machine learning, bionic and ensemble learning methods. However, these methods have weak expressive power, which results in low simulation accuracy.
Recent research attempts to apply deep neural network methods to mine CA land change rules. Although a deep neural network has sufficient expressive power, the model is complex, computationally inefficient, sensitive to hyperparameters and hard to use, and it is difficult to apply to the analysis of the driving mechanisms of land change.
Therefore, the prior art needs further improvement.
Disclosure of Invention
The purpose of the invention is to provide a land analysis method and system based on a deep forest model and the FLUS model that overcomes the poor expressive power of traditional machine learning models and the excessive complexity of deep neural networks, improves the land use change simulation accuracy of the FLUS model, makes it possible to analyze the importance of the driving factors of land use change, and gives the FLUS model the ability to accurately analyze land use driving mechanisms.
In order to achieve the purpose, the invention provides a land analysis method and a land analysis system based on a deep forest model and a FLUS model.
In a first aspect, the invention provides a land analysis method based on a deep forest model and a FLUS model, wherein the method comprises the following steps: acquiring driving factor data and land use distribution data for driving land use change;
sampling the multi-period land utilization change data according to a land expansion analysis strategy, and collecting the driving factor data to obtain a training sample set;
inputting the training sample set into a deep forest model for training to obtain the land change development suitability data;
inputting the land change development suitability data, the neighborhood effect, the conversion cost and the inertia coefficient into a FLUS model for simulation to obtain simulated land use distribution data;
and calculating the driving factor data set through the deep forest model to obtain contribution degree weight data of each driving factor to each land change.
In a further embodiment, the sampling of the multi-period land use distribution data according to a land expansion analysis strategy and the collecting of the driving factor data to obtain a training sample set further include:
extracting an expansion area from the multi-period historical land utilization distribution data by adopting a superposition analysis method;
carrying out layered random sampling in the expansion area to obtain a sample position;
and constructing the training sample set according to the driving factor data and the land utilization data corresponding to the sample positions.
In a further embodiment, the inputting the training sample set into a deep forest model for training to obtain the suitability data of each land change development further includes:
the training sample set is represented as follows:
$\{x_1, x_2, \dots, x_i, \dots, x_N, y_j\}$, in which $x_i$ represents the value of the i-th driving factor, N is the number of driving factors, $y_j$ is the land use type corresponding to the land change, and M is the number of land use types;
modeling a nonlinear mapping relation between a driving factor and land change through a deep forest model, and taking an average value of classification probability vectors output by a basic forest in the last layer of the deep forest model as a land utilization change development suitability value;
the classification probability vector is expressed as follows:
$\{P_1, P_2, \dots, P_j, \dots, P_M\}$
the average value of the classification probability vectors is calculated by the following formula:
$$P_j = \frac{1}{T}\sum_{t=1}^{T} I\big(H_t(X) = y_j\big)$$
wherein $P_j$ is the classification probability of land use type $y_j$, $H_t(X)$ is the classification result of the t-th decision tree, X is the feature vector formed by the driving factors $x_i$ of a training sample, T is the number of decision trees in the basic forest, and I is the indicator function.
In a further embodiment, the calculating of the driving factor data set by the deep forest model to obtain the contribution degree weight data of each driving factor to each land change further includes:
the deep forest model comprises a multilayer cascade structure, each layer is provided with a plurality of basic forest models, and each basic forest is composed of decision trees;
selecting a division rule, and calculating the Gini coefficient of the node where the driving factor data set is located according to the division rule; wherein the division rule is defined as θ = (i, t), i denotes a driving factor variable and t denotes a threshold, the node containing the driving factor data set to be divided is defined as Node, and the Gini coefficient is defined as G, as follows:
Figure BDA0003820169510000041
Figure BDA0003820169510000042
dividing the driving factor data set into two new nodes according to the following division rule:
$Node_{left} = \{(X, y) \mid x_i \le t\}$
$Node_{right} = Node \setminus Node_{left}$
wherein $Node_{left}$ and $Node_{right}$ are the two new nodes after division, and the numbers of samples they contain are $m_{left}$ and $m_{right}$ respectively.
The sum of the Gini coefficients of the two new nodes is calculated by the following formula:
Figure BDA0003820169510000043
the impurity reduction value of the division rule θ is calculated using the following formula:
$D(Node \mid \theta) = G(Node) - G(Node \mid \theta)$
nodes are divided using the rule with the maximum impurity reduction value, and the decision tree is constructed accordingly, wherein the division rule with the maximum impurity reduction value is denoted $\theta^*$, with $\theta^* = \arg\max_{\theta} D(Node \mid \theta)$.
In a further embodiment, the maximum impurity reduction value is used as the contribution weight of a driving factor, and the contribution degree of the driving factor to each land change is evaluated.
In a further embodiment, said using the maximum impurity reduction value as the contribution weight of a driving factor further comprises:
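The Gini formulas referred to above survive in this text only as image references. For orientation, the standard CART definitions are written out below; that the patent's formulas take exactly this form is an assumption, and the class index $c$, the class proportion $p_c$ and the node size $m$ are notation introduced here.

```latex
% Class proportion in a node holding m samples (assumed notation)
p_c = \frac{1}{m} \sum_{(X, y) \in Node} I(y = c), \qquad c = 1, \dots, M

% Gini coefficient of the node
G(Node) = 1 - \sum_{c=1}^{M} p_c^{2}

% Size-weighted Gini coefficient of the two child nodes under rule \theta = (i, t)
G(Node \mid \theta) = \frac{m_{left}}{m}\, G(Node_{left}) + \frac{m_{right}}{m}\, G(Node_{right}),
\qquad m = m_{left} + m_{right}

% Impurity reduction and best rule, as stated in the text
D(Node \mid \theta) = G(Node) - G(Node \mid \theta), \qquad
\theta^{*} = \arg\max_{\theta} D(Node \mid \theta)
```

Under these definitions a pure node has $G = 0$, so a rule that separates the land change classes perfectly attains the largest possible reduction $D = G(Node)$.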
training each basic forest model of each layer of forest model in the deep forest model to obtain the contribution weight of each driving factor in each basic forest;
calculating the average value of the contribution weight of each driving factor in each basic forest and the average value of the contribution weight of each driving factor in each layer of forest model;
averaging the average values of the contribution weights of the driving factors in the forest models of all layers to obtain integrated contribution degrees, and forming contribution degree weight data of the driving factors on land changes by the integrated contribution degrees.
In a further embodiment, the simulated land use distribution data is evaluated, and the accuracy of the simulated land use distribution data of the DF-FLUS model is obtained by comparing and evaluating the real land use distribution data with the simulated land use distribution data of the DF-FLUS model.
In a second aspect, the invention provides a land analysis system based on a deep forest model and a FLUS model, wherein the system comprises:
the data collection module is used for acquiring driving factor data for driving land use change and multi-period land use distribution data;
the data sampling module is used for sampling the multi-period land utilization distribution data according to a land expansion analysis strategy, and collecting the driving factor data to obtain a training sample set;
the model training module is used for inputting the training sample set into a deep forest model for training to obtain the change development suitability data of each land;
the data processing module is used for inputting the land change development suitability data, the neighborhood effect, the conversion cost and the inertia coefficient into the FLUS model for simulation to obtain simulated land utilization distribution data;
and the data evaluation module is used for calculating the driving factor data set through the deep forest model to obtain contribution degree weight data of each driving factor to each land change.
In a third aspect, the present invention further provides a computer device, including a processor and a memory, where the processor is connected to the memory, the memory is used for storing a computer program, and the processor is used for executing the computer program stored in the memory, so that the computer device executes the steps for implementing the method.
In a fourth aspect, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
The invention provides a land analysis method and a land analysis system based on a deep forest model and a FLUS model, and compared with the prior art, the land analysis method and the land analysis system have the beneficial effects that:
the method has the advantages that the relation between the driving factors and the land use change is modeled based on the deep forest model, a more accurate land use change rule compared with a traditional machine learning method can be obtained, meanwhile, the model is easier to use compared with an existing deep neural network, the land use change simulation precision of the FLUS model can be effectively improved, the influence of randomness is small, the stability is higher, and the ability of the FLUS model for accurately analyzing a land use driving mechanism is given.
Drawings
FIG. 1 is a schematic flow chart of a land analysis method based on a deep forest model and a FLUS model according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a land use change simulation and driving mechanism analysis method based on a deep forest model and a FLUS model according to an embodiment of the invention;
FIG. 3 is a frame diagram of a deep forest model provided by an embodiment of the invention;
FIG. 4 is a graph showing the comparison between the simulation accuracy of the DF-FLUS and the simulation accuracy of the conventional method according to the embodiment of the present invention;
FIG. 5 is a comparative plot of actual land use patterns in 2010 from a research area provided by an embodiment of the present invention and details of the land use pattern distributions simulated by the present invention and a general method;
FIG. 6 is a driving factor contribution for land use change data for a research area 2000-2010 calculated by the present invention;
FIG. 7 is a graph comparing the stochastic volatility of the results of the driving factor contribution analysis of the present invention and the conventional method;
FIG. 8 is a block diagram of a land analysis system based on a deep forest model and a FLUS model according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the accompanying drawings. The embodiments and drawings are given for illustration only and are not to be construed as limiting the invention, since many variations are possible without departing from its spirit and scope.
In one embodiment, as shown in fig. 1, the present invention provides a land analysis method based on a deep forest model and a FLUS model, wherein the method comprises:
s1, acquiring driving factor data and multi-period land utilization distribution data for driving land utilization change;
wherein, the driving factor data comprises natural environment factors and social economic factors;
the natural environment factors comprise elevation, gradient, distance to a river, annual average temperature, annual average precipitation, temperature seasonality and precipitation seasonality;
the socioeconomic factors include a distance to a provincial center, a distance to a city center, a distance to a county center, a distance to an airport, a distance to a highway, a railway, and a distance to a general road;
land utilization distribution data including cultivated land, forest land, water and construction land.
S2, sampling the multi-period land utilization distribution data according to a land expansion analysis strategy, and collecting the driving factor data to obtain a training sample set;
preferably, the sampling the multi-period land use distribution data according to a land expansion analysis strategy and collecting the driving factor data to obtain a training sample set further includes:
extracting an expansion area from the multi-period historical land utilization distribution data by adopting a superposition analysis method;
carrying out layered random sampling in the expansion area to obtain a sample position;
and constructing the training sample set according to the driving factor data and the land utilization data corresponding to the sample positions.
The land expansion analysis strategy extracts an expansion area from multi-period historical land utilization distribution data through superposition analysis, layered random sampling is carried out in the expansion area to obtain a sample position, and then a training sample is constructed based on driving factor data and land utilization data. The sampling mode only focuses on new land types, and old land types are ignored, so that the complexity of sampling and model training is effectively reduced while the land use change characteristics are captured.
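A minimal sketch of the land expansion analysis strategy described above, assuming the two land use maps are small nested lists of integer class codes. The function names `expansion_cells` and `stratified_sample`, the `rate` parameter and the toy rasters are hypothetical and are not taken from the patent.

```python
import random
from collections import defaultdict

def expansion_cells(lu_start, lu_end):
    """Overlay two land use rasters and group the changed cells by their new type."""
    strata = defaultdict(list)
    for r, (row_a, row_b) in enumerate(zip(lu_start, lu_end)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:                      # changed cell: part of the expansion of type b
                strata[b].append((r, c))
    return dict(strata)

def stratified_sample(strata, rate, seed=0):
    """Draw the same fraction of cells at random from every stratum (at least one cell)."""
    rng = random.Random(seed)
    samples = {}
    for land_type, cells in sorted(strata.items()):
        n = max(1, round(rate * len(cells)))
        samples[land_type] = rng.sample(cells, n)
    return samples

# Toy rasters (assumed codes): 1 = cultivated land, 2 = forest land, 4 = construction land
lu_2000 = [[1, 1, 2, 2],
           [1, 1, 2, 2],
           [1, 1, 1, 2]]
lu_2010 = [[4, 1, 2, 2],
           [4, 4, 1, 2],
           [1, 4, 1, 2]]

strata = expansion_cells(lu_2000, lu_2010)
samples = stratified_sample(strata, rate=0.5)
```

Each sampled position would then be paired with the driving factor values at that cell and with the new land use type as its label to form one training sample.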
S3, inputting the training sample set into a deep forest model for training to obtain the land change development suitability data;
preferably, the inputting the training sample set into a deep forest model for training to obtain the suitability data of development of each land change further includes:
the training sample set is represented as follows:
$\{x_1, x_2, \dots, x_i, \dots, x_N, y_j\}$, in which $x_i$ represents the value of the i-th driving factor, N is the number of driving factors, $y_j$ is the land use type corresponding to the land change, and M is the number of land use types;
modeling a nonlinear mapping relation between a driving factor and land change through a deep forest model, and taking an average value of classification probability vectors output by a basic forest in the last layer of the deep forest model as a land utilization change development suitability value;
the classification probability vector is expressed as follows:
$\{P_1, P_2, \dots, P_j, \dots, P_M\}$
the average value of the classification probability vectors is calculated by the following formula:
$$P_j = \frac{1}{T}\sum_{t=1}^{T} I\big(H_t(X) = y_j\big)$$
wherein $P_j$ is the classification probability of land use type $y_j$, $H_t(X)$ is the classification result of the t-th decision tree, X is the feature vector formed by the driving factors $x_i$ of a training sample, T is the number of decision trees in the basic forest, and I is the indicator function.
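The definitions above (an indicator function applied to the classification result of each of T trees) suggest that the probability of a land use type is the fraction of trees voting for it. A minimal sketch under that reading follows; the function name `class_probabilities` and the toy votes are hypothetical.

```python
from collections import Counter

def class_probabilities(tree_votes, land_types):
    """Fraction of trees voting for each land use type (vote-averaging reading of P_j)."""
    T = len(tree_votes)
    counts = Counter(tree_votes)            # Counter returns 0 for types with no votes
    return [counts[y] / T for y in land_types]

land_types = ["cultivated", "forest", "construction"]
# Classification results H_t(X) of T = 5 trees for one cell (toy values)
votes = ["forest", "construction", "construction", "cultivated", "construction"]
probs = class_probabilities(votes, land_types)
```

The resulting vector sums to 1, so it can be read directly as a development suitability value per land use type for that cell.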
The deep forest model is a deep learning method based on an ensemble learning strategy, characterized by few hyper-parameters, in-model feature transformation, layer-by-layer feature processing and adaptive model complexity. The deep forest model is used to model the nonlinear mapping between the driving factors and land changes and to calculate the development suitability values of the various land changes. The deep forest is an ensemble forest model with a cascade structure of K layers, each consisting of F basic forest models (random forests and extreme forests are used). Each basic forest in a layer receives all features output by the previous layer and outputs M high-level features; the high-level features output by the F basic forests (F × M in total) are then concatenated with the original input features (N in total) to form a new feature vector that is input to the next layer.
Using a deep forest model, which combines high expressive power with ease of use, to mine CA land change rules can significantly improve the land use change simulation accuracy of FLUS.
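The feature hand-off between cascade layers can be sketched as follows. This shows only the concatenation step, not forest training; the function name `next_layer_input`, the ordering of the concatenated parts and the toy numbers are assumptions.

```python
def next_layer_input(original_features, forest_outputs):
    """Concatenate the class vectors of all forests in a layer with the original features."""
    augmented = []
    for class_vector in forest_outputs:     # F vectors, each holding M class probabilities
        augmented.extend(class_vector)
    augmented.extend(original_features)     # the N original driving factors
    return augmented

x = [0.12, 0.80, 0.33, 0.05]                # N = 4 driving factors (toy values)
outputs = [[0.7, 0.2, 0.1],                 # forest 1, e.g. a random forest, M = 3
           [0.6, 0.3, 0.1]]                 # forest 2, e.g. an extreme forest
z = next_layer_input(x, outputs)            # F * M + N = 2 * 3 + 4 = 10 features
```

Because the original features are re-attached at every layer, each layer sees both the raw driving factors and the previous layer's class estimates.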
S4, inputting the land change development suitability data, the neighborhood effect, the conversion cost and the inertia coefficient into a FLUS model for simulation to obtain simulated land use distribution data;
preferably, the simulated land use distribution data are evaluated, and the real land use distribution data are compared with the simulated land use distribution data of the DF-FLUS model for evaluation, so that the accuracy of the simulated land use distribution data of the DF-FLUS model is obtained;
The deep forest model determines its complexity adaptively: with default parameters it settles on a model size suited to the scale of the problem, avoiding the loss of fitting accuracy that an overly large or overly small network causes in a CNN model. The comparative analysis shows that the simulated land use pattern is closer to the actual pattern than that of the common methods, which highlights the high accuracy of the DF-FLUS model and shows that using the deep forest model effectively improves the land use change simulation accuracy of FLUS.
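Comparing real and simulated land use maps can be illustrated with overall accuracy, the share of cells whose simulated type matches the actual type. The sketch below is illustrative only; the function name `overall_accuracy` and the toy rasters are hypothetical, and the text does not state which accuracy measures beyond overall accuracy were used.

```python
def overall_accuracy(actual, simulated):
    """Share of raster cells whose simulated land use type equals the actual type."""
    pairs = [(a, s)
             for row_a, row_s in zip(actual, simulated)
             for a, s in zip(row_a, row_s)]
    return sum(a == s for a, s in pairs) / len(pairs)

# Toy rasters (assumed codes): 1 = cultivated, 2 = forest, 4 = construction land
actual_2010 = [[1, 1, 2],
               [4, 4, 2]]
simulated_2010 = [[1, 1, 2],
                  [4, 1, 2]]
oa = overall_accuracy(actual_2010, simulated_2010)   # 5 of 6 cells agree
```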
S5, calculating the driving factor data set through the deep forest model to obtain contribution degree weight data of each driving factor to each land change;
preferably, the calculating of the driving factor data set through the deep forest model to obtain the contribution degree weight data of each driving factor to each land change further includes:
the deep forest model comprises a multilayer cascade structure, each layer is provided with a plurality of basic forest models, and each basic forest is composed of decision trees;
selecting a division rule, and calculating the Gini coefficient of the node where the driving factor data set is located according to the division rule; wherein the division rule is defined as θ = (i, t), i denotes a driving factor variable and t denotes a threshold, the node containing the driving factor data set to be divided is defined as Node, and the Gini coefficient is defined as G, specifically:
Figure BDA0003820169510000101
Figure BDA0003820169510000102
dividing the driving factor data set into two new nodes according to the following division rule:
$Node_{left} = \{(X, y) \mid x_i \le t\}$
$Node_{right} = Node \setminus Node_{left}$
wherein $Node_{left}$ and $Node_{right}$ are the two new nodes after division, and the numbers of samples they contain are $m_{left}$ and $m_{right}$ respectively.
The sum of the Gini coefficients of the two new nodes is calculated by the following formula:
Figure BDA0003820169510000103
calculating the impurity reduction value of the division rule θ by using the following formula:
$D(Node \mid \theta) = G(Node) - G(Node \mid \theta)$
dividing nodes using the rule with the maximum impurity reduction value, and constructing the decision tree accordingly, wherein the division rule with the maximum impurity reduction value is denoted $\theta^*$, with $\theta^* = \arg\max_{\theta} D(Node \mid \theta)$;
taking the maximum impurity reduction value as the contribution weight of a driving factor, and evaluating the contribution degree of the driving factor to each land change;
said taking the maximum impurity reduction value as the contribution weight of a driving factor further comprises:
training each basic forest model of each layer of forest model in the deep forest model to obtain the contribution weight of each driving factor in each basic forest;
calculating the average value of the contribution weight of each driving factor in each basic forest and the average value of the contribution weight of each driving factor in each layer of forest model;
averaging the average values of the contribution weights of the driving factors in the forest models of all the layers to obtain integrated contribution degrees, and forming contribution degree weight data of the driving factors on land changes by the integrated contribution degrees;
In one embodiment, a variable importance analysis method is used: important variables split the samples into more clearly defined classes and therefore yield lower impurity, so the mean decrease in impurity (MDI) of the forest model can be used to assess variable importance. The MDI of a variable is the weighted average of its impurity reduction values over all nodes of all decision trees in the forest; the MDI values of all variables are normalized, so MDI can also be used to evaluate the contribution weight of each factor.
Because each layer of the deep forest model is formed by the forest model, the contribution weight analysis of the driving factors can be carried out by using the variable importance of internal basic forest calculation.
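The split search that produces the impurity reductions behind MDI can be sketched as follows. The sketch assumes the standard Gini impurity and a size-weighted sum over the two child nodes; the function names and the toy samples are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure node)."""
    m = len(labels)
    if m == 0:
        return 0.0
    return 1.0 - sum((n / m) ** 2 for n in Counter(labels).values())

def impurity_decrease(rows, labels, i, t):
    """D(Node | theta) for the rule theta = (i, t): samples with x_i <= t go left."""
    left = [y for x, y in zip(rows, labels) if x[i] <= t]
    right = [y for x, y in zip(rows, labels) if x[i] > t]
    m = len(labels)
    weighted = len(left) / m * gini(left) + len(right) / m * gini(right)
    return gini(labels) - weighted

def best_split(rows, labels):
    """Exhaustive search for the rule with the largest impurity decrease."""
    best = (None, None, -1.0)
    for i in range(len(rows[0])):
        for t in sorted({x[i] for x in rows})[:-1]:   # largest value would leave right empty
            d = impurity_decrease(rows, labels, i, t)
            if d > best[2]:
                best = (i, t, d)
    return best

# Four toy samples with two normalized driving factors each
rows = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2]]
labels = ["forest", "forest", "construction", "construction"]
factor, threshold, decrease = best_split(rows, labels)
```

Here factor 0 separates the two change types perfectly, so it receives the whole impurity decrease of the node, which is the behaviour that lets accumulated decreases rank driving factors by importance.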
Each basic forest n in layer 1 is trained k times, giving k internal models {Fn.1, Fn.2, …, Fn.k}; each model computes its own contribution degrees during training, so that k sets of variable contribution degrees {Cn.1, Cn.2, …, Cn.k} are obtained. The k contribution degrees output by one forest are averaged to obtain Cn; the operation is repeated for all random forests and extreme forests in layer 1 to obtain {C1, C2, …, Cn}, whose average is taken as the final integrated contribution degree. The variable importance is computed in step with the training of the deep forest model and incurs no additional time cost;
all MDI information calculated in the process of constructing the deep forest model is fully utilized, the influence of randomness on importance evaluation can be effectively reduced, and a driving mechanism of land utilization change is more accurately analyzed.
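The two-stage averaging described above (first over the k trained copies of each forest, then over the forests of the layer) can be sketched as follows. The function names and the toy MDI values are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def integrated_contribution(layer):
    """layer[n][k] is the normalized MDI vector from the k-th training run of forest n."""
    per_forest = [mean_vector(runs) for runs in layer]   # C_n for each basic forest
    return mean_vector(per_forest)                       # integrated contribution degree

# Layer 1 with two basic forests, two training runs each, three driving factors (toy values)
layer1 = [
    [[0.5, 0.3, 0.2], [0.7, 0.2, 0.1]],   # forest 1
    [[0.4, 0.4, 0.2], [0.4, 0.2, 0.4]],   # forest 2
]
contribution = integrated_contribution(layer1)   # approximately [0.5, 0.275, 0.225]
```

Since every input vector is normalized to sum to 1, the integrated contribution also sums to 1 and can be read as a weight per driving factor.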
Based on the relation between the driving factor and the land use change of the deep forest model modeling, a more accurate land use change rule (development suitability) compared with the traditional machine learning method can be obtained, meanwhile, the model is easier to use compared with the existing deep neural network, the land use change simulation precision of the FLUS model can be effectively improved, the variable importance analysis method is less influenced by randomness and is more stable, and the ability of the FLUS model for accurately analyzing the land use change driving mechanism is given.
In one embodiment, the study area is the Pearl River Delta in China. The data used are CLUD land use distribution data for 2000 and 2010, classified into 4 categories (cultivated land, forest land, water body, construction land), together with natural environmental factors (elevation, slope, distance to river, annual average temperature, annual average precipitation, temperature seasonality, precipitation seasonality) and socioeconomic factors (distance to provincial center, distance to city center, distance to county center, distance to airport, distance to highway and railway, distance to general road). All factors are normalized to [0,1] to eliminate the effect of differing scales. All data are resampled to a uniform spatial resolution of 100 m.
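Scaling a driving factor to [0,1] can be illustrated with min-max normalization. The text does not say which normalization was used, so the method below is an assumption, and the function name and elevation values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant layer: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

elevation_m = [12.0, 250.0, 731.0, 48.0]      # toy elevation values in metres
scaled = min_max_normalize(elevation_m)
```

After this step a factor measured in metres and one measured in degrees Celsius occupy the same numeric range, so neither dominates simply because of its units.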
As shown in fig. 2, the method of the present invention comprises the steps of:
Step 1: CLUD land use distribution data for 2000 and 2010 and a spatial vector database are collected. GIS software is used to calculate the distance from each pixel in the area to each vector element, generating distance driving factor layers at 100 m resolution. Elevation data at 100 m resolution are collected, and GIS software is used to calculate the slope of each pixel, generating a slope driving factor layer. Climate grid data are collected, projected and resampled to 100 m resolution. In total, 13 driving factors are obtained;
Step 2: using the land expansion analysis strategy on the 2000 and 2010 land use data, sample positions of 4 types are obtained by stratified random sampling, namely "0: remains unchanged", "1: changes to cultivated land", "2: changes to forest land" and "3: changes to construction land". In this example, water bodies are assumed not to change. For each of types 1, 2 and 3, a 5% sample is drawn. For type 0, the number of samples equals the total number of samples of types 1, 2 and 3. Sample data are then extracted from the driving factor data and the land use distribution data at the sampled positions;
Step 3: the samples are input into a deep forest model for training to fit the relationship between the driving factors and the land use change types. The number F of basic forests in the deep forest is set to 2 and the number of decision trees in each basic forest to 100, these being the default parameters. The other comparison models are also trained with default parameters. After training, the complete driving factor data set is input to obtain the land use change development suitability surface of the study area;
Step 4: as shown in fig. 3, the land use change development suitability surface, the neighborhood effect, the conversion cost and the inertia coefficient are used to calculate the total conversion probability of land use change, and the new land use type of each pixel is then determined by roulette wheel selection. These steps are repeated until the amount of each land use type meets the demand, at which point the simulation stops. This example uses the land demand of 2010 as the termination condition;
As shown in fig. 4, the overall accuracy of the method is considerably higher than that of the common methods. In particular, the traditional deep learning model CNN, which here uses default parameters without parameter tuning, achieves only low simulation accuracy. The deep forest model, by contrast, determines its complexity adaptively: even with default parameters it arrives at a model size suited to the scale of the problem, which avoids the loss of fitting accuracy caused by a network that is too large or too small in the CNN model.
As shown in figs. 5 and 6, comparing the actual land use pattern of the Pearl River Delta in 2010 with the patterns simulated by the present invention and by the common methods shows that the pattern simulated by the present invention is closer to the actual pattern than those of the common methods.
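The overall accuracy used in this comparison can be computed pixel by pixel. The following is a minimal sketch assuming both maps are given as flattened lists of land use codes of equal length; the function name is hypothetical and the patent does not say which additional accuracy indices, if any, were used.

```python
def overall_accuracy(actual, simulated):
    """Share of pixels whose simulated land use type matches the actual one."""
    if len(actual) != len(simulated):
        raise ValueError("maps must have the same number of pixels")
    if not actual:
        raise ValueError("maps must not be empty")
    matches = sum(1 for a, s in zip(actual, simulated) if a == s)
    return matches / len(actual)


accuracy = overall_accuracy([1, 1, 2, 3], [1, 2, 2, 3])
```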
Step 5: the contribution degree of each driving factor to land change is calculated using the MDI (mean decrease in impurity) information saved during training of the deep forest model. Specifically, the normalized MDI values output by the k-fold cross-validation training of all the basic forests in the first layer of the deep forest model are averaged.
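The averaging in step 5 can be sketched as follows. The nested-list layout (`first_layer_mdi[forest][fold]` holding one raw MDI value per driving factor) and the function names are assumptions for illustration; normalization is taken here to mean scaling each run so that its values sum to 1.

```python
def normalize(values):
    """Scale non-negative importance values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else [0.0] * len(values)


def factor_contributions(first_layer_mdi):
    """Average normalized MDI over every forest and every cross-validation fold."""
    runs = [normalize(fold) for forest in first_layer_mdi for fold in forest]
    n_factors = len(runs[0])
    return [sum(run[i] for run in runs) / len(runs) for i in range(n_factors)]


# Two forests, two folds each, three driving factors.
mdi = [
    [[2.0, 1.0, 1.0], [1.0, 1.0, 2.0]],
    [[3.0, 0.0, 1.0], [1.0, 2.0, 1.0]],
]
contributions = factor_contributions(mdi)
```

Averaging over all forests and folds, rather than reading the importance from a single trained forest, is what reduces the run-to-run fluctuation discussed in the following paragraphs.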
As shown in fig. 7, the invention is applied to calculate the contribution degree of each driving factor to land use change in the research area during 2000-2010.
With the common methods, the ranking of driving factor contribution degrees fluctuates across repeated calculations. By contrast, the method provided by the invention makes full use of all the MDI information calculated while the deep forest model is constructed, which effectively reduces the influence of randomness on importance evaluation and allows the driving mechanism of land use change to be analyzed more accurately.
Based on the land analysis method based on the deep forest model and the FLUS model, as shown in FIG. 8, an embodiment of the present invention provides a land analysis system based on the deep forest model and the FLUS model, where the system includes:
the data collection module 101 is used for acquiring driving factor data for driving land use distribution and multi-period land use distribution data;
the data sampling module 102 is configured to sample the multi-period land use distribution data according to a land expansion analysis strategy, and collect the driving factor data to obtain a training sample set;
the model training module 103 is used for inputting the training sample set into a deep forest model for training to obtain the development suitability data of each land change;
the data processing module 104 is used for inputting the land change development suitability data, the neighborhood effect, the conversion cost and the inertia coefficient into the FLUS model for simulation to obtain simulated land use distribution data;
and the data evaluation module 105 is used for calculating the driving factor data set through the deep forest model to obtain contribution degree weight data of each driving factor to each land change.
Based on the land analysis method and system based on the deep forest model and the FLUS model, as shown in FIG. 9, a computer device provided by the embodiment of the invention comprises a memory, a processor and a transceiver, which are connected through a bus; the memory is used to store a set of computer program instructions and data and may transmit the stored data to the processor, which may execute the program instructions stored by the memory to perform the steps of the above-described method.
Wherein the memory may comprise volatile memory or nonvolatile memory, or may comprise both volatile and nonvolatile memory; the processor may be a central processing unit, a microprocessor, an application specific integrated circuit, a programmable logic device, or a combination thereof. By way of example, and not limitation, the programmable logic device described above may be a complex programmable logic device, a field programmable gate array, general array logic, or any combination thereof.
In addition, the memory may be a physically separate unit or may be integrated with the processor.
It will be appreciated by those of ordinary skill in the art that the architecture shown in fig. 9 is a block diagram of only the part of the architecture relevant to the present solution and does not limit the computing devices to which the present solution may be applied; a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
In summary, embodiments of the present invention provide a land analysis method and system based on a deep forest model and a FLUS model. By modeling the relationship between driving factors and land use change with a deep forest model, the method obtains a more accurate land use change rule (development suitability) than conventional machine learning methods, while the model is easier to use than existing deep neural networks. It effectively improves the accuracy of land use change simulation with the FLUS model, is less affected by randomness and more stable, and gives the FLUS model the ability to accurately analyze land use driving mechanisms.
The foregoing is only a preferred embodiment of the present invention. It should be noted that persons skilled in the art can make numerous modifications and substitutions without departing from the technical principle of the present invention, and these modifications and substitutions should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A land analysis method based on a deep forest model and a FLUS model, characterized in that the method comprises the following steps:
acquiring driving factor data for driving land use change and multi-period land use distribution data;
sampling the multi-period land utilization distribution data according to a land expansion analysis strategy, and collecting the driving factor data to obtain a training sample set;
inputting the training sample set into a deep forest model for training to obtain the development suitability data of each land change;
inputting the land change development suitability data, the neighborhood effect, the conversion cost and the inertia coefficient into a FLUS model for simulation to obtain simulated land utilization distribution data;
and calculating the driving factor data set through the deep forest model to obtain contribution degree weight data of each driving factor to each land change.
2. A land analysis method based on a deep forest model and a FLUS model as claimed in claim 1, characterized in that sampling the multi-period land utilization distribution data according to a land expansion analysis strategy, and collecting the driving factor data to obtain a training sample set, further comprises:
extracting an expansion area from the multi-period historical land utilization distribution data by adopting a superposition analysis method;
carrying out layered random sampling in the expansion area to obtain a sample position;
and constructing the training sample set according to the driving factor data and the land use distribution corresponding to the sample position.
3. A land analysis method based on a deep forest model and a FLUS model as claimed in claim 1, wherein inputting the training sample set into the deep forest model for training to obtain the land change development suitability data further comprises:
the training sample set is represented as follows:
$\{x_1, x_2, \ldots, x_i, \ldots, x_N, y_j\}$, where $x_i$ represents the value of the i-th driving factor, $N$ is the number of driving factors, $y_j$ is the land utilization type corresponding to the land change, and $M$ is the number of land utilization types;
modeling a nonlinear mapping relation between a driving factor and land change through a deep forest model, and taking an average value of classification probability vectors output by a basic forest in the last layer of the deep forest model as a land utilization change development suitability value;
the classification probability vector is expressed as follows:
$\{P_1, P_2, \ldots, P_j, \ldots, P_M\}$
the average value of the classification probability vectors is calculated by the following formula:
Figure FDA0003820169500000021
wherein $P_j$ is the classification probability of land use type $y_j$, $H_t(X)$ is the classification result of a single decision tree, $X$ is the feature vector formed by the driving factors $x_i$ in the training sample set, $T$ is the number of decision trees in the common forest, and $I$ is the indicator function.
4. A land analysis method based on a deep forest model and a FLUS model as claimed in claim 3, wherein calculating the driving factor data set through the deep forest model to obtain the contribution degree weight data of each driving factor to each land change further comprises:
the deep forest model comprises a multilayer cascade structure, each layer is provided with a plurality of common forest models, and each common forest is composed of decision trees;
selecting a division rule, and calculating the Gini coefficient of the node where the driving factor data set is located according to the division rule; wherein the division rule is defined as θ, θ = (i, t), i represents a driving factor variable and t represents a threshold; the node where the driving factor data set to be divided is located is defined as Node, and the Gini coefficient is defined as G, specifically:
Figure FDA0003820169500000031
Figure FDA0003820169500000032
dividing the driving factor data set into two new nodes according to the following division rule:
$\mathrm{Node}_{left} = \{(X, y) \mid x_i \le t\}$
$\mathrm{Node}_{right} = \mathrm{Node} \setminus \mathrm{Node}_{left}$
wherein $\mathrm{Node}_{left}$ and $\mathrm{Node}_{right}$ are the two new nodes after division, and the numbers of samples they contain are $m_{left}$ and $m_{right}$ respectively;
the sum of the Gini coefficients of the two new nodes is calculated by the following formula:
Figure FDA0003820169500000033
calculating the impurity reduction value of the division rule θ by using the following formula:
D(Node|θ)=G(Node)-G(Node|θ)
dividing nodes by the division rule with the maximum impurity reduction value, and constructing a decision tree; wherein the division rule with the maximum impurity reduction value is denoted $\theta^*$, with $\theta^* = \arg\max_{\theta} D(\mathrm{Node} \mid \theta)$.
5. A land analysis method based on a deep forest model and a FLUS model according to claim 4, characterized in that the maximum impurity reduction value is used as a contribution weight of a driving factor, and the contribution degree of the driving factor to each land change is evaluated.
6. A land analysis method based on a deep forest model and a FLUS model as claimed in claim 5, wherein using the maximum impurity reduction value as a contribution weight of a driving factor further comprises:
training each basic forest model of each layer of forest model in the deep forest model to obtain the contribution weight of each driving factor in each basic forest;
calculating the average value of the contribution weight of each driving factor in each basic forest and the average value of the contribution weight of each driving factor in each layer of forest model;
averaging the average values of the contribution weights of the driving factors in the forest models of all layers to obtain integrated contribution degrees, and forming contribution degree weight data of the driving factors on land changes by the integrated contribution degrees.
7. A land analysis method based on a deep forest model and a FLUS model as claimed in claim 1, characterized in that the accuracy of the simulated land use distribution data of the DF-FLUS model is obtained by comparing and evaluating the real land use distribution data with the simulated land use distribution data simulated by the DF-FLUS model.
8. A land analysis system based on a deep forest model and a FLUS model, the system comprising:
the data collection module is used for acquiring driving factor data for driving land use change and multi-period land use distribution data;
the data sampling module is used for sampling the multi-period land utilization distribution data according to a land expansion analysis strategy, and collecting the driving factor data to obtain a training sample set;
the model training module is used for inputting the training sample set into a deep forest model for training to obtain the development suitability data of each land change;
the data processing module is used for inputting the land change development suitability data, the neighborhood effect, the conversion cost and the inertia coefficient into the FLUS model for simulation to obtain simulated land utilization distribution data;
and the data evaluation module is used for calculating the driving factor data set through the deep forest model to obtain contribution degree weight data of each driving factor to each land change.
9. A computer device, characterized by: comprising a processor and a memory, the processor being coupled to the memory, the memory being used for storing a computer program and the processor being used for executing the computer program stored in the memory to cause the computer device to perform the method of any of claims 1 to 7.
10. A computer-readable storage medium, characterized in that: the computer-readable storage medium has stored thereon a computer program which, when executed, implements the method of any one of claims 1 to 7.
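The node division rule of claim 4 can be illustrated with a short sketch. The formula images for the Gini coefficient are not reproduced in this text, so the standard forms are assumed here: G(Node) = 1 − Σ_k p_k² and G(Node | θ) = (m_left / m) G(Node_left) + (m_right / m) G(Node_right). The function names are hypothetical, and the exhaustive search over all observed thresholds is a simplification of what a random forest does at each node.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node, G = 1 - sum_k p_k^2 (assumed standard form)."""
    m = len(labels)
    if m == 0:
        return 0.0
    return 1.0 - sum((c / m) ** 2 for c in Counter(labels).values())


def impurity_reduction(X, y, i, t):
    """D(Node | theta) = G(Node) - G(Node | theta) for theta = (i, t)."""
    left = [label for row, label in zip(X, y) if row[i] <= t]
    right = [label for row, label in zip(X, y) if row[i] > t]
    m = len(y)
    weighted = len(left) / m * gini(left) + len(right) / m * gini(right)
    return gini(y) - weighted


def best_rule(X, y):
    """Return theta* = argmax_theta D(Node | theta) over all observed thresholds."""
    candidates = [(i, row[i]) for row in X for i in range(len(row))]
    return max(candidates, key=lambda theta: impurity_reduction(X, y, *theta))


# Four samples, two driving factors; only the first factor separates the classes.
X = [[1, 7], [2, 3], [3, 7], [4, 3]]
y = [0, 0, 1, 1]
theta_star = best_rule(X, y)
```

In this toy node the first driving factor with threshold 2 yields the largest impurity reduction; accumulating such reductions per driving factor over all trees is what yields the MDI-based contribution weights of claims 5 and 6.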
CN202211045930.XA 2022-08-29 2022-08-29 Land analysis method and system based on deep forest model and FLUS model Pending CN115374709A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211045930.XA CN115374709A (en) 2022-08-29 2022-08-29 Land analysis method and system based on deep forest model and FLUS model

Publications (1)

Publication Number Publication Date
CN115374709A true CN115374709A (en) 2022-11-22

Family

ID=84068904

Country Status (1)

Country Link
CN (1) CN115374709A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination