CN115293231A - Regional ecological harmony random forest prediction method - Google Patents
- Publication number: CN115293231A
- Application number: CN202210747133.XA
- Authority: CN (China)
- Prior art keywords: elements, model, random forest, time, human
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06N20/00: Machine learning
- G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q50/26: Government or public services
Abstract
The random forest method for predicting regional ecological harmony comprises the following steps: 1. integrating different time scales and finely characterizing the mountain, water, forest, farmland, lake and grassland elements from the subsurface to the surface; 2. using long-term satellite remote sensing to finely and quantitatively interpret the natural elements of the past hundred years by time interval; 3. collecting human element characterization data for different years within the study area; 4. establishing a human activity factor function by multivariate regression and machine learning; 5. analyzing the period and frequency of the curve, predicting the future change characteristics of each element with a mathematical model, and predicting the influence of human activities on the other elements. The invention has excellent accuracy; runs efficiently on large data sets; can process input samples with high-dimensional features without dimensionality reduction; can evaluate the importance of each feature for the classification problem; obtains an internal unbiased estimate of the generalization error while the forest is being built; and also gives good results when values are missing.
Description
Technical Field
The invention belongs to the technical field of ecological environment prediction, and particularly relates to a random forest method for predicting regional ecological harmony.
Background
By analyzing the human-land system of a city, an ecologically harmonious three-dimensional city can be planned according to an estimate of the city's environmental carrying capacity. On this basis, the interference caused by human activities can be analyzed more fully by combining natural science with social science in fields such as the humanities, sociology and management, and social benefits can be incorporated into the method for a secondary evaluation using social science evaluation methods. According to local economic development needs, the demands of different stakeholders, the human living environment and other factors, and in combination with a predictive economic and social model, the best planning combination is selected and provided to planners, together with annual land-use strategies and specific restoration measures. How to improve the efficiency and accuracy of predicting regional ecological harmony is therefore a problem that urgently needs to be solved.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a random forest method for predicting regional ecological harmony. The method uses evidence of long-term and rapid environmental change in Earth history as a key baseline for comparison with the present; explores and analyzes the relationships within the natural system using a visual, quantitative mathematical model; connects human history with geological history by means of high-precision dating that spans the geological and human time scales; abstracts the elements by which human factors disturb the mountain, water, forest, farmland, lake and grassland system, and the relationships among them; and uses an artificial intelligence algorithm to reconstruct and predict those elements and relationships with high precision.
To solve the above technical problems, the invention adopts the following technical solution: the random forest method for predicting regional ecological harmony comprises the following steps:
Step one: for the multiple elements, integrate different time scales and finely characterize the mountain, water, forest, farmland, lake and grassland elements from the subsurface to the surface;
Step two: using long-term satellite remote sensing, finely and quantitatively interpret the natural elements of the past hundred years by time interval, the natural elements comprising the areas of mountains, waters, forests, farmland, lakes and grasslands, calibrated with 1 year as the time unit; and comprehensively interpret the human elements, which comprise the built-up land area;
Step three: collect human element characterization data for different years within the study area, the characterization data comprising population, GDP and industrial development intensity, calibrated with 1 year as the time unit, so as to comprehensively interpret the human elements;
Step four: based on historical data and expert judgment, collect the environmental carrying capacity of the study area over the study period and use it as the standard for subsequent model training; establish a human activity factor function by multivariate regression and machine learning; based on systems thinking, establish a multi-scale fitting relationship between the human and natural elements by multivariate regression and machine learning, the multiple scales including the time scale; examine the spontaneous evolution of the natural environment and the influence of human activities on that process; and fit a time-dependent mathematical model, namely a curve function;
Step five: set the time to a given future time, analyze the period and frequency of the curve, use the mathematical model to predict the future change characteristics of each element, and predict the influence of human activities on the other elements; on this basis, propose a lower limit of the environmental carrying capacity, delimit the areas or proportions of mountains, waters, forests, farmland, lakes and grasslands in the region, delimit the ecological function guarantee baseline, the environmental quality safety baseline and the upper limit of natural resource utilization, and guide the three-dimensional planning of the ecologically harmonious city.
Step one specifically comprises: deploying dense shallow boreholes in key study areas, and building a three-dimensional spatial model of the mountain, water, forest, farmland, lake and grassland elements through fine quantitative characterization of the subsurface geological bodies; then, using the multiple boreholes together with high-precision dating, dividing the record from the late Holocene (> 5000 a) at the millennial scale and, for the interval up to 1000 years ago, calibrating it with 100 years as the time unit, so that the paleogeography and paleoenvironmental pattern are finely reconstructed and a higher-precision four-dimensional spatio-temporal geological model is finally established.
The multivariate regression and machine learning method in step four is a random forest algorithm, and the geographic condition element indices used by the random forest algorithm are listed in the following table:
Table 1. Geographic condition element indices
Suppose the data are collected over two periods. The first period runs from the middle to late Holocene to modern times (7000 BC to 1950) and has 50 time points; the second period runs from 1951 to 2021 and has 71 time points. The first period mainly serves to establish the geological evolution background. For 1951 to 2020, the evaluation result of whether the region can be used for planning an ecologically harmonious three-dimensional city is known (1: usable; 0: not usable); for 2021 the evaluation result is unknown and must be predicted by the trained prediction model. Table 1 contains 23 indices in total, so the normalized matrix Z has size 23 × 120 and the corresponding evaluation result matrix Y has size 120 × 1; the known index matrix $Z_{2021}$ for the year 2021 to be predicted has size 23 × 1.
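The matrix dimensions described above can be checked with a small sketch. The following Python example is illustrative only: it fills the matrices with synthetic random values, because the Table 1 data are not reproduced in this text, and the helper names `min_max_normalize` and `build_dataset` are hypothetical. It also assumes min-max scaling as the normalization, which the source does not specify.

```python
import random

N_INDICES = 23          # number of indices in Table 1
N_FIRST_PERIOD = 50     # time points, 7000 BC to 1950
N_SECOND_KNOWN = 70     # yearly time points, 1951 to 2020 (labels known)


def min_max_normalize(row):
    """Scale one index (one row of Z) to [0, 1]; constant rows map to 0."""
    lo, hi = min(row), max(row)
    if hi == lo:
        return [0.0 for _ in row]
    return [(v - lo) / (hi - lo) for v in row]


def build_dataset(seed=0):
    """Return (Z, Y, Z_2021) filled with synthetic placeholder values."""
    rng = random.Random(seed)
    n_samples = N_FIRST_PERIOD + N_SECOND_KNOWN  # 120 labelled time points
    # Row = index, column = time point; the extra last column is the year 2021.
    raw = [[rng.uniform(0, 100) for _ in range(n_samples + 1)]
           for _ in range(N_INDICES)]
    normalized = [min_max_normalize(row) for row in raw]
    Z = [row[:n_samples] for row in normalized]          # 23 x 120
    Z_2021 = [[row[n_samples]] for row in normalized]    # 23 x 1
    Y = [[rng.randint(0, 1)] for _ in range(n_samples)]  # 120 x 1, labels 0/1
    return Z, Y, Z_2021
```

The 120 columns of Z come from the 50 time points of the first period plus the 70 labelled years 1951 to 2020; the 71st time point of the second period, 2021, is held out as $Z_{2021}$.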
The random forest algorithm uses the indices in Table 1. The data are first preprocessed by data cleaning, such as handling missing values, smoothing noise, identifying or removing outliers, and normalizing. The algorithm comprises the following steps:
(1) random forest generation, in which growing each tree of the model is the key step;
(2) calculation of the prediction metrics MAE and MAPE;
(3) random forest parameter optimization;
(4) selection of the optimal model on the principle of highest accuracy;
(5) direct calculation of the weight (a non-zero real number) of each feature from the optimal model generated by the random forest, and selection of a certain number of the more important features in descending order of weight.
Step (1) comprises the following three main steps:
A. Bootstrap sampling: if the training set has size N, then for each tree N training samples are drawn from the training set at random with replacement and used as that tree's training set;
B. Random feature selection: if each sample has M features, a constant m ≪ M is specified; at each split of the tree, m features are randomly selected from the M features, and the best split is chosen among these m features;
C. Each tree is grown to the greatest extent possible, without pruning.
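Steps A and B above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, and the choice m = floor(sqrt(M)) is a common heuristic that the source does not prescribe. The out-of-bag helper shows which samples a tree never sees, which is what makes an internal error estimate possible.

```python
import math
import random


def bootstrap_indices(n_samples, rng):
    """Step A: draw n_samples row indices at random with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_feature_subset(n_features, m, rng):
    """Step B: pick m distinct candidate features for one split."""
    return sorted(rng.sample(range(n_features), m))


def out_of_bag(n_samples, drawn):
    """Samples never drawn for a tree; usable for an internal error estimate."""
    chosen = set(drawn)
    return [i for i in range(n_samples) if i not in chosen]


rng = random.Random(42)
N, M = 120, 23                  # training-set size and feature count
m = max(1, int(math.sqrt(M)))   # assumed heuristic: m = floor(sqrt(M))
rows = bootstrap_indices(N, rng)
features = random_feature_subset(M, m, rng)
oob = out_of_bag(N, rows)
```

Because sampling is with replacement, some rows appear several times in `rows` and others not at all; the latter form the out-of-bag set for that tree.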
Step (2) is specifically as follows: $y_i$ is the true value of the result and $\hat{y}_i$ is the estimate of the result, for $i = 1, \dots, n$. The prediction metric MAE (Mean Absolute Error) has the value range [0, +∞); it equals 0 when the predicted values match the true values exactly, i.e. a perfect model; the larger the error, the larger the MAE:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
The prediction metric MAPE (Mean Absolute Percentage Error) has the value range [0, +∞); it equals 0 when the predicted values match the true values exactly, i.e. a perfect model; the larger the error, the larger the MAPE:
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
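The two metrics can be computed directly from their standard definitions. The sketch below is a plain Python illustration; the function names `mae` and `mape` are chosen here for convenience. MAPE is undefined when a true value is zero, so the sketch raises an error in that case; how the source handles evaluation results equal to 0 is not stated.

```python
def mae(y_true, y_pred):
    """Mean absolute error between true values and estimates."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    total = sum(abs((p - t) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


print(mae([1.0, 2.0, 4.0], [2.0, 2.0, 2.0]))   # absolute errors 1, 0, 2
print(mape([2.0, 4.0], [1.0, 5.0]))            # relative errors 0.5, 0.25
```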
Step (3) is specifically as follows: using classical hyperparameter tuning methods from machine learning, adjust the number of trees, the way the maximum number of features is chosen, the maximum tree depth, the minimum number of samples required to split a node, and the minimum number of samples per leaf node, and decide whether to select the most suitable parameter combination by random search and whether to perform Bayesian optimization.
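A random search over the parameters listed in step (3) might look like the sketch below. Everything here is an assumption made for illustration: the dictionary keys borrow the argument names of scikit-learn's `RandomForestClassifier`, the candidate values are arbitrary, and `toy_score` is a placeholder objective standing in for a real validation error such as the MAE of a trained forest.

```python
import random

# Hypothetical search space; the value ranges are illustrative assumptions.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 500],      # number of trees
    "max_features": ["sqrt", "log2", None],   # how m is chosen
    "max_depth": [None, 5, 10, 20],           # maximum tree depth
    "min_samples_split": [2, 5, 10],          # samples needed to split a node
    "min_samples_leaf": [1, 2, 4],            # samples required in a leaf
}


def sample_params(space, rng):
    """Draw one random parameter combination from the search space."""
    return {name: rng.choice(values) for name, values in space.items()}


def random_search(score_fn, space, n_iter=20, seed=0):
    """Return (best_params, best_score); a lower score is better."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        params = sample_params(space, rng)
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Placeholder objective standing in for a validation error."""
    return (abs(params["n_estimators"] - 200) / 200
            + params["min_samples_leaf"] / 10)


best_params, best_score = random_search(toy_score, SEARCH_SPACE, n_iter=50, seed=1)
```

In practice `score_fn` would train a forest with the sampled parameters and return its validation error; Bayesian optimization would replace the uniform sampling in `sample_params` with a model-guided proposal.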
By adopting the above technical solution, the invention has the following technical effects:
From a spatial perspective, the basic picture of the natural resources and the various geographic elements of the human living environment that make up an area (such as a city) can be regarded as information obtained by processing geographic information at three different depths, through perception, statistics and analysis, according to different needs. Establishing a multi-factor, multi-scale (time-scale) fitting relationship between humans and nature is the problem that the method sets out to solve.
According to the study of Ma Mozhong and Du Qingyun (System framework research of geographic national situation monitoring [J]. National and local resource science and technology management, 2011, 28(06): 104-111), these elements can be summarized as natural environment elements, social and cultural elements, and industrial and economic elements. Following the index system principles proposed in Liu Kai (Ecological fragile type human-ground system evolution and sustainable development mode selection research [D]. Shandong university, 2017), and with the aim of planning an ecologically harmonious three-dimensional city in a given region, indices of two element types, natural environment and economy and society, are adopted and a prediction model is built with the random forest method, which in turn yields a weight analysis of the influence of each index on the prediction result.
Indices may be added to or removed from those listed in Table 1 as appropriate; the more features are included, the higher the accuracy. Indices with large weights should be retained as far as possible. Reducing the number of features shortens the running time, and it is recommended to use a data set restricted to the important features selected according to a 95% threshold. As for collection time, data can be collected at several time points within the same year, for example one data point per month, which greatly increases the sample size. Increasing the sample size may improve the accuracy of the prediction model.
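One way to read the 95% threshold is as a cumulative share of the total feature weight: features are ranked by weight and kept until their combined share reaches 95%. The sketch below follows that interpretation, which is an assumption, as are the function name and the example index names and weights; it also assumes non-negative weights.

```python
def select_by_cumulative_weight(weights, threshold=0.95):
    """Keep the highest-weight features until their share reaches threshold."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, weight in ranked:
        selected.append(name)
        cumulative += weight / total
        if cumulative >= threshold:
            break
    return selected


# Hypothetical weights for five illustrative indices (not from Table 1).
example_weights = {
    "forest_area": 40,
    "population": 30,
    "gdp": 20,
    "water_area": 7,
    "grass_area": 3,
}
print(select_by_cumulative_weight(example_weights))
# → ['forest_area', 'population', 'gdp', 'water_area']
```

With these example weights the first three features cover 90% of the total, so a fourth is needed to pass the threshold and the lowest-weight feature is dropped.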
The classification performance (error rate) of a random forest depends on two factors. The first is the correlation between any two trees in the forest: the greater the correlation, the higher the error rate. The second is the classification strength of each tree in the forest: the stronger each tree, the lower the error rate of the whole forest. Reducing the number m of selected features reduces both the correlation between trees and their classification strength; increasing m increases both. The key issue is therefore how to choose the optimal m (or range of m), which is essentially the only parameter of a random forest.
The invention selects the random forest algorithm, which has the following advantages: 1) among current algorithms, it has excellent accuracy; 2) it runs efficiently on large data sets; 3) it can process input samples with high-dimensional features without dimensionality reduction; 4) it can evaluate the importance of each feature for the classification problem; 5) it obtains an internal unbiased estimate of the generalization error while the forest is being built; 6) it also gives good results when values are missing.
Drawings
FIG. 1 is a schematic diagram of a random forest model;
FIG. 2 is a schematic flow chart of a random forest model algorithm;
FIG. 3 is a diagram illustrating a predictor weight arrangement;
FIG. 4 is a diagram illustrating a comparison of the results of the predictive model with the actual results.
Detailed Description
As shown in FIGS. 1-4, the random forest method for predicting regional ecological harmony comprises the following steps:
Step one: for the multiple elements, integrate different time scales and finely characterize the mountain, water, forest, farmland, lake and grassland elements from the subsurface to the surface;
Step two: using long-term satellite remote sensing, finely and quantitatively interpret the natural elements of the past hundred years by time interval, the natural elements comprising the areas of mountains, waters, forests, farmland, lakes and grasslands, calibrated with 1 year as the time unit; and comprehensively interpret the human elements, which comprise the built-up land area;
Step three: collect human element characterization data for different years within the study area, the characterization data comprising population, GDP and industrial development intensity, calibrated with 1 year as the time unit, so as to comprehensively interpret the human elements;
Step four: based on historical data and expert judgment, collect the environmental carrying capacity of the study area over the study period and use it as the standard for subsequent model training; establish a human activity factor function by multivariate regression and machine learning; based on systems thinking, establish a multi-scale fitting relationship between the human and natural elements by multivariate regression and machine learning, the multiple scales including the time scale; examine the spontaneous evolution of the natural environment and the influence of human activities on that process; and fit a time-dependent mathematical model, namely a curve function;
Step five: set the time to a given future time, analyze the period and frequency of the curve, use the mathematical model to predict the future change characteristics of each element, and predict the influence of human activities on the other elements; on this basis, propose a lower limit of the environmental carrying capacity, delimit the areas or proportions of mountains, waters, forests, farmland, lakes and grasslands in the region, delimit the ecological function guarantee baseline, the environmental quality safety baseline and the upper limit of natural resource utilization, and guide the three-dimensional planning of the ecologically harmonious city.
Step one specifically comprises: deploying dense shallow boreholes in key study areas, and building a three-dimensional spatial model of the mountain, water, forest, farmland, lake and grassland elements through fine quantitative characterization of the subsurface geological bodies; then, using the multiple boreholes together with high-precision dating, dividing the record from the late Holocene (> 5000 a) at the millennial scale and, for the interval up to 1000 years ago, calibrating it with 100 years as the time unit, so that the paleogeography and paleoenvironmental pattern are finely reconstructed and a higher-precision four-dimensional spatio-temporal geological model is finally established.
The multivariate regression and machine learning method in step four is a random forest algorithm, and the geographic condition element indices used by the random forest algorithm are listed in the following table:
Table 1. Geographic condition element indices
Suppose the data are collected over two periods. The first period runs from the middle to late Holocene to modern times (7000 BC to 1950) and has 50 time points; the second period runs from 1951 to 2021 and has 71 time points. The first period mainly serves to establish the geological evolution background. For 1951 to 2020, the evaluation result of whether the region can be used for planning an ecologically harmonious three-dimensional city is known (1: usable; 0: not usable); for 2021 the evaluation result is unknown and must be predicted by the trained prediction model. Table 1 contains 23 indices in total, so the normalized matrix Z has size 23 × 120 and the corresponding evaluation result matrix Y has size 120 × 1; the known index matrix $Z_{2021}$ for the year 2021 to be predicted has size 23 × 1.
The random forest algorithm uses the indices in Table 1. The data are first preprocessed by data cleaning, such as handling missing values, smoothing noise, identifying or removing outliers, and normalizing. The algorithm comprises the following steps:
(1) random forest generation, in which growing each tree of the model is the key step;
(2) calculation of the prediction metrics MAE and MAPE;
(3) random forest parameter optimization;
(4) selection of the optimal model on the principle of highest accuracy;
(5) direct calculation of the weight (a non-zero real number) of each feature from the optimal model generated by the random forest, and selection of a certain number of the more important features in descending order of weight.
Step (1) comprises the following three main steps:
A. Bootstrap sampling: if the training set has size N, then for each tree N training samples are drawn from the training set at random with replacement and used as that tree's training set;
B. Random feature selection: if each sample has M features, a constant m ≪ M is specified; at each split of the tree, m features are randomly selected from the M features, and the best split is chosen among these m features;
C. Each tree is grown to the greatest extent possible, without pruning.
Step (2) is specifically as follows: $y_i$ is the true value of the result and $\hat{y}_i$ is the estimate of the result, for $i = 1, \dots, n$. The prediction metric MAE (Mean Absolute Error) has the value range [0, +∞); it equals 0 when the predicted values match the true values exactly, i.e. a perfect model; the larger the error, the larger the MAE:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
The prediction metric MAPE (Mean Absolute Percentage Error) has the value range [0, +∞); it equals 0 when the predicted values match the true values exactly, i.e. a perfect model; the larger the error, the larger the MAPE:
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
Step (3) is specifically as follows: using classical hyperparameter tuning methods from machine learning, adjust the number of trees, the way the maximum number of features is chosen, the maximum tree depth, the minimum number of samples required to split a node, and the minimum number of samples per leaf node, and decide whether to select the most suitable parameter combination by random search and whether to perform Bayesian optimization.
The present embodiment is not intended to limit the shape, material, structure, etc. of the present invention in any way, and any simple modification, equivalent change and modification made to the above embodiments according to the technical spirit of the present invention are within the scope of the technical solution of the present invention.
Claims (7)
1. A random forest method for predicting regional ecological harmony, characterized by comprising the following steps:
Step one: for the multiple elements, integrate different time scales and finely characterize the mountain, water, forest, farmland, lake and grassland elements from the subsurface to the surface;
Step two: using long-term satellite remote sensing, finely and quantitatively interpret the natural elements of the past hundred years by time interval, the natural elements comprising the areas of mountains, waters, forests, farmland, lakes and grasslands, calibrated with 1 year as the time unit; and comprehensively interpret the human elements, which comprise the built-up land area;
Step three: collect human element characterization data for different years within the study area, the characterization data comprising population, GDP and industrial development intensity, calibrated with 1 year as the time unit, so as to comprehensively interpret the human elements;
Step four: based on historical data and expert judgment, collect the environmental carrying capacity of the study area over the study period and use it as the standard for subsequent model training; establish a human activity factor function by multivariate regression and machine learning; based on systems thinking, establish a multi-scale fitting relationship between the human and natural elements by multivariate regression and machine learning, the multiple scales including the time scale; examine the spontaneous evolution of the natural environment and the influence of human activities on that process; and fit a time-dependent mathematical model, namely a curve function;
Step five: set the time to a given future time, analyze the period and frequency of the curve, use the mathematical model to predict the future change characteristics of each element, and predict the influence of human activities on the other elements; on this basis, propose a lower limit of the environmental carrying capacity, delimit the areas or proportions of mountains, waters, forests, farmland, lakes and grasslands in the region, delimit the ecological function guarantee baseline, the environmental quality safety baseline and the upper limit of natural resource utilization, and guide the three-dimensional planning of the ecologically harmonious city.
2. The random forest method for predicting regional ecological harmony according to claim 1, characterized in that step one specifically comprises: deploying dense shallow boreholes in key study areas, and building a three-dimensional spatial model of the mountain, water, forest, farmland, lake and grassland elements through fine quantitative characterization of the subsurface geological bodies; then, using the multiple boreholes together with high-precision dating, dividing the record from the late Holocene (> 5000 a) at the millennial scale and, for the interval up to 1000 years ago, calibrating it with 100 years as the time unit, so that the paleogeography and paleoenvironmental pattern are finely reconstructed and a higher-precision four-dimensional spatio-temporal geological model is finally established.
3. The random forest method for predicting regional ecological harmony according to claim 1, characterized in that the multivariate regression and machine learning method in step four is a random forest algorithm, and the geographic condition element indices used by the random forest algorithm are listed in the following table:
Table 1. Geographic condition element indices
Suppose the data are collected over two periods. The first period runs from the middle to late Holocene to modern times (7000 BC to 1950) and has 50 time points; the second period runs from 1951 to 2021 and has 71 time points. The first period mainly serves to establish the geological evolution background. For 1951 to 2020, the evaluation result of whether the region can be used for planning an ecologically harmonious three-dimensional city is known (1: usable; 0: not usable); for 2021 the evaluation result is unknown and must be predicted by the trained prediction model. Table 1 contains 23 indices in total, so the normalized matrix Z has size 23 × 120 and the corresponding evaluation result matrix Y has size 120 × 1; the known index matrix $Z_{2021}$ for the year 2021 to be predicted has size 23 × 1.
4. The regional ecological harmony random forest prediction method according to claim 3, wherein the random forest model algorithm takes the parameters in Table 1 and preprocesses the data by data cleaning, such as handling missing values, smoothing noise, identifying or deleting outliers, and normalizing; the method then comprises the following steps:
(1) generating the random forest, in which the growth of each tree in the model is the key step; (2) calculating the prediction indices MAE and MAPE; (3) optimizing the random forest parameters; (4) selecting the optimal model on the principle of highest accuracy; and (5) directly calculating the weight (a non-zero real number) of each feature from the optimal model generated by the random forest, and selecting a certain number of the more important features in descending order of weight.
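As a rough illustration of the preprocessing and of the feature selection in step (5), the sketch below fills missing values with the mean, z-score normalizes one index series, clips extreme values, and ranks features by weight. The function names, the mean-fill rule and the clipping threshold of 3 are all assumptions; the patent does not specify how cleaning is carried out.

```python
from statistics import mean, pstdev

def clean_series(values, z_clip=3.0):
    """Fill missing values (None), z-score normalize, and clip outliers.

    The mean-fill rule and the z_clip threshold are illustrative choices.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    filled = [fill if v is None else v for v in values]   # missing -> mean
    mu, sigma = mean(filled), pstdev(filled)
    if sigma == 0:
        return [0.0] * len(filled)                        # constant series
    z_scores = [(v - mu) / sigma for v in filled]
    # Clip extreme z-scores instead of deleting the time point.
    return [max(-z_clip, min(z_clip, z)) for z in z_scores]

def top_k_features(weights, k):
    """Return the k feature names with the largest weights, in descending order."""
    return sorted(weights, key=weights.get, reverse=True)[:k]

normalized = clean_series([1.0, None, 3.0])
selected = top_k_features({"a": 0.1, "b": 0.5, "c": 0.4}, k=2)
print(selected)   # ['b', 'c']
```

Clipping rather than deleting outliers keeps every time point in the series, which matters when each column of Z must stay aligned with a row of Y.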
5. The regional ecological harmony random forest prediction method according to claim 4, wherein step (1) comprises the following three main steps:
A. bootstrap sampling: if the training set has size N, then for each tree N training samples are drawn from the training set randomly and with replacement to form the training set of that tree;
B. random feature selection: if each sample has M features, a constant m << M is specified, a subset of m features is randomly selected from the M features, and at each split of the tree the optimal feature is chosen from these m features;
C. each tree is grown to the maximum extent, without pruning.
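Steps A and B can be sketched with the Python standard library alone. The helper names are hypothetical, and the values N = 120, M = 23 and m = 4 (roughly the square root of M) are illustrative assumptions, not values fixed by the claim.

```python
import random

def bootstrap_sample(n_samples, rng):
    """Step A: draw n_samples row indices randomly with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_feature_subset(n_features, m, rng):
    """Step B: pick m distinct candidate features out of n_features."""
    return rng.sample(range(n_features), m)

rng = random.Random(0)                           # seeded for repeatability
rows = bootstrap_sample(120, rng)                # training rows for one tree
candidates = random_feature_subset(23, 4, rng)   # candidates for one split

# Because sampling is with replacement, some rows repeat and others are
# left out of this tree's training set.
print(len(rows), len(set(rows)))
print(sorted(candidates))
```

In a full implementation the feature subset would be redrawn at every split, while the bootstrap sample is drawn once per tree.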
6. The regional ecological harmony random forest prediction method according to claim 5, wherein step (2) is specifically as follows: $y_i$ is the true value of the result and $\hat{y}_i$ is the estimate of the result, for $i = 1, \dots, n$;
the prediction index MAE (Mean Absolute Error) takes values in [0, +∞); it equals 0 when the predicted values exactly match the true values, i.e. a perfect model, and the larger the error, the larger the MAE value:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
the prediction index MAPE (Mean Absolute Percentage Error) takes values in [0, +∞); it equals 0 when the predicted values exactly match the true values, i.e. a perfect model, and the larger the error, the larger the MAPE value:
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
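Under their standard definitions, the two prediction indices can be sketched as follows. The function names and the sample values are assumptions chosen for illustration only.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |prediction - truth|."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any true value is zero, so that case is rejected.
    """
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is 0")
    total = sum(abs((p - t) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

y_true = [2.0, 4.0]
y_pred = [1.0, 5.0]
print(mae(y_true, y_pred))    # 1.0
print(mape(y_true, y_pred))   # 37.5
```

Because MAPE divides by the true value, it cannot be evaluated on time points whose true result is 0; the sketch raises an error in that case instead of returning a misleading number.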
7. The regional ecological harmony random forest prediction method according to claim 6, wherein step (3) is specifically as follows: using classical parameter-tuning methods in machine learning, adjusting the number of trees to be built, the selection mode of the maximum number of features, the maximum depth of the trees, the minimum number of samples required to split a node, and the minimum number of samples in a leaf node, and deciding whether to select the most suitable parameter combination by random search and whether to perform Bayesian optimization.
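A random search over the parameters listed in this claim might look like the sketch below. The parameter names are borrowed from scikit-learn's random forest estimators as an assumption about how the claim maps to a concrete library, the candidate values are illustrative, and `toy_score` is a hypothetical stand-in for a real validation score such as cross-validated accuracy.

```python
import random

# Candidate values per parameter (illustrative, not from the patent).
SEARCH_SPACE = {
    "n_estimators": [100, 200, 500],      # number of trees
    "max_features": ["sqrt", "log2"],     # maximum-feature selection mode
    "max_depth": [None, 5, 10],           # maximum tree depth
    "min_samples_split": [2, 5, 10],      # samples needed to split a node
    "min_samples_leaf": [1, 2, 4],        # minimum samples in a leaf
}

def random_search(score_fn, space, n_iter, seed=0):
    """Try n_iter random parameter combinations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(options) for name, options in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(params):
    """Hypothetical objective: prefers 200 trees and small leaves."""
    return -abs(params["n_estimators"] - 200) - params["min_samples_leaf"]

best_params, best_score = random_search(toy_score, SEARCH_SPACE, n_iter=50)
print(best_params, best_score)
```

Bayesian optimization would replace the uniform `rng.choice` draws with proposals guided by earlier scores; that variant is not shown here.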
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210747133.XA CN115293231A (en) | 2022-06-29 | 2022-06-29 | Regional ecological harmony random forest prediction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210747133.XA CN115293231A (en) | 2022-06-29 | 2022-06-29 | Regional ecological harmony random forest prediction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115293231A | 2022-11-04 |
Family
ID=83819879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210747133.XA Pending CN115293231A (en) | 2022-06-29 | 2022-06-29 | Regional ecological harmony random forest prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115293231A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116823067A (en) * | 2023-08-29 | 2023-09-29 | 北控水务(中国)投资有限公司 | Method and device for determining water quality cleaning state of pipe network and electronic equipment |
CN116823067B (en) * | 2023-08-29 | 2023-12-19 | 北控水务(中国)投资有限公司 | Method and device for determining water quality cleaning state of pipe network and electronic equipment |
CN118411056A (en) * | 2024-06-28 | 2024-07-30 | 贵州师范大学 | Ecological product information data sharing method for karst rural ecological system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115293231A (en) | Regional ecological harmony random forest prediction method | |
Snelder et al. | Development of an ecologic marine classification in the New Zealand region | |
Yan et al. | Many-objective robust decision making for water allocation under climate change | |
CN114971301B (en) | Ecological interference risk identification and evaluation method based on automatic parameter adjustment optimization model | |
Wu et al. | Applying of GA-BP neural network in the land ecological security evaluation | |
Znachor et al. | Changing environmental conditions underpin long-term patterns of phytoplankton in a freshwater reservoir | |
CN110070144A (en) | A kind of lake water quality prediction technique and system | |
Wan et al. | A landslide expert system: image classification through integration of data mining approaches for multi-category analysis | |
Anderson et al. | Occupancy modeling and estimation of the holiday darter species complex within the Etowah River system | |
CN118134680B (en) | Banyan research method and system | |
CN108764527B (en) | Screening method for soil organic carbon library time-space dynamic prediction optimal environment variables | |
CN111275065A (en) | Aquaculture space partitioning method based on marine environment multiple attributes | |
CN117787508A (en) | Model prediction-based carbon emission treatment method and system for building construction process | |
Fayer et al. | A temporal fusion transformer deep learning model for long-term streamflow forecasting: a case study in the funil reservoir, Southeast Brazil | |
Joshi et al. | Rainfall prediction using data visualisation techniques | |
Chen et al. | River ecological flow early warning forecasting using baseflow separation and machine learning in the Jiaojiang River Basin, Southeast China | |
CN110264010B (en) | Novel rural power saturation load prediction method | |
CN108154263B (en) | Monitoring and predicting method for natural water resource | |
CN114037332B (en) | Method and system for evaluating safety utilization effect of salt water resources | |
CN115293230A (en) | Regional ecological harmony LSTM algorithm prediction method | |
CN114841064A (en) | Drought disaster weather prediction method based on semi-supervised integrated learning | |
Singh et al. | Prognosis for crop yield production by data mining techniques in agriculture | |
CN110751398A (en) | Regional ecological quality evaluation method and device | |
Liu et al. | The uncertainties on the GIS based land suitability assessment for urban and rural planning | |
CN117709807B (en) | Kelp sink-increasing cultivation ecological benefit evaluation system and method based on ecological simulation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||