CN113869515A - Knowledge extraction method fusing genetic algorithm and decision tree algorithm - Google Patents
Knowledge extraction method fusing genetic algorithm and decision tree algorithm
- Publication number
- CN113869515A (application number CN202111089955.5A)
- Authority
- CN
- China
- Prior art keywords
- decision tree
- remote sensing
- genetic algorithm
- algorithm
- function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/24765—Rule-based classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Software Systems (AREA)
- Evolutionary Biology (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physiology (AREA)
- Genetics & Genomics (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Image Analysis (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, relating to remote sensing image classification, data mining and related fields. The method comprises: preparing remote sensing classification results and classification feature data; extracting a training sample set by stratified random sampling; initializing a genetic algorithm and constructing decision trees from randomly generated gene combinations and random samples, thereby simulating the feature randomness and sample randomness of a random forest; for each constructed decision tree, screening the rule chains of the category of interest and obtaining a classification result according to those rules; comparing that classification result with the existing classification result to calculate an error rate, which serves as the fitness of the evaluation function, and recording the rule chain and its error rate; and iterating the genetic algorithm until a specified number of iterations is reached or a convergence condition is met, sorting the recorded rules by error rate in ascending order, and taking the rule with the minimum error rate as explicit knowledge. The invention can effectively convert implicit knowledge into understandable explicit knowledge, with a degree of repeatability and robustness.
Description
Technical Field
The invention relates to a knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, in the technical fields of remote sensing image classification, data mining and the like.
Background
With the rise of cloud platforms such as Google Earth Engine, remote sensing classification results based on different data and algorithms keep increasing. When these results are applied, existing methods mostly select samples from them and generate new classification results with black-box algorithms. In this process, knowledge about the classification is passed implicitly to the new classification through the samples; people cannot acquire, understand or apply that knowledge, which hinders the progress of understanding.
Among these algorithms, decision tree classification comes closest to the simple way humans describe things. Taking water body identification in remote sensing classification as an example, people separate water bodies from non-water bodies by applying a threshold to the normalized difference water index (NDWI), which can be written simply as NDWI > threshold; the rules generated by a decision tree algorithm take the same form. The difference lies in the threshold: the human threshold is summarized from extensive practice, whereas the decision tree threshold is calculated by the algorithm. Therefore, by constructing a decision tree on the basis of an existing classification result, the knowledge contained in that result can be acquired.
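As a minimal illustration of the "NDWI > threshold" rule form mentioned above, the following Python sketch applies such a rule to two pixels. The NDWI formula used is the common green/near-infrared definition; the threshold of 0.0 and the reflectance values are placeholder assumptions, not values from the patent.

```python
def ndwi(green, nir):
    """Normalized difference water index: (green - nir) / (green + nir)."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def is_water(green, nir, threshold=0.0):
    # threshold=0.0 is a placeholder, not a value taken from the patent
    return ndwi(green, nir) > threshold


# Hypothetical (green, nir) reflectance pairs for two pixels.
pixels = [(0.30, 0.10), (0.08, 0.40)]
labels = [is_water(g, n) for g, n in pixels]
```

A decision tree leaf reached through a single split on NDWI encodes exactly this kind of rule; the only difference is that the tree learns the threshold from samples.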
Because decision tree construction uses a greedy search strategy, a global optimum cannot be guaranteed; that is, the extracted knowledge is biased. By randomizing both samples and features, the random forest algorithm, which combines a series of decision trees, achieves better classification performance than a single decision tree. However, the rules produced by a random forest, which aggregates a series of decision trees, are too numerous to be understood.
Disclosure of Invention
To solve the problem of acquiring understandable explicit knowledge from existing classification results, the invention provides a knowledge extraction method fusing a genetic algorithm and a decision tree algorithm. The invention uses a genetic algorithm to simulate sample randomness and feature randomness, creates a series of decision trees and screens their rules, thereby effectively converting implicit knowledge into explicit knowledge that people can understand.
The technical scheme adopted by the invention for solving the technical problem is as follows:
The knowledge extraction method fusing a genetic algorithm and a decision tree algorithm of the invention comprises the following steps:
step one, preparing existing remote sensing classification result data and remote sensing classification feature data, wherein a remote sensing classification feature image of the area covered by the classification result is obtained using the Google Earth Engine cloud platform and serves as the remote sensing classification feature data;
step two, acquiring a training sample set by stratified random sampling according to the existing remote sensing classification result data;
step three, initializing a genetic algorithm, with the number of genes set to the number of remote sensing classification features; the different gene combinations generated as the genetic algorithm iterates simulate the feature randomness of the random forest algorithm;
step four, establishing an evaluation function of the genetic algorithm, in which samples are drawn from the training sample set by random sampling, the proportion of samples drawn being between 50% and 90%, to simulate the sample randomness of the random forest algorithm;
step five, in the evaluation function of the genetic algorithm, constructing a decision tree from the random features and random samples, and extracting the rules generated by the decision tree;
step six, in the evaluation function of the genetic algorithm, traversing the rules generated by the decision tree and screening the rules related to the land-cover class of interest; classifying the relevant remote sensing classification features according to those rules to obtain the classification result corresponding to the rules for the class of interest under random features and random samples;
step seven, in the evaluation function of the genetic algorithm, comparing the classification result calculated from the rule chain with the existing remote sensing classification result pixel by pixel to calculate an error rate, and taking the error rate as the fitness of the evaluation function;
step eight, iterating the genetic algorithm, that is, repeating step three to step seven until the specified number of iterations is reached or the convergence condition is met, then stopping the iteration to obtain a series of rules and their corresponding error rates; sorting the rules by error rate in ascending order to find the rule closest to the existing remote sensing classification result, and taking the rule with the minimum error rate as explicit knowledge.
Further, in step one, part of the wetland interpretation result for the Jilin Xianghai National Nature Reserve in 2020 is selected as the existing remote sensing classification result data; the Google Earth Engine cloud platform is used to obtain Sentinel-1 SAR images and Sentinel-2 MSI images of the Jilin Xianghai National Nature Reserve from May to October 2020, median composites are computed for each, the classification features are calculated and combined with the band features into a remote sensing classification feature image, and the remote sensing classification feature data are thereby obtained.
Further, in step two, the sampleStratified function of the R raster package is used to perform stratified random sampling according to the existing remote sensing classification result data; the wetland and non-wetland categories of the Jilin Xianghai National Nature Reserve in 2020 are sampled randomly in equal proportion to obtain a training sample set with a total sample size of 20000; the training sample set is traversed using the rowFromCell and colFromCell functions, and the features corresponding to each sample are obtained according to its position.
Further, in step three, the genetic algorithm is initialized using the rbga.bin function of the R genalg package; the number of genes is set to the number of classification features, the population size to 200, the number of iterations to 100, and the mutation rate to 0.01.
Further, in step four, the evalFunc function of the genetic algorithm is written, in which 75% of the samples are obtained using the R createDataPartition function for training the decision tree and the remainder are discarded, thereby simulating sample randomness.
Further, in step five, in the evalFunc function, a decision tree is constructed using the rpart function of the R rpart package, yielding a decision tree under the conditions of random features and random samples.
Further, in step six, in the evalFunc function, the rules generated by the decision tree are traversed, and the wetland-related rules are selected to classify the remote sensing classification feature data, yielding a classification result.
Further, in step seven, in the evalFunc function, the classification result obtained according to the rules is compared with the existing remote sensing classification result, and the error rate is calculated as the fitness of the evalFunc function.
Further, in step eight, the rbga.bin function is run and the optimization is iterated until the specified number of iterations is reached or the convergence condition is met, yielding a series of rules and their corresponding error rates, wherein the rule with the minimum error rate is the explicit knowledge characterizing the wetland.
The invention has the beneficial effects that:
Based on the existing remote sensing classification result and remote sensing classification features, a training sample set is obtained by stratified random sampling; a genetic algorithm is used to simulate the feature randomness and sample randomness of the random forest algorithm, and decision trees are constructed from randomly generated gene combinations and randomly selected subsets of samples to obtain classification rules in the manner of a random forest; for each constructed decision tree, the rule chain corresponding to the category of interest is screened, a classification result is obtained according to the rule, and the error rate between that result and the existing classification result is taken as the fitness of the evaluation function, so that the genetic algorithm evolves iteratively toward the optimal rule (that is, the rule closest to the existing classification result). By sorting the rules for the category of interest by error rate, the optimal rule (understandable explicit knowledge) is obtained.
The invention can effectively convert implicit knowledge into understandable explicit knowledge. It solves the problem of acquiring understandable explicit knowledge from existing classification results, and addresses both the bias that arises when knowledge is acquired with a decision tree algorithm alone and the inability to obtain usable knowledge from a random forest algorithm because its rules are too numerous.
The knowledge extraction method fusing the genetic algorithm and the decision tree algorithm is fast and effective, repeatable and robust, and is of great significance in fields such as remote sensing classification and data mining.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and constitute a part of this specification; together with the description they serve to explain the principles of the invention and do not limit it. In the drawings:
Fig. 1 shows part of the wetland interpretation result for the Jilin Xianghai National Nature Reserve in 2020, used as the existing remote sensing classification result data.
Fig. 2 shows a decision tree constructed under the conditions of random features and random samples.
Fig. 3 shows the classification result obtained from the rules corresponding to the decision tree.
Fig. 4 shows the generation-by-generation minimum error rate of the genetic algorithm.
Fig. 5 shows the classification result corresponding to the explicit knowledge extracted by the knowledge extraction method fusing the genetic algorithm and the decision tree algorithm, together with the existing wetland interpretation data.
Fig. 6 shows the classification result obtained using only the decision tree algorithm, together with the existing wetland interpretation data.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The knowledge extraction method fusing a genetic algorithm and a decision tree algorithm of the invention mainly comprises the following steps:
Step one, preparing the existing remote sensing classification result data and remote sensing classification feature data, wherein a remote sensing classification feature image of the area covered by the classification result is obtained using the Google Earth Engine cloud platform and serves as the remote sensing classification feature data. The existing remote sensing classification result data refer, in remote sensing, to interpretation results in vector or raster form and, in data mining, to a ground-truth data set.
Specifically: part of the wetland interpretation result for the Jilin Xianghai National Nature Reserve in 2020 is selected as the existing remote sensing classification result data (as shown in Fig. 1); the Google Earth Engine cloud platform is used to obtain Sentinel-1 SAR images and Sentinel-2 MSI images of the Jilin Xianghai National Nature Reserve from May to October 2020, median composites are computed for each, the classification features are calculated and combined with the band features into a remote sensing classification feature image, and the remote sensing classification feature data are thereby obtained. The classification features can be specified by the user and mainly include band features, index features, texture features and the like.
Step two, acquiring a training sample set by stratified random sampling according to the existing remote sensing classification result data. The existing remote sensing classification result data contain the classification result for the category of interest, and may take the form of a binary classification, a multi-class classification or other forms.
Specifically: the sampleStratified function of the R raster package is used to perform stratified random sampling according to the existing remote sensing classification result data; the wetland and non-wetland categories of the Jilin Xianghai National Nature Reserve in 2020 are sampled randomly in equal proportion (sample size ratio 1:1) to obtain a training sample set with a total sample size of 20000; the training sample set is traversed using the rowFromCell and colFromCell functions, and the features corresponding to each sample are obtained according to its position.
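The equal-proportion stratified sampling of step two can be sketched in plain Python as follows. This is an illustrative stand-in for the R sampleStratified call, not the patent's code; the function name `stratified_sample`, the flattened class map and the small sample sizes are assumptions made for the example (the embodiment draws 20000 samples at a 1:1 ratio, that is, 10000 per class).

```python
import random


def stratified_sample(labels, per_class, seed=0):
    """Return indices with `per_class` randomly chosen items for each class label."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    chosen = []
    for label in sorted(by_class):
        # rng.sample draws without replacement within one class (stratum)
        chosen.extend(rng.sample(by_class[label], per_class))
    return chosen


# Hypothetical flattened classification map: 1 = wetland, 0 = non-wetland.
class_map = [1] * 300 + [0] * 700
sample_idx = stratified_sample(class_map, per_class=100)
```

Each returned index identifies a pixel position, from which the corresponding feature values would then be looked up.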
Step three, initializing a genetic algorithm, with the number of genes set to the number of remote sensing classification features; the different gene combinations generated as the genetic algorithm iterates simulate the feature randomness of the random forest algorithm.
Specifically: the genetic algorithm is initialized using the rbga.bin function of the R genalg package. The number of genes is set to the number of classification features, the population size to 200, the number of iterations to 100, and the mutation rate to 0.01.
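One way to picture how a binary chromosome simulates feature randomness is the sketch below, in which each gene switches one classification feature on or off. The feature names, the helper functions and the guard that forces at least one active gene are assumptions for illustration; only the population size of 200 comes from the text.

```python
import random

FEATURES = ["VV", "VH", "B2", "B3", "B4", "B8"]  # hypothetical feature names


def random_chromosome(n_genes, rng):
    """One individual: a 0/1 gene per feature, with at least one gene switched on."""
    genes = [rng.randint(0, 1) for _ in range(n_genes)]
    if not any(genes):
        # Assumption of this sketch: avoid an empty feature subset.
        genes[rng.randrange(n_genes)] = 1
    return genes


def selected_features(chromosome, names):
    """Decode a chromosome into the feature subset it switches on."""
    return [name for gene, name in zip(chromosome, names) if gene == 1]


rng = random.Random(42)
population = [random_chromosome(len(FEATURES), rng) for _ in range(200)]
subset = selected_features([1, 0, 0, 1, 0, 1], FEATURES)
```

Here `subset` holds the features a decision tree would be allowed to use for that individual, which plays the role of the random feature subset in a random forest.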
Step four, establishing an evaluation function of the genetic algorithm, in which samples are drawn from the training sample set by random sampling, the proportion drawn being between 50% and 90%, to simulate the sample randomness of the random forest algorithm.
Specifically: the evalFunc function of the genetic algorithm is written, in which 75% of the samples are obtained using the R createDataPartition function for training the decision tree and the remainder are discarded, thereby simulating sample randomness.
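The sample-randomness step can be sketched as a plain random subset drawn inside the evaluation function. This is a simplified stand-in: the helper name `random_subset` is hypothetical, and a simple random draw is used here rather than reproducing the behaviour of createDataPartition.

```python
import random


def random_subset(sample_ids, fraction=0.75, rng=None):
    """Keep a random `fraction` of the training samples; the rest are discarded."""
    if not 0.5 <= fraction <= 0.9:
        raise ValueError("the text specifies a proportion between 50% and 90%")
    if rng is None:
        rng = random.Random()
    keep = round(len(sample_ids) * fraction)
    return rng.sample(sample_ids, keep)


training_ids = list(range(20000))  # total sample size from the text
subset_ids = random_subset(training_ids, 0.75, random.Random(1))
```

Because a fresh subset is drawn on every call, two individuals with the same chromosome can still yield different trees, which is the intended effect.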
Step five, in the evaluation function of the genetic algorithm, constructing a decision tree from the random features and random samples, and extracting the rules generated by the decision tree.
Specifically: in the evalFunc function, a decision tree is constructed using the rpart function of the R rpart package, yielding a decision tree under the conditions of random features and random samples. As shown in Fig. 2, the decision tree is drawn using the rpart.plot package, with leaf node 1 representing wetland and leaf node 2 representing non-wetland.
Step six, in the evaluation function of the genetic algorithm, traversing the rules generated by the decision tree and screening the rules related to the land-cover class of interest; classifying the relevant remote sensing classification features according to those rules to obtain the classification result corresponding to the rules for the class of interest under random features and random samples. That is, the rules of the decision tree are traversed and the rule chains corresponding to the leaf nodes of the category of interest are selected as potential knowledge.
Specifically: in the evalFunc function, the rules generated by the decision tree are traversed, and the wetland-related rules, namely the rule "VV ≥ -17 & B2/B4 ≥ 0.76" and the rule "VV ≥ 17 & B2/B4 < 0.76 & (B8-B6)/(B8+B6) < 0.046", are selected to classify the remote sensing classification feature data and obtain a classification result (as shown in Fig. 3). Here VV denotes the VV band (vertical transmit, vertical receive) of the Sentinel-1 satellite, B2 denotes band 2 (blue band) of the Sentinel-2 satellite, B4 denotes band 4 (red band) of the Sentinel-2 satellite, B6 denotes band 6 (red-edge band 2) of the Sentinel-2 satellite, and B8 denotes band 8 (near-infrared band) of the Sentinel-2 satellite.
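The traversal and screening of step six can be sketched as a walk over root-to-leaf paths that keeps only the chains ending in the class of interest, followed by applying those chains to pixels. The nested-dictionary tree layout, the feature key `B2_B4` and the pixel values are assumptions of this sketch; the thresholds loosely echo the first example rule and do not reproduce the tree of Fig. 2.

```python
def rule_chains(node, target, path=()):
    """Yield every root-to-leaf chain of (feature, op, threshold) ending in `target`."""
    if "label" in node:
        if node["label"] == target:
            yield list(path)
        return
    f, t = node["feature"], node["threshold"]
    yield from rule_chains(node["ge"], target, path + ((f, ">=", t),))
    yield from rule_chains(node["lt"], target, path + ((f, "<", t),))


def matches(chain, pixel):
    """True when the pixel satisfies every condition in the chain."""
    return all((pixel[f] >= t) if op == ">=" else (pixel[f] < t)
               for f, op, t in chain)


def classify(chains, pixel):
    """A pixel belongs to the class of interest if any rule chain matches."""
    return any(matches(chain, pixel) for chain in chains)


# Hypothetical tree: internal nodes split on "feature >= threshold".
tree = {"feature": "VV", "threshold": -17.0,
        "ge": {"feature": "B2_B4", "threshold": 0.76,
               "ge": {"label": "wetland"},
               "lt": {"label": "non-wetland"}},
        "lt": {"label": "non-wetland"}}

wetland_chains = list(rule_chains(tree, "wetland"))
pixels = [{"VV": -12.0, "B2_B4": 0.9}, {"VV": -20.0, "B2_B4": 0.9}]
result = [classify(wetland_chains, p) for p in pixels]
```

Pixels not matched by any wetland chain are treated as the other class, which is why only the chains for the class of interest need to be kept.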
Step seven, in the evaluation function of the genetic algorithm, comparing the classification result calculated from the rule chain with the existing remote sensing classification result pixel by pixel to calculate an error rate, and taking the error rate as the fitness of the evaluation function. For data mining, the error rate is calculated by comparing the classification result calculated from the rule chain with the ground-truth data set and is used as the fitness value.
Specifically: in the evalFunc function, the classification result obtained according to the rules is compared with the existing remote sensing classification result, and the error rate is calculated as the fitness of the evalFunc function. When there are multiple case zones, the maximum error rate is taken as the fitness of the evalFunc function. In this case zone (the Jilin Xianghai National Nature Reserve, 2020), the error rate is 32.3%. At the same time, the rules and their corresponding error rates are saved to a file.
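The fitness calculation of step seven can be illustrated with the short sketch below: a pixel-by-pixel error rate per case zone, with the maximum over zones used as the fitness. The function names and the tiny label lists are hypothetical and serve only to show the arithmetic.

```python
def error_rate(predicted, reference):
    """Fraction of pixels where the rule-based result differs from the reference."""
    if len(predicted) != len(reference):
        raise ValueError("maps must cover the same pixels")
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


def fitness(zones):
    """Worst-case (maximum) error rate over all case zones."""
    return max(error_rate(pred, ref) for pred, ref in zones)


# Hypothetical (predicted, reference) label lists for two case zones.
zone_a = ([1, 1, 0, 0], [1, 0, 0, 0])
zone_b = ([0, 1, 1, 0], [1, 0, 1, 0])
score = fitness([zone_a, zone_b])
```

Taking the maximum rather than the mean means a rule is only rated well if it holds up in every case zone.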
Step eight, iterating the genetic algorithm, that is, repeating step three to step seven until the specified number of iterations is reached or the convergence condition is met, then stopping the iteration to obtain a series of rules and their corresponding error rates; sorting the rules by error rate in ascending order to find the rule closest to the existing remote sensing classification result, and taking the rule with the minimum error rate as explicit knowledge.
Specifically: the rbga.bin function is run and the optimization is iterated until the specified number of iterations is reached or the convergence condition is met, yielding a series of rules and their corresponding error rates. The rule with the minimum error rate, "B5 < 1501 & VH/VV ≥ 1.521 & (B3-B11)/(B3+B11) ≥ -0.457", is the explicit knowledge characterizing the wetland, with an error rate of 11.8%. Here B3 denotes band 3 (green band) of the Sentinel-2 satellite, B5 denotes band 5 (red-edge band 1) of the Sentinel-2 satellite, B11 denotes band 11 (short-wave infrared band 1) of the Sentinel-2 satellite, and VH denotes the VH band (vertical transmit, horizontal receive) of the Sentinel-1 satellite. The generation-by-generation minimum error rate during the iteration is shown in Fig. 4. Using only the decision tree algorithm, the error rate is 14.8%, and the rules obtained are "B11 < 2637 & B8-B4-B3 > -1145 & B8A < 2993 & B8 > -328 & elevation < 169" and "B11 < 2637 & B8-B4-B3 > -1145 & B8A > -2993 & (B3-B12)/(B3+B12) > -0.216". Here elevation denotes the terrain elevation, B8A denotes band 8A (red-edge band 4) of the Sentinel-2 satellite, and B12 denotes band 12 (short-wave infrared band 2) of the Sentinel-2 satellite. The classification result corresponding to the explicit knowledge obtained by the invention and the classification result obtained using only the decision tree are shown in Fig. 5 and Fig. 6, respectively. With a random forest algorithm, the rules can number in the thousands, which defeats the understandability of the knowledge. The invention is therefore advantageous both in error rate and in the understandability of the acquired knowledge.
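The overall loop of steps three to eight can be sketched end to end as below. This is a deliberately reduced illustration under several assumptions: a single-split stump (`fit_stump`) stands in for the rpart decision tree, the genetic operators (truncation selection, one-point crossover, bit-flip mutation) are generic choices rather than those of the genalg package, and the data are synthetic.

```python
import random


def fit_stump(rows, labels, features):
    """Stand-in for the decision tree: best single 'feature >= t' rule on the sample."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            err = sum((r[f] >= t) != y for r, y in zip(rows, labels)) / len(rows)
            if best is None or err < best[0]:
                best = (err, f, t)
    return best[1], best[2]


def run_ga(rows, labels, n_features, pop_size=20, generations=5,
           mutation=0.01, fraction=0.75, seed=0):
    rng = random.Random(seed)
    log = []  # (error rate, rule) for every evaluated individual

    def evaluate(chrom):
        feats = [i for i, g in enumerate(chrom) if g]
        if not feats:
            return 1.0  # no feature selected: worst possible fitness
        idx = rng.sample(range(len(rows)), round(len(rows) * fraction))
        f, t = fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        # Error of the extracted rule against the full reference labels.
        err = sum((r[f] >= t) != y for r, y in zip(rows, labels)) / len(rows)
        log.append((err, (f, t)))
        return err

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=evaluate)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < mutation else g for g in child])
        pop = parents + children
    return sorted(log)  # ascending error rate; first entry is the best rule


# Synthetic data: three features per pixel, ground truth driven by feature 1.
data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(3)] for _ in range(120)]
labels = [r[1] >= 0.5 for r in rows]
ranked = run_ga(rows, labels, n_features=3)
best_error, best_rule = ranked[0]
```

The first entry of `ranked` corresponds to the "rule with the minimum error rate" of step eight; in this synthetic setup it should be a threshold on feature 1, since that feature defines the labels.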
The above description is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited thereto; any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the invention.
Claims (9)
1. A knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, characterized by comprising the following steps:
step one, preparing existing remote sensing classification result data and remote sensing classification feature data, wherein a remote sensing classification feature image of the area covered by the classification result is obtained using the Google Earth Engine cloud platform and serves as the remote sensing classification feature data;
step two, acquiring a training sample set by stratified random sampling according to the existing remote sensing classification result data;
step three, initializing a genetic algorithm, with the number of genes set to the number of remote sensing classification features; the different gene combinations generated as the genetic algorithm iterates simulate the feature randomness of the random forest algorithm;
step four, establishing an evaluation function of the genetic algorithm, in which samples are drawn from the training sample set by random sampling, the proportion of samples drawn being between 50% and 90%, to simulate the sample randomness of the random forest algorithm;
step five, in the evaluation function of the genetic algorithm, constructing a decision tree from the random features and random samples, and extracting the rules generated by the decision tree;
step six, in the evaluation function of the genetic algorithm, traversing the rules generated by the decision tree and screening the rules related to the land-cover class of interest; classifying the relevant remote sensing classification features according to those rules to obtain the classification result corresponding to the rules for the class of interest under random features and random samples;
step seven, in the evaluation function of the genetic algorithm, comparing the classification result calculated from the rule chain with the existing remote sensing classification result pixel by pixel to calculate an error rate, and taking the error rate as the fitness of the evaluation function;
step eight, iterating the genetic algorithm, that is, repeating step three to step seven until the specified number of iterations is reached or the convergence condition is met, then stopping the iteration to obtain a series of rules and their corresponding error rates; sorting the rules by error rate in ascending order to find the rule closest to the existing remote sensing classification result, and taking the rule with the minimum error rate as explicit knowledge.
2. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm is characterized in that in the first step, a partial wetland interpretation result of a natural conservation area of Jilin sea country level in 2020 is selected as the existing remote sensing classification result data; utilizing a Google Earth Engine cloud platform to obtain a Sentinel-1SAR image and a Sentinel-2MSI image of 5-10 months from Jilin to the sea state level natural reserve area in 2020, respectively carrying out median synthesis, calculating each classification characteristic, combining the classification characteristic with the waveband characteristic into a remote sensing classification characteristic image, and obtaining remote sensing classification characteristic data.
3. The knowledge extraction method integrating genetic algorithm and decision tree algorithm as claimed in claim 2, wherein in step two, the sampleStratified function of R language filter package is used to perform random sampling in a layered random sampling manner according to the existing remote sensing classification result data; randomly collecting wetland and non-wetland categories of a natural protection area of China sea level in Jilin of 2020 in an equal proportion to obtain a training sample set with the total sample capacity of 20000; and traversing the training sample set by using a rowFromcell function and a colFromcell function, and acquiring the corresponding characteristics of the samples according to the positions.
4. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 3, wherein in step three, the genetic algorithm is initialized using the rbga.bin function of the R genalg package; the number of genes is set to the number of classification features, the population size to 200, the number of iterations to 100, and the mutation rate to 0.01.
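To make the parameters concrete, the following Python sketch builds a binary population in which each gene switches one classification feature on or off. It does not reproduce the internals of rbga.bin. The feature count of 12 is a hypothetical placeholder, while the other three values come from the claim.

```python
import random
from typing import List

N_FEATURES = 12       # hypothetical; the claim sets this to the number of classification features
POP_SIZE = 200        # population size from the claim
N_ITERATIONS = 100    # iteration count from the claim
MUTATION_RATE = 0.01  # mutation rate from the claim


def init_population(n_genes: int, pop_size: int, rng: random.Random) -> List[List[int]]:
    """Create `pop_size` random bit strings of length `n_genes`."""
    return [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]


def mutate(chromosome: List[int], rate: float, rng: random.Random) -> List[int]:
    """Flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


rng = random.Random(0)
population = init_population(N_FEATURES, POP_SIZE, rng)
mutated = mutate(population[0], MUTATION_RATE, rng)
print(len(population), population[0], mutated)
```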
5. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 4, wherein in step four, the evalFunc function of the genetic algorithm is written, in which the R createDataPartition function is used to obtain 75% of the samples for training the decision tree, and the remaining samples are discarded, so as to simulate sample randomness.
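The 75% draw can be sketched in Python as a per-class random selection. This is an illustration under the assumption that the split is stratified by class. The function name `partition_indices` and the 16-sample label list are hypothetical.

```python
import random
from typing import List, Sequence


def partition_indices(labels: Sequence[int], p: float, rng: random.Random) -> List[int]:
    """Pick a fraction `p` of the indices from each class; the rest stay unused."""
    chosen: List[int] = []
    for label in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == label]
        chosen.extend(rng.sample(members, round(p * len(members))))
    return sorted(chosen)


labels = [1] * 8 + [0] * 8  # hypothetical sample labels
train_idx = partition_indices(labels, 0.75, random.Random(1))
print(train_idx)
```

Calling the function again with a different random state yields a different 75% subset, which is how each evaluation sees a slightly different training set.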
6. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 5, wherein in step five, within the evalFunc function, a decision tree is constructed using the rpart function of the R rpart package, yielding a decision tree under feature randomness and sample randomness.
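Feature randomness comes from the chromosome: only the features whose gene is 1 are passed to the tree learner. The Python sketch below shows that masking step only and omits the tree fitting itself. The feature names and sample values are hypothetical examples.

```python
from typing import Dict, List


def select_features(chromosome: List[int], feature_names: List[str]) -> List[str]:
    """Keep the feature names whose corresponding gene is 1."""
    return [name for bit, name in zip(chromosome, feature_names) if bit == 1]


def mask_rows(rows: List[Dict[str, float]], selected: List[str]) -> List[Dict[str, float]]:
    """Restrict every sample to the selected features."""
    return [{name: row[name] for name in selected} for row in rows]


feature_names = ["B4", "B8", "VV", "VH"]  # hypothetical feature columns
chromosome = [1, 0, 0, 1]                 # one individual from the population
rows = [
    {"B4": 0.08, "B8": 0.31, "VV": -11.2, "VH": -18.5},
    {"B4": 0.12, "B8": 0.22, "VV": -9.7, "VH": -16.1},
]

selected = select_features(chromosome, feature_names)
training_rows = mask_rows(rows, selected)
print(selected, training_rows)
```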
7. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 6, wherein in step six, the rules generated by the decision tree are traversed in the evalFunc function, and the wetland-related rules are selected to classify the remote sensing classification feature data, obtaining a classification result.
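A decision tree rule can be read as the conjunction of the threshold tests along one root-to-leaf path. The Python sketch below assumes that representation, selects the paths ending in a wetland leaf, and labels a pixel as wetland when any of them matches. The feature names `f1` and `f2`, the thresholds, and the pixels are hypothetical.

```python
import operator
from typing import Dict, List, Tuple

Condition = Tuple[str, str, float]  # (feature name, comparison, threshold)
Rule = List[Condition]
OPS = {"<": operator.lt, ">=": operator.ge}


def wetland_rules(paths: List[Tuple[Rule, str]]) -> List[Rule]:
    """Keep only the root-to-leaf paths whose leaf predicts wetland."""
    return [rule for rule, leaf_class in paths if leaf_class == "wetland"]


def matches(rule: Rule, pixel: Dict[str, float]) -> bool:
    """A pixel matches a rule when it satisfies every condition on the path."""
    return all(OPS[op](pixel[feat], thr) for feat, op, thr in rule)


def classify(pixels: List[Dict[str, float]], rules: List[Rule]) -> List[int]:
    """Label a pixel 1 (wetland) if any wetland rule matches, else 0."""
    return [1 if any(matches(r, px) for r in rules) else 0 for px in pixels]


# Hypothetical paths of a small tree with two features.
paths = [
    ([("f1", "<", 0.2)], "non-wetland"),
    ([("f1", ">=", 0.2), ("f2", "<", -12.0)], "wetland"),
    ([("f1", ">=", 0.2), ("f2", ">=", -12.0)], "non-wetland"),
]
pixels = [
    {"f1": 0.5, "f2": -15.0},
    {"f1": 0.1, "f2": -15.0},
    {"f1": 0.5, "f2": -5.0},
]
result = classify(pixels, wetland_rules(paths))
print(result)
```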
8. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 7, wherein in step seven, within the evalFunc function, the classification result obtained from the rules is compared with the existing remote sensing classification result, and the error rate is calculated and returned as the fitness value of the evalFunc function.
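The fitness value described here is the share of pixels on which the rule-based classification disagrees with the existing classification. A minimal Python sketch, assuming both results are equal-length lists of class labels (the label values are invented):

```python
from typing import Sequence


def error_rate(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of positions where the prediction differs from the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


predicted = [1, 0, 1, 1]  # hypothetical rule-based labels
reference = [1, 0, 0, 1]  # hypothetical existing classification
fitness = error_rate(predicted, reference)
print(fitness)
```

A lower value means closer agreement with the existing classification, so the genetic algorithm should treat smaller fitness values as better.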
9. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 7, wherein in step eight, the rbga.bin function is run and iterative optimization continues until the specified number of iterations is reached or the convergence condition is met; iteration then stops, yielding a series of rules and their corresponding error rates, and the rule with the minimum error rate is the explicit knowledge representing the wetland.
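The overall loop can be sketched as a small binary genetic algorithm that minimizes an evaluation function. This is not the rbga.bin implementation. The truncation selection, single-point crossover, elitism, and zero-error stopping rule are assumptions chosen for brevity, and the toy evaluation function (distance to a hypothetical target mask) stands in for the tree-based error rate.

```python
import random
from typing import Callable, List, Tuple

Chromosome = List[int]


def run_binary_ga(
    eval_func: Callable[[Chromosome], float],
    n_genes: int,      # must be at least 2
    pop_size: int,     # must be at least 4
    iters: int,
    mutation_rate: float,
    seed: int = 0,
) -> Tuple[Chromosome, float, List[float]]:
    """Minimize `eval_func` over bit strings; return best, its score, and history."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history: List[float] = []
    for _ in range(iters):
        population.sort(key=eval_func)
        history.append(eval_func(population[0]))
        if history[-1] == 0.0:                 # illustrative convergence condition
            break
        parents = population[: pop_size // 2]  # keep the better half
        children = [population[0][:]]          # elitism: best individual survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_genes - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < mutation_rate else g for g in child])
        population = children
    best = min(population, key=eval_func)
    return best, eval_func(best), history


TARGET = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical "ideal" feature mask


def toy_error(chromosome: Chromosome) -> float:
    """Stand-in for the error rate: share of genes that differ from TARGET."""
    return sum(a != b for a, b in zip(chromosome, TARGET)) / len(TARGET)


best, best_error, history = run_binary_ga(toy_error, 8, 20, 30, 0.01, seed=1)
print(best, best_error)
```

Because the best individual is always carried forward, the best error per generation never increases, which matches the idea of keeping the minimum-error rule at the end.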
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111089955.5A CN113869515B (en) | 2021-09-17 | 2021-09-17 | Knowledge extraction method integrating genetic algorithm and decision tree algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111089955.5A CN113869515B (en) | 2021-09-17 | 2021-09-17 | Knowledge extraction method integrating genetic algorithm and decision tree algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113869515A (en) | 2021-12-31 |
CN113869515B (en) | 2024-04-05 |
Family
ID=78996373
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111089955.5A (Active) CN113869515B (en) | 2021-09-17 | 2021-09-17 | Knowledge extraction method integrating genetic algorithm and decision tree algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113869515B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105550578A (en) * | 2015-12-10 | 2016-05-04 | Shanghai Dianji University | Network anomaly classification rule extracting method based on feature selection and decision tree |
CN108038448A (en) * | 2017-12-13 | 2018-05-15 | Henan Polytechnic University | Semi-supervised random forest Hyperspectral Remote Sensing Imagery Classification method based on weighted entropy |
CN110516840A (en) * | 2019-07-15 | 2019-11-29 | State Grid Gansu Electric Power Company Electric Power Research Institute | Short term prediction method based on the wind light generation power output for improving random forest method |
WO2021158989A1 (en) * | 2020-02-07 | 2021-08-12 | Lodo Therapeutics Corporation | Methods and apparatus for efficient and accurate assembly of long-read genomic sequences |
Non-Patent Citations (1)
Title |
---|
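Liu Shu; Jiang Qigang; Ma; Xiao Yan; Li Yuanhua; Cui Can: "Object-oriented wetland classification based on multi-objective genetic random forest feature selection", Transactions of the Chinese Society for Agricultural Machinery, no. 01, pages 1-9 * |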
Also Published As
Publication number | Publication date |
---|---|
CN113869515B (en) | 2024-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110516596B (en) | Octave convolution-based spatial spectrum attention hyperspectral image classification method | |
CN107392925B (en) | Remote sensing image ground object classification method based on super-pixel coding and convolutional neural network | |
CN105488528B (en) | Neural network image classification method based on improving expert inquiry method | |
Fitzgerald et al. | Assessing the classification accuracy of multisource remote sensing data | |
CN108564606B (en) | Heterogeneous image block matching method based on image conversion | |
CN111428762B (en) | Interpretable remote sensing image ground feature classification method combining deep data learning and ontology knowledge reasoning | |
CN107944483B (en) | Multispectral image classification method based on dual-channel DCGAN and feature fusion | |
CN115249332B (en) | Hyperspectral image classification method and device based on space spectrum double-branch convolution network | |
CN108446616B (en) | Road extraction method based on full convolution neural network ensemble learning | |
CN111275640B (en) | Image enhancement method for fusing two-dimensional discrete wavelet transform and generation of countermeasure network | |
CN112949416B (en) | Supervised hyperspectral multiscale graph volume integral classification method | |
CN106339753A (en) | Method for effectively enhancing robustness of convolutional neural network | |
CN109948692A (en) | Picture detection method is generated based on the computer of multiple color spaces convolutional neural networks and random forest | |
CN113222068B (en) | Remote sensing image multi-label classification method based on adjacency matrix guidance label embedding | |
CN113988147B (en) | Multi-label classification method and device for remote sensing image scene based on graph network, and multi-label retrieval method and device | |
CN109671019B (en) | Remote sensing image sub-pixel mapping method based on multi-objective optimization algorithm and sparse expression | |
Li et al. | Incorporating open source data for Bayesian classification of urban land use from VHR stereo images | |
CN110211109B (en) | Image change detection method based on deep neural network structure optimization | |
CN111222534A (en) | Single-shot multi-frame detector optimization method based on bidirectional feature fusion and more balanced L1 loss | |
CN111695436B (en) | High spatial resolution remote sensing image scene classification method based on target enhancement | |
CN111222576B (en) | High-resolution remote sensing image classification method | |
CN116994071A (en) | Multispectral laser radar point cloud classification method based on self-adaptive spectrum residual error | |
CN113869515B (en) | Knowledge extraction method integrating genetic algorithm and decision tree algorithm | |
CN112329818A (en) | Hyperspectral image unsupervised classification method based on graph convolution network embedded representation | |
CN116823782A (en) | Reference-free image quality evaluation method based on graph convolution and multi-scale features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |