CN113869515A - Knowledge extraction method fusing genetic algorithm and decision tree algorithm


Info

Publication number
CN113869515A
Authority
CN
China
Prior art keywords
decision tree
remote sensing
genetic algorithm
algorithm
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111089955.5A
Other languages
Chinese (zh)
Other versions
CN113869515B (en)
Inventor
赵传朋
王宗明
贾明明
任春颖
毛德华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Forestry Star Beijing Technology Information Co ltd
Original Assignee
China Forestry Star Beijing Technology Information Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Forestry Star Beijing Technology Information Co ltd
Priority to CN202111089955.5A
Publication of CN113869515A
Application granted
Publication of CN113869515B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/24765 Rule-based classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, relating to the fields of remote sensing image classification, data mining and the like. The method comprises: preparing remote sensing classification results and classification feature data; extracting a training sample set by stratified random sampling; initializing a genetic algorithm and constructing a decision tree from a randomly generated gene combination and randomly selected samples, thereby simulating the feature randomness and sample randomness of a random forest; for each constructed decision tree, screening the rule chains of the category of interest and obtaining a classification result from those rules; comparing that classification result with the existing classification result to calculate an error rate, which serves as the fitness of the evaluation function, and recording the rule chain and its error rate; and iterating the genetic algorithm until a specified number of iterations is reached or a convergence condition is met, sorting the resulting rules by error rate in ascending order, and taking the rule with the minimum error rate as explicit knowledge. The invention can effectively convert implicit knowledge into understandable explicit knowledge, and offers a degree of repeatability and robustness.

Description

Knowledge extraction method fusing genetic algorithm and decision tree algorithm
Technical Field
The invention relates to a knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, and belongs to the technical fields of remote sensing image classification, data mining and the like.
Background
With the rise of cloud platforms such as Google Earth Engine, remote sensing classification results based on different data and algorithms keep increasing. When these results are applied, existing methods mostly select samples from them and then generate new classification results with a black-box algorithm. In this process, knowledge about the classification is passed implicitly to the new classification through the samples; people cannot acquire, understand, or apply that knowledge, which hinders the advance of understanding.
Among the available algorithms, decision tree classification comes closest to the simple way humans describe things. Taking water body identification in remote sensing classification as an example, people separate water from non-water by thresholding the Normalized Difference Water Index (NDWI), which can be written simply as NDWI > threshold; the rules generated by the decision tree algorithm take the same form. The difference lies in the threshold: the human threshold is summarized from extensive practice, whereas the decision tree threshold is computed by the algorithm. Therefore, by constructing a decision tree on the basis of an existing classification result, the knowledge contained in that result can be acquired.
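The rule form NDWI > threshold can be illustrated with a minimal sketch. It assumes the common NDWI definition (green - NIR) / (green + NIR); the function names, the reflectance values, and the threshold of 0.0 are hypothetical and do not come from the patent.

```python
def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index: (green - nir) / (green + nir)."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def is_water(green: float, nir: float, threshold: float = 0.0) -> bool:
    """Rule of the form 'NDWI > threshold' (the threshold is an assumed value)."""
    return ndwi(green, nir) > threshold


# Hypothetical (green, nir) reflectance pairs: one water-like, one vegetation-like
pixels = [(0.10, 0.03), (0.08, 0.30)]
labels = [is_water(g, n) for g, n in pixels]
```

A human analyst would pick the threshold from experience, while a decision tree would choose it from the training samples; the rule itself has the same shape in both cases.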
Because decision tree construction uses a greedy search strategy, a global optimum cannot be guaranteed; that is, the extracted knowledge is biased. By introducing sample randomness and feature randomness, the random forest algorithm, which combines a series of decision trees, achieves better classification performance than a single decision tree. However, a random forest that aggregates a series of decision trees yields far too many rules to be understood.
Disclosure of Invention
To solve the problem of acquiring understandable explicit knowledge from existing classification results, the invention provides a knowledge extraction method fusing a genetic algorithm and a decision tree algorithm. The invention uses a genetic algorithm to simulate sample randomness and feature randomness, creates a series of decision trees, and screens their rules, thereby effectively converting implicit knowledge into explicit knowledge that people can understand.
The technical scheme adopted by the invention for solving the technical problem is as follows:
The invention discloses a knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, which comprises the following steps:
Step one, preparing existing remote sensing classification result data and remote sensing classification feature data, wherein a remote sensing classification feature image of the area covered by the classification result is acquired with the Google Earth Engine cloud platform and used as the remote sensing classification feature data;
Step two, acquiring a training sample set by stratified random sampling according to the existing remote sensing classification result data;
Step three, initializing a genetic algorithm, with the number of genes set to the number of remote sensing classification features; the different gene combinations generated as the genetic algorithm iterates simulate the feature randomness of the random forest algorithm;
Step four, establishing an evaluation function of the genetic algorithm, and within the function drawing samples from the training sample set by random sampling, wherein the proportion of samples drawn is between 50% and 90%, so as to simulate the sample randomness of the random forest algorithm;
Step five, in the evaluation function of the genetic algorithm, constructing a decision tree from the random features and random samples, and extracting the rules generated by the decision tree;
Step six, in the evaluation function of the genetic algorithm, traversing the rules generated by the decision tree and screening the rules related to the land-cover class of interest; classifying the relevant remote sensing classification features according to these rules to obtain the classification result corresponding to the rules for the class of interest under the random features and random samples;
Step seven, in the evaluation function of the genetic algorithm, comparing the classification result computed from the rule chain with the existing remote sensing classification result pixel by pixel, calculating the error rate, and taking the error rate as the fitness of the evaluation function;
Step eight, iterating the genetic algorithm by repeating step three to step seven, and stopping when the specified number of iterations is reached or the convergence condition is met, to obtain a series of rules and their corresponding error rates; sorting the rules by error rate in ascending order to find the rule closest to the existing remote sensing classification result, and taking the rule with the minimum error rate as explicit knowledge.
Further, in step one, a partial wetland interpretation result of the Jilin Xianghai National Nature Reserve in 2020 is selected as the existing remote sensing classification result data; the Google Earth Engine cloud platform is used to obtain Sentinel-1 SAR images and Sentinel-2 MSI images of the Jilin Xianghai National Nature Reserve from May to October 2020, median compositing is performed on each, the classification features are calculated and combined with the band features into a remote sensing classification feature image, and the remote sensing classification feature data are obtained.
Further, in step two, stratified random sampling is performed on the existing remote sensing classification result data with the sampleStratified function of the R raster package; the wetland and non-wetland categories of the Jilin Xianghai National Nature Reserve in 2020 are sampled at random in equal proportion to obtain a training sample set with a total sample size of 20000; the training sample set is traversed with the rowFromCell and colFromCell functions, and the features of each sample are obtained from its position.
Further, in step three, the genetic algorithm is initialized with the rbga.bin function of the R genalg package; the number of genes is set to the number of classification features, the population size to 200, the number of iterations to 100, and the mutation rate to 0.01.
Further, in step four, the evalFunc function of the genetic algorithm is written, in which 75% of the samples are obtained with the R createDataPartition function and used to train a decision tree, and the remainder is discarded, thereby simulating sample randomness.
Further, in step five, in the evalFunc function, a decision tree is constructed with the rpart function of the R rpart package, giving a decision tree built under feature randomness and sample randomness.
Further, in step six, in the evalFunc function, the rules generated by the decision tree are traversed, and the wetland-related rules are selected to classify the remote sensing classification feature data, giving a classification result.
Further, in step seven, in the evalFunc function, the classification result obtained from the rules is compared with the existing remote sensing classification result, and the error rate is calculated and used as the fitness of the evalFunc function.
Further, in step eight, the rbga.bin function is run, iterating and optimizing until the specified number of iterations is reached or the convergence condition is met, to obtain a series of rules and their corresponding error rates, wherein the rule with the minimum error rate is the explicit knowledge characterizing the wetland.
The invention has the beneficial effects that:
Based on the existing remote sensing classification result and remote sensing classification features, a training sample set is obtained by stratified random sampling; a genetic algorithm is used to simulate the feature randomness and sample randomness of the random forest algorithm, and decision trees are constructed from randomly generated gene combinations and randomly selected subsets of samples, yielding classification rules of the kind a random forest would contain. For each constructed decision tree, the rule chain corresponding to the category of interest is screened and a classification result is obtained from the rules; the error rate between this result and the existing classification result is taken as the fitness of the evaluation function, so that the genetic algorithm evolves toward the optimal rule (that is, the rule closest to the existing classification result). By sorting the rules for the category of interest by error rate, the optimal rule (understandable explicit knowledge) can be obtained.
The invention can effectively convert implicit knowledge into understandable explicit knowledge. It solves the problem of acquiring understandable explicit knowledge from existing classification results, and it addresses both the bias that arises when knowledge is acquired with a decision tree algorithm alone and the excessive number of rules produced by a random forest algorithm, which makes its knowledge impractical to acquire.
The knowledge extraction method fusing the genetic algorithm and the decision tree algorithm is quick and effective, has repeatability and robustness, and has extremely important significance in the fields of remote sensing classification, data mining and the like.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
Fig. 1 shows the partial wetland interpretation result of the Jilin Xianghai National Nature Reserve in 2020, used as the existing remote sensing classification result data.
Fig. 2 shows a decision tree constructed under feature randomness and sample randomness.
Fig. 3 shows the classification result obtained from the rules of the decision tree.
Fig. 4 shows the generation-by-generation minimum error rate of the genetic algorithm.
Fig. 5 shows the classification result corresponding to the explicit knowledge extracted by the knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, together with the existing wetland interpretation data.
Fig. 6 shows the classification result obtained using only the decision tree algorithm, together with the existing wetland interpretation data.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The invention discloses a knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, which mainly comprises the following steps:
Step one, preparing the existing remote sensing classification result data and remote sensing classification feature data, wherein a remote sensing classification feature image of the area covered by the classification result is acquired with the Google Earth Engine cloud platform and used as the remote sensing classification feature data. In remote sensing, the existing remote sensing classification result data are interpretation results in vector or raster form; in data mining, they correspond to a ground-truth data set.
Specifically: a partial wetland interpretation result of the Jilin Xianghai National Nature Reserve in 2020 is selected as the existing remote sensing classification result data (as shown in Fig. 1); the Google Earth Engine cloud platform is used to obtain Sentinel-1 SAR images and Sentinel-2 MSI images of the Jilin Xianghai National Nature Reserve from May to October 2020, median compositing is performed on each, the classification features are calculated and combined with the band features into a remote sensing classification feature image, and the remote sensing classification feature data are obtained. The classification features can be specified by the user and mainly include band features, index features, texture features and the like.
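The median compositing and index calculation described above can be sketched in plain Python. This is only an illustration of the arithmetic: the patent performs these operations on the Google Earth Engine platform, and the function names and pixel values below are hypothetical.

```python
from statistics import median


def median_composite(stack):
    """Per-pixel median over a time series of flattened single-band images.

    stack: list of images, each a list of pixel values of equal length.
    """
    return [median(values) for values in zip(*stack)]


def normalized_difference(a, b):
    """Per-pixel (a - b) / (a + b), returning 0.0 where the denominator is 0."""
    return [0.0 if (x + y) == 0 else (x - y) / (x + y) for x, y in zip(a, b)]


# Hypothetical three-date series for two bands over four pixels
b8_series = [[3000, 2500, 400, 3100], [3200, 2600, 500, 2900], [2800, 2700, 300, 3000]]
b6_series = [[2000, 2400, 450, 2100], [2100, 2500, 480, 2000], [1900, 2300, 500, 2200]]

b8 = median_composite(b8_series)
b6 = median_composite(b6_series)
nd_b8_b6 = normalized_difference(b8, b6)

# Band features and index features are stacked into one feature set
features = {"B8": b8, "B6": b6, "(B8-B6)/(B8+B6)": nd_b8_b6}
```

Taking the median over the growing season suppresses outliers such as clouds or transient flooding before the index features are derived.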
Step two, acquiring a training sample set by stratified random sampling according to the existing remote sensing classification result data. The existing remote sensing classification result data contain the classification result for the category of interest, and may take the form of a binary classification, a multi-class classification, or other forms.
Specifically: stratified random sampling is performed on the existing remote sensing classification result data with the sampleStratified function of the R raster package; the wetland and non-wetland categories of the Jilin Xianghai National Nature Reserve in 2020 are sampled at random in equal proportion (a sample-size ratio of 1:1) to obtain a training sample set with a total sample size of 20000; the training sample set is traversed with the rowFromCell and colFromCell functions, and the features of each sample are obtained from its position.
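A minimal sketch of equal-allocation stratified sampling and of converting a cell index to a row and column follows. It is not the R implementation named in the text: the helper names are hypothetical, the raster is a tiny made-up example, and cell indices are 0-based and row-major here, which is an assumption of this sketch.

```python
import random


def stratified_sample(labels, per_class, seed=0):
    """Draw `per_class` cell indices at random from each class (equal allocation)."""
    rng = random.Random(seed)
    by_class = {}
    for cell, label in enumerate(labels):
        by_class.setdefault(label, []).append(cell)
    sample = []
    for label in sorted(by_class):
        cells = by_class[label]
        sample.extend(rng.sample(cells, min(per_class, len(cells))))
    return sample


def row_col_from_cell(cell, ncols):
    """Convert a 0-based, row-major cell index to (row, col)."""
    return divmod(cell, ncols)


# Hypothetical 4 x 5 classified raster: 1 = wetland, 0 = non-wetland
ncols = 5
labels = [1, 1, 0, 0, 0,
          1, 1, 1, 0, 0,
          0, 0, 1, 1, 0,
          0, 0, 0, 1, 1]

cells = stratified_sample(labels, per_class=3)
positions = [row_col_from_cell(c, ncols) for c in cells]
```

The positions can then be used to look up each sample's feature values in the feature image, which mirrors the role of the row and column lookup described in the text.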
Step three, initializing a genetic algorithm, with the number of genes set to the number of remote sensing classification features; the different gene combinations generated as the genetic algorithm iterates simulate the feature randomness of the random forest algorithm.
Specifically: the genetic algorithm is initialized with the rbga.bin function of the R genalg package. The number of genes is set to the number of classification features, the population size to 200, the number of iterations to 100, and the mutation rate to 0.01.
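One way to picture the binary encoding is the sketch below, in which each gene switches one classification feature on or off. The population size, iteration count, and mutation rate are the values stated in the text; the feature names and helper functions are hypothetical, and the code does not reproduce the internals of the genalg package.

```python
import random

FEATURES = ["VV", "VH", "B2", "B3", "B4", "B8"]        # hypothetical feature names
POP_SIZE, ITERATIONS, MUTATION_RATE = 200, 100, 0.01   # values stated in the text


def random_chromosome(n_genes, rng):
    """One binary chromosome: gene i == 1 means feature i is used."""
    return [rng.randint(0, 1) for _ in range(n_genes)]


def selected_features(chromosome, names):
    """Decode a chromosome into the feature subset it switches on."""
    return [name for gene, name in zip(chromosome, names) if gene == 1]


def mutate(chromosome, rate, rng):
    """Flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


rng = random.Random(42)
population = [random_chromosome(len(FEATURES), rng) for _ in range(POP_SIZE)]
subset = selected_features([1, 0, 1, 0, 1, 0], FEATURES)
```

Because every individual carries a different mask, each decision tree built during evaluation sees a different subset of features, which is how feature randomness is simulated.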
Step four, establishing an evaluation function of the genetic algorithm, and within the function drawing samples from the training sample set by random sampling, wherein the proportion of samples drawn is between 50% and 90%, so as to simulate the sample randomness of the random forest algorithm.
Specifically: the evalFunc function of the genetic algorithm is written, in which 75% of the samples are obtained with the R createDataPartition function and used to train a decision tree, and the remainder is discarded, thereby simulating sample randomness.
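The subsampling step can be sketched as follows. Note the simplification: createDataPartition in R (from the caret package) balances the class distribution when splitting, whereas this hypothetical helper uses plain random sampling without stratification.

```python
import random


def subsample(indices, fraction, rng):
    """Keep a random `fraction` of the training indices; the rest is discarded."""
    if not 0.5 <= fraction <= 0.9:
        raise ValueError("the text specifies a proportion between 50% and 90%")
    k = round(len(indices) * fraction)
    return sorted(rng.sample(indices, k))


training_ids = list(range(20000))          # total sample size stated in the text
kept = subsample(training_ids, 0.75, random.Random(1))
```

Drawing a fresh subset on every call of the evaluation function means that two individuals with the same feature mask can still produce different trees.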
Step five, in the evaluation function of the genetic algorithm, constructing a decision tree from the random features and random samples, and extracting the rules generated by the decision tree.
Specifically: in the evalFunc function, a decision tree is constructed with the rpart function of the R rpart package, giving a decision tree built under feature randomness and sample randomness. As shown in Fig. 2, the decision tree is drawn with the rpart.plot package, with leaf node 1 representing wetland and leaf node 2 representing non-wetland.
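To make the idea of a rule chain concrete, the sketch below builds a very small greedy tree with Gini impurity and collects the root-to-leaf condition chains that end in the wetland class. It is a simplified stand-in written for illustration, not the rpart algorithm, and the sample values and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels, features):
    """Greedy search for the (feature, threshold) with the lowest weighted Gini."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] < t]
            right = [y for r, y in zip(rows, labels) if r[f] >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, features, depth=0, max_depth=3):
    """Return a leaf (class label) or a node (feature, threshold, left, right)."""
    majority = max(sorted(set(labels)), key=labels.count)
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, features)
    if split is None:
        return majority
    _, f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] < t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] >= t]
    left = build_tree([r for r, _ in lo], [y for _, y in lo], features, depth + 1, max_depth)
    right = build_tree([r for r, _ in hi], [y for _, y in hi], features, depth + 1, max_depth)
    return (f, t, left, right)


def rule_chains(tree, target, path=()):
    """Collect root-to-leaf condition chains that end in the `target` class."""
    if not isinstance(tree, tuple):
        return [list(path)] if tree == target else []
    f, t, left, right = tree
    return (rule_chains(left, target, path + ((f, "<", t),))
            + rule_chains(right, target, path + ((f, ">=", t),)))


# Hypothetical samples: 1 = wetland, 2 = non-wetland
rows = [{"VV": -12, "B5": 900}, {"VV": -11, "B5": 1000},
        {"VV": -20, "B5": 1800}, {"VV": -19, "B5": 2000}]
labels = [1, 1, 2, 2]

tree = build_tree(rows, labels, features=["VV"])   # feature subset chosen by the genes
wetland_rules = rule_chains(tree, target=1)
```

Each chain is a conjunction of threshold conditions, which is the same form as the human-readable rules quoted in the text.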
Step six, in the evaluation function of the genetic algorithm, traversing the rules generated by the decision tree and screening the rules related to the land-cover class of interest; classifying the relevant remote sensing classification features according to these rules to obtain the classification result corresponding to the rules for the class of interest under the random features and random samples. That is, the rules of the decision tree are traversed, and the rule chains corresponding to the leaf nodes of the category of interest are selected as potential knowledge.
Specifically: in the evalFunc function, the rules generated by the decision tree are traversed, and the wetland-related rules are selected to classify the remote sensing classification feature data, namely the rule "VV ≥ -17 & B2/B4 ≥ 0.76" and the rule "VV ≥ -17 & B2/B4 < 0.76 & (B8-B6)/(B8+B6) < 0.046", to obtain a classification result (as shown in Fig. 3). Here VV denotes the VV polarization band (vertical transmit, vertical receive) of the Sentinel-1 satellite, B2 denotes band 2 (blue band) of the Sentinel-2 satellite, B4 denotes band 4 (red band) of the Sentinel-2 satellite, B6 denotes band 6 (red-edge band 2) of the Sentinel-2 satellite, and B8 denotes band 8 (near-infrared band) of the Sentinel-2 satellite.
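Applying rule chains to pixels can be sketched as below, using the two wetland rules quoted in the text. The sketch assumes that both chains share the root split VV ≥ -17, as sibling branches of one tree would; the pixel values and helper names are hypothetical.

```python
import operator

OPS = {"<": operator.lt, ">=": operator.ge}


def derived(pixel):
    """Add the ratio and normalized-difference features used by the rules."""
    out = dict(pixel)
    out["B2/B4"] = pixel["B2"] / pixel["B4"]
    out["(B8-B6)/(B8+B6)"] = (pixel["B8"] - pixel["B6"]) / (pixel["B8"] + pixel["B6"])
    return out


def matches(chain, feats):
    """A chain matches when every (feature, op, threshold) condition holds."""
    return all(OPS[op](feats[f], t) for f, op, t in chain)


def classify(pixel, chains):
    """Wetland (1) if any chain matches, otherwise non-wetland (2)."""
    feats = derived(pixel)
    return 1 if any(matches(c, feats) for c in chains) else 2


WETLAND_CHAINS = [
    [("VV", ">=", -17), ("B2/B4", ">=", 0.76)],
    [("VV", ">=", -17), ("B2/B4", "<", 0.76), ("(B8-B6)/(B8+B6)", "<", 0.046)],
]

# Hypothetical pixels
pixels = [
    {"VV": -12.0, "B2": 800, "B4": 1000, "B6": 2000, "B8": 2500},   # first chain
    {"VV": -12.0, "B2": 500, "B4": 1000, "B6": 2000, "B8": 2100},   # second chain
    {"VV": -22.0, "B2": 800, "B4": 1000, "B6": 2000, "B8": 2500},   # fails VV test
    {"VV": -12.0, "B2": 500, "B4": 1000, "B6": 2000, "B8": 3000},   # fails index test
]
result = [classify(p, WETLAND_CHAINS) for p in pixels]
```

A pixel is labelled wetland when at least one chain holds, so the chains act as a disjunction of conjunctions over the feature image.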
Step seven, in the evaluation function of the genetic algorithm, comparing the classification result computed from the rule chain with the existing remote sensing classification result pixel by pixel, calculating the error rate, and taking the error rate as the fitness of the evaluation function. In data mining terms, the error rate is calculated by comparing the classification result computed from the rule chain with the ground-truth data set, and is used as the fitness value.
Specifically: in the evalFunc function, the classification result obtained from the rules is compared with the existing remote sensing classification result, and the error rate is calculated and used as the fitness of the evalFunc function. When there are multiple study areas, the maximum error rate is taken as the fitness of the evalFunc function. For the rules of step six, the error rate in this study area (the Jilin Xianghai National Nature Reserve, 2020) is 32.3%. At the same time, the rules and their corresponding error rates are saved to a file.
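The pixel-by-pixel error rate and the worst-case fitness over several study areas can be written as a short sketch. The function names and the small label arrays are hypothetical.

```python
def error_rate(predicted, reference):
    """Fraction of pixels whose predicted class differs from the reference class."""
    if len(predicted) != len(reference):
        raise ValueError("rasters must have the same number of pixels")
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


def fitness(areas):
    """Worst-case (maximum) error rate over several (predicted, reference) pairs."""
    return max(error_rate(p, r) for p, r in areas)


area_a = ([1, 1, 2, 2, 1], [1, 2, 2, 2, 1])   # 1 of 5 pixels differs
area_b = ([1, 2, 2, 2], [1, 1, 2, 1])         # 2 of 4 pixels differ
score = fitness([area_a, area_b])
```

Using the maximum rather than the mean favours rules that hold up in every study area, since the genetic algorithm minimises this value.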
Step eight, iterating the genetic algorithm by repeating step three to step seven, and stopping when the specified number of iterations is reached or the convergence condition is met, to obtain a series of rules and their corresponding error rates; sorting the rules by error rate in ascending order to find the rule closest to the existing remote sensing classification result, and taking the rule with the minimum error rate as explicit knowledge.
Specifically: the rbga.bin function is run, iterating and optimizing until the specified number of iterations is reached or the convergence condition is met, to obtain a series of rules and their corresponding error rates. The rule with the minimum error rate is the explicit knowledge characterizing the wetland: "B5 < 1501 & VH/VV ≥ 1.521 & (B3-B11)/(B3+B11) ≥ -0.457", where B3 denotes band 3 (green band) of the Sentinel-2 satellite, B5 denotes band 5 (red-edge band 1) of the Sentinel-2 satellite, B11 denotes band 11 (short-wave infrared band 1) of the Sentinel-2 satellite, and VH denotes the VH polarization band (vertical transmit, horizontal receive) of the Sentinel-1 satellite; its error rate is 11.8%. The generation-by-generation minimum error rate during iteration is shown in Fig. 4. When only the decision tree algorithm is used, the error rate is 14.8%, and the rules obtained are "B11 < 2637 & B8-B4-B3 > -1145 & B8A < 2993 & B8 > -328 & elevation < 169" and "B11 < 2637 & B8-B4-B3 > -1145 & B8A ≥ 2993 & (B3-B12)/(B3+B12) > -0.216", where elevation denotes the terrain elevation, B8A denotes band 8A (red-edge band 4) of the Sentinel-2 satellite, and B12 denotes band 12 (short-wave infrared band 2) of the Sentinel-2 satellite. The classification result corresponding to the explicit knowledge obtained by the invention and the classification result obtained using only the decision tree are shown in Fig. 5 and Fig. 6, respectively. If a random forest algorithm were used, the number of rules could reach the thousands, which cannot satisfy the requirement that knowledge be understandable. The invention therefore has a clear advantage in both error rate and the intelligibility of the acquired knowledge.
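The overall loop (evaluate every individual, record its rule and error rate, evolve, and finally sort by error rate) can be sketched with a minimal binary genetic algorithm. The selection and crossover scheme here is an assumption chosen for brevity and is not the scheme used by rbga.bin; the evaluation function is a toy stand-in for evalFunc, and all names are hypothetical.

```python
import random


def run_ga(n_genes, evaluate, pop_size=20, iterations=10, mutation_rate=0.01, seed=0):
    """Minimal binary GA that minimises `evaluate` and logs every evaluation.

    evaluate(chromosome) must return (rule, error_rate).
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    log = []
    for _ in range(iterations):
        scored = []
        for chrom in population:
            rule, err = evaluate(chrom)
            log.append((err, rule))               # keep every rule with its error rate
            scored.append((err, chrom))
        scored.sort(key=lambda item: item[0])
        parents = [chrom for _, chrom in scored[: pop_size // 2]]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return sorted(log, key=lambda item: item[0])  # ascending error rate


# Toy stand-in for evalFunc: the "rule" is just the feature mask, and the error
# is the share of genes that differ from a hypothetical best mask.
BEST_MASK = [1, 0, 1, 1, 0, 0]


def toy_eval(chrom):
    err = sum(g != b for g, b in zip(chrom, BEST_MASK)) / len(BEST_MASK)
    return "mask=" + "".join(map(str, chrom)), err


ranked = run_ga(len(BEST_MASK), toy_eval)
best_error, best_rule = ranked[0]
```

In the method described by the text, the evaluation function would instead subsample the training set, build a tree on the selected features, extract the wetland rule chains, and return their error rate; the first entry of the sorted log then plays the role of the explicit knowledge.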
The above is only a preferred embodiment of the invention, but the scope of protection of the invention is not limited thereto. Any equivalent substitution or modification that a person skilled in the art makes within the technical scope disclosed by the invention, according to the technical solution and inventive concept of the invention, shall fall within the scope of protection of the invention.

Claims (9)

1. A knowledge extraction method fusing a genetic algorithm and a decision tree algorithm, characterized by comprising the following steps:
Step one, preparing existing remote sensing classification result data and remote sensing classification feature data, wherein a remote sensing classification feature image of the area covered by the classification result is acquired with the Google Earth Engine cloud platform and used as the remote sensing classification feature data;
Step two, acquiring a training sample set by stratified random sampling according to the existing remote sensing classification result data;
Step three, initializing a genetic algorithm, with the number of genes set to the number of remote sensing classification features; the different gene combinations generated as the genetic algorithm iterates simulate the feature randomness of the random forest algorithm;
Step four, establishing an evaluation function of the genetic algorithm, and within the function drawing samples from the training sample set by random sampling, wherein the proportion of samples drawn is between 50% and 90%, so as to simulate the sample randomness of the random forest algorithm;
Step five, in the evaluation function of the genetic algorithm, constructing a decision tree from the random features and random samples, and extracting the rules generated by the decision tree;
Step six, in the evaluation function of the genetic algorithm, traversing the rules generated by the decision tree and screening the rules related to the land-cover class of interest; classifying the relevant remote sensing classification features according to these rules to obtain the classification result corresponding to the rules for the class of interest under the random features and random samples;
Step seven, in the evaluation function of the genetic algorithm, comparing the classification result computed from the rule chain with the existing remote sensing classification result pixel by pixel, calculating the error rate, and taking the error rate as the fitness of the evaluation function;
Step eight, iterating the genetic algorithm by repeating step three to step seven, and stopping when the specified number of iterations is reached or the convergence condition is met, to obtain a series of rules and their corresponding error rates; sorting the rules by error rate in ascending order to find the rule closest to the existing remote sensing classification result, and taking the rule with the minimum error rate as explicit knowledge.
2. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm is characterized in that in the first step, a partial wetland interpretation result of a natural conservation area of Jilin sea country level in 2020 is selected as the existing remote sensing classification result data; utilizing a Google Earth Engine cloud platform to obtain a Sentinel-1SAR image and a Sentinel-2MSI image of 5-10 months from Jilin to the sea state level natural reserve area in 2020, respectively carrying out median synthesis, calculating each classification characteristic, combining the classification characteristic with the waveband characteristic into a remote sensing classification characteristic image, and obtaining remote sensing classification characteristic data.
3. The knowledge extraction method integrating genetic algorithm and decision tree algorithm as claimed in claim 2, wherein in step two, the sampleStratified function of R language filter package is used to perform random sampling in a layered random sampling manner according to the existing remote sensing classification result data; randomly collecting wetland and non-wetland categories of a natural protection area of China sea level in Jilin of 2020 in an equal proportion to obtain a training sample set with the total sample capacity of 20000; and traversing the training sample set by using a rowFromcell function and a colFromcell function, and acquiring the corresponding characteristics of the samples according to the positions.
4. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 3, wherein in step three, the genetic algorithm is initialized using the rbga.bin function of the R language genalg package; the number of genes is set to the number of classification features, the population size to 200, the number of iterations to 100, and the mutation rate to 0.01.
5. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 4, wherein in step four, the evalFunc function of the genetic algorithm is written; 75% of the samples are obtained with the R language createDataPartition function for training the decision tree, and the remainder are directly discarded, so as to simulate sample randomness.
6. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 5, wherein in step five, within the evalFunc function, a decision tree is constructed using the rpart function of the R language rpart package, yielding a decision tree under the conditions of feature randomness and sample randomness.
7. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 6, wherein in step six, the rules generated by the decision tree are traversed within the evalFunc function, and the wetland-related rules are selected to classify the remote sensing classification feature data to obtain the classification result.
8. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 7, wherein in step seven, within the evalFunc function, the classification result obtained from the rules is compared with the existing remote sensing classification result, and the error rate is calculated and used as the fitness returned by the evalFunc function.
9. The knowledge extraction method integrating the genetic algorithm and the decision tree algorithm as claimed in claim 7, wherein in step eight, the rbga.bin function is run and the iterative optimization continues until the specified number of iterations is reached or the convergence condition is met, whereupon the iteration stops and a series of rules and corresponding error rates is obtained; the rule with the minimum error rate is the explicit knowledge representing the wetland.
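
The loop described in steps two to eight can be illustrated with a minimal sketch. It is not the claimed implementation: it uses Python's standard library rather than the R packages named in the claims, a depth-1 decision tree (a single threshold rule) stands in for rpart, the data are synthetic, and the population size and iteration count are far smaller than the claimed 200 and 100. All function names (stratified_sample, fit_stump, evaluate, run_ga, demo) are hypothetical.

```python
import random

def stratified_sample(labels, per_class, rng):
    """Draw the same number of sample indices from every class."""
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    picked = []
    for label in sorted(by_class):
        picked.extend(rng.sample(by_class[label], per_class))
    return picked

def fit_stump(X, y, rows, features):
    """Depth-1 tree (assumed stand-in for rpart): best single threshold rule."""
    best = None
    for f in features:
        values = sorted({X[i][f] for i in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for above in (True, False):
                errors = sum(((X[i][f] > t) == above) != (y[i] == 1) for i in rows)
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    return best[1:] if best else None

def apply_rule(rule, X):
    """Label a row 1 (wetland) when the rule's condition holds, else 0."""
    f, t, above = rule
    return [1 if (row[f] > t) == above else 0 for row in X]

def error_rate(pred, truth):
    return sum(p != q for p, q in zip(pred, truth)) / len(truth)

def evaluate(mask, X, y, train_rows, rng):
    """Fitness: error rate of the rule against the existing classification."""
    features = [f for f, bit in enumerate(mask) if bit]  # feature randomness
    if not features:
        return 1.0, None
    subset = rng.sample(train_rows, int(0.75 * len(train_rows)))  # sample randomness
    rule = fit_stump(X, y, subset, features)
    if rule is None:
        return 1.0, None
    return error_rate(apply_rule(rule, X), y), rule

def run_ga(X, y, train_rows, pop_size=20, iters=5, mutation=0.01, seed=0):
    """Binary genetic algorithm over feature masks; returns rules sorted by error."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    found = []
    for _ in range(iters):
        scored = []
        for mask in pop:
            err, rule = evaluate(mask, X, y, train_rows, rng)
            scored.append((err, mask))
            if rule is not None:
                found.append((err, rule))
        scored.sort(key=lambda s: s[0])
        parents = [m for _, m in scored[: pop_size // 2]]  # keep the fitter half
        pop = parents[:]
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            pop.append([bit ^ (rng.random() < mutation) for bit in child])
    found.sort(key=lambda r: r[0])  # ascending error rate
    return found

def demo():
    """Synthetic data: feature 0 decides the label, features 1 and 2 are noise."""
    rng = random.Random(42)
    X = [[rng.random(), rng.random(), rng.random()] for _ in range(400)]
    y = [1 if row[0] > 0.5 else 0 for row in X]
    train_rows = stratified_sample(y, 50, rng)
    return run_ga(X, y, train_rows)[0]  # (error rate, (feature, threshold, above))

if __name__ == "__main__":
    print(demo())
```

The first element of the sorted list plays the role of the rule with the minimum error rate. In this sketch the rule is a single feature-threshold comparison, whereas a full decision tree would yield one rule per leaf, from which the wetland-related ones would be selected.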
CN202111089955.5A 2021-09-17 2021-09-17 Knowledge extraction method integrating genetic algorithm and decision tree algorithm Active CN113869515B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111089955.5A CN113869515B (en) 2021-09-17 2021-09-17 Knowledge extraction method integrating genetic algorithm and decision tree algorithm

Publications (2)

Publication Number Publication Date
CN113869515A (en) 2021-12-31
CN113869515B (en) 2024-04-05

Family

ID=78996373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111089955.5A Active CN113869515B (en) 2021-09-17 2021-09-17 Knowledge extraction method integrating genetic algorithm and decision tree algorithm

Country Status (1)

Country Link
CN (1) CN113869515B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550578A (en) * 2015-12-10 2016-05-04 上海电机学院 Network anomaly classification rule extracting method based on feature selection and decision tree
CN108038448A (en) * 2017-12-13 2018-05-15 河南理工大学 Semi-supervised random forest Hyperspectral Remote Sensing Imagery Classification method based on weighted entropy
CN110516840A (en) * 2019-07-15 2019-11-29 国网甘肃省电力公司电力科学研究院 Short term prediction method based on the wind light generation power output for improving random forest method
WO2021158989A1 (en) * 2020-02-07 2021-08-12 Lodo Therapeutics Corporation Methods and apparatus for efficient and accurate assembly of long-read genomic sequences

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
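Liu Shu; Jiang Qigang; Ma; Xiao Yan; Li Yuanhua; Cui Can: "Object-oriented wetland classification based on multi-objective genetic random forest feature selection", Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), no. 01, pages 1 - 9 *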

Also Published As

Publication number Publication date
CN113869515B (en) 2024-04-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant