CN116561554A - Feature extraction method, system, equipment and medium of boiler soot blower - Google Patents

Info

Publication number
CN116561554A
Authority
CN
China
Prior art keywords
feature
subset
soot blower
target
adopting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310418102.4A
Other languages
Chinese (zh)
Inventor
李德波
陈拓
陈智豪
陈兆立
金凤雏
王广雷
冯永新
宋景慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Southern Power Grid Power Technology Co Ltd
Original Assignee
China Southern Power Grid Power Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Southern Power Grid Power Technology Co Ltd
Priority to CN202310418102.4A
Publication of CN116561554A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a feature extraction method, system, equipment and medium for a boiler soot blower. A particle swarm algorithm is used to perform feature extraction on a target historical feature data set, generating a feature deletion set and an intermediate feature subset while the number of iterations is counted in real time. A random forest algorithm is used to calculate the prediction precision of the intermediate feature subset; the prediction precision and the iteration count are each compared with their corresponding preset thresholds, and the feature set for the boiler soot blower is determined in combination with the feature deletion set. The existing feature data set is thereby adjusted dynamically, which improves the quality of feature extraction for coal-fired power plant boiler soot blowers, allows the required feature set to be extracted effectively, reduces the training time of the boiler soot blower model, and improves its prediction accuracy.

Description

Feature extraction method, system, equipment and medium of boiler soot blower
Technical Field
The invention relates to the technical field of feature extraction of boiler soot blowers, in particular to a method, a system, equipment and a medium for feature extraction of a boiler soot blower.
Background
The soot blowing system of a coal-fired power plant boiler is a complex system with many mutually coupled parameters and variables. Coal is a mixture that generally contains mineral and inorganic matter, and as it burns in the furnace, ash and slag gradually deposit on the boiler's heating surfaces. Fouled or slagged heating surfaces are detrimental to the safe and stable operation of the boiler soot blower. It is therefore necessary to judge ash deposition and slagging in the boiler from the combustion conditions, so as to improve combustion efficiency and reduce the damage that slagging causes to the boiler. While the boiler is firing, the health state of the heating surfaces can be modeled from monitored combustion data. In practical applications, however, the number of variables describing ash deposition in the boiler system often runs from hundreds to thousands. If too many parameters are used to characterize the ash deposition state, the computational load of the model becomes excessive and overfitting tends to reduce its accuracy. Selecting key features with a suitable feature extraction method is therefore particularly important for modeling the ash deposition state of the boiler.
At present, feature extraction methods used in the industry for coal-fired power plant boiler soot blowers include traditional methods based on human experience and methods based on artificial intelligence. The traditional experience-based methods combine principles such as furnace heat flux and heat-transfer mechanism theory, and the main feature parameters for modeling the boiler heating surfaces are selected manually, based on an analysis of changes in soot-blowing consumption and the related parameters of the boiler combustion process. With such methods it is difficult to screen features quickly and accurately by human experience alone.
Therefore, artificial-intelligence-based feature extraction methods are generally adopted for feature selection for coal-fired power plant boiler soot blowers. Such methods do not require an accurate physical model relating the features to the optimization target, and can handle nonlinear, complex feature extraction problems well by mining the large volume of historical unit data that has been collected. However, existing feature extraction methods for boiler soot blowers do not use an optimization algorithm to obtain an optimal feature subset and do not remove the redundant information in high-dimensional features, so the quality of the extracted features is poor.
Disclosure of Invention
The invention provides a feature extraction method, a system, equipment and a medium of a boiler soot blower, which solve the problems that the existing feature extraction method of the boiler soot blower does not combine with an optimization algorithm to obtain an optimal feature subset, and redundant information of high-dimensional features is not removed, so that the quality of the extracted features is poor.
The invention provides a feature extraction method of a boiler soot blower, which comprises the following steps:
acquiring an initial historical characteristic data set of a boiler soot blower;
performing feature dimension reduction on the initial historical feature data set by adopting a Pearson correlation coefficient to generate a target historical feature data set;
carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the iteration times in real time;
calculating the prediction precision corresponding to the intermediate feature subset by adopting a random forest algorithm;
and comparing the prediction precision and the iteration times with corresponding preset thresholds respectively, and determining a feature set corresponding to the boiler soot blower by combining the feature deletion set.
Optionally, the initial historical feature data set includes a plurality of initial historical feature data; the step of performing feature dimension reduction on the initial historical feature data set by adopting the Pearson correlation coefficient to generate a target historical feature data set comprises the following steps:
calculating the Pearson correlation coefficient between each pair of initial historical feature data;
judging whether the absolute value of the correlation coefficient falls within a preset correlation coefficient interval;
if yes, selecting one of the initial historical feature data corresponding to the correlation coefficient as target historical feature data according to a preset feature extraction standard;
if not, taking all initial historical feature data corresponding to the correlation coefficient as target historical feature data;
and constructing a target historical characteristic data set by adopting all the target historical characteristic data.
Optionally, the step of performing feature extraction on the target historical feature dataset by using a particle swarm algorithm to generate a feature deletion set, an intermediate feature subset and counting the iteration number in real time includes:
initializing the target historical characteristic data set in a population to generate an initial characteristic subset;
calculating the fitness corresponding to each feature in the initial feature subset by adopting a preset fitness formula;
the preset fitness formula is as follows:
[formula not reproduced in the source text]
wherein fit is the fitness, err is the error rate, dimension is the number of features in the initial feature subset, and D is the total number of features in the target historical feature data set;
constructing a feature deletion set from all features whose fitness does not meet the corresponding preset fitness threshold;
and removing the features corresponding to the feature deletion set in the initial feature subset, generating an intermediate feature subset and counting the iteration times in real time.
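The fitness formula itself is not available in this text, so the following is only a minimal sketch under stated assumptions. It uses a weighted sum of error rate and subset-size ratio, a form often seen in particle swarm feature selection; the weight `alpha`, the function names, and the convention that a lower fitness is better (so a feature "does not meet" the threshold when its fitness exceeds it) are all hypothetical and not taken from the patent.

```python
def subset_fitness(err, dimension, total, alpha=0.9):
    """Assumed weighted-sum fitness; NOT the patent's (missing) formula.

    err       -- error rate of the model on the candidate subset
    dimension -- number of features in the candidate subset
    total     -- total number of features D in the target data set
    alpha     -- hypothetical trade-off weight between error and size
    """
    if not 0 < dimension <= total:
        raise ValueError("dimension must be in 1..total")
    return alpha * err + (1 - alpha) * dimension / total


def split_by_threshold(feature_fitness, threshold):
    """Partition features into (deletion set, intermediate subset).

    feature_fitness maps a feature name to its fitness; under the
    lower-is-better assumption, features above the threshold are deleted.
    """
    deletion_set = [f for f, fit in feature_fitness.items() if fit > threshold]
    intermediate = [f for f in feature_fitness if f not in deletion_set]
    return deletion_set, intermediate
```

Keeping the deletion set as a separate list, rather than discarding it, mirrors the later steps, where deleted features may be added back if the prediction precision is insufficient.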
Optionally, the step of calculating the prediction precision corresponding to the intermediate feature subset by adopting a random forest algorithm includes:
encoding the intermediate feature subset to generate a feature vector subset;
respectively selecting a plurality of characteristic vectors in the characteristic vector sub-sets to construct a plurality of training sets and test sets;
performing model construction by adopting all the training sets and the test sets to generate a random forest model;
and inputting the feature vector subset into the random forest model for data evaluation, and generating the prediction precision corresponding to the intermediate feature subset.
Optionally, the step of constructing a model by using all the training set and the test set to generate a random forest model includes:
respectively carrying out decision tree training on each training set by adopting a recursion splitting method to generate a decision tree corresponding to the training set;
testing each decision tree on its corresponding test set to generate prediction data corresponding to the training set;
judging whether all the prediction data meet a preset accuracy threshold;
if yes, constructing a random forest model by adopting all the decision trees;
if not, returning to the step of respectively selecting a plurality of feature vectors in the feature vector subset to construct a plurality of training sets and test sets, until all the prediction data meet the preset accuracy threshold.
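The resample-and-retrain loop described above can be sketched as follows. The text does not say how the training and test sets are drawn, so this sketch assumes bootstrap sampling with the out-of-bag rows as each tree's test set, which is the usual random forest practice. `train_tree` and `score_tree` are hypothetical callbacks standing in for the recursive-splitting trainer and the per-tree evaluation, and `max_rounds` is an added safety limit that the source does not mention.

```python
import random


def bootstrap_splits(n_samples, n_trees, seed=0):
    """Draw one bootstrap training set and out-of-bag test set per tree."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_trees):
        train = [rng.randrange(n_samples) for _ in range(n_samples)]
        in_bag = set(train)
        test = [i for i in range(n_samples) if i not in in_bag]
        splits.append((train, test))
    return splits


def build_forest(n_samples, n_trees, train_tree, score_tree, threshold,
                 max_rounds=20):
    """Resample and retrain until every tree meets the accuracy threshold."""
    for round_idx in range(max_rounds):
        splits = bootstrap_splits(n_samples, n_trees, seed=round_idx)
        trees = [train_tree(train) for train, _ in splits]
        scores = [score_tree(tree, test)
                  for tree, (_, test) in zip(trees, splits)]
        if all(score >= threshold for score in scores):
            return trees  # the accepted trees together form the forest
    raise RuntimeError("no forest met the accuracy threshold")
```

Requiring every tree to pass, as the text states, is a strict criterion; the round limit keeps the sketch from looping indefinitely when the threshold is unreachable.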
Optionally, the preset threshold includes a preset scoring standard, an iteration number threshold and a fitness threshold; the step of comparing the prediction precision and the iteration times with corresponding preset thresholds respectively and determining the feature set corresponding to the boiler soot blower by combining the feature deletion set comprises the following steps:
judging whether the prediction precision meets the preset scoring standard or not;
if yes, the intermediate feature subset is used as a target feature subset;
if not, constructing a target feature subset by adopting the intermediate feature subset and the feature deletion set;
judging whether the iteration times meet the iteration times threshold;
if yes, the target feature subset is used as a feature set corresponding to the boiler soot blower;
if not, determining a feature set corresponding to the boiler soot blower according to the subset fitness corresponding to the target feature subset and the fitness threshold.
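The branching above can be written as a small decision function. This is a sketch only: the source does not define what "meets" means for each threshold, so it assumes that precision meets the standard when it is at least `min_precision`, that the iteration threshold is met when the count reaches `max_iterations`, and that fitness is a cost that meets its threshold when it is no greater than `fit_threshold`. All parameter names are hypothetical.

```python
def decide(intermediate, deletion_set, precision, min_precision,
           iterations, max_iterations, subset_fit, fit_threshold):
    """Return (target_subset, done) for one pass of the comparison step.

    done is True when the target subset should be output as the final
    feature set, and False when the particle swarm should keep iterating.
    """
    if precision >= min_precision:
        # Precision is good enough: keep the reduced subset as it is.
        target = list(intermediate)
    else:
        # Precision is too low: add the deleted features back.
        target = list(intermediate) + [
            f for f in deletion_set if f not in intermediate
        ]

    if iterations >= max_iterations:
        return target, True  # iteration budget reached
    if subset_fit <= fit_threshold:
        return target, True  # fitness goal reached early
    return target, False     # continue with another particle update
```

Adding the deletion set back when precision drops is what makes the selection bidirectional: features can leave the subset and later return to it.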
Optionally, the step of determining the feature set corresponding to the boiler soot blower according to the subset fitness corresponding to the target feature subset and the fitness threshold value includes:
judging whether the subset fitness corresponding to the target feature subset meets the fitness threshold value or not;
if yes, the target feature subset is used as a feature set corresponding to the boiler soot blower;
if not, updating the speed and the position of each target feature in the target feature subset to generate a particle feature set;
and taking the particle characteristic set as the target historical characteristic data set, jumping to execute the step of carrying out characteristic extraction on the target historical characteristic data set by adopting a particle swarm algorithm, generating a characteristic deleting set, a middle characteristic subset and counting the iteration times in real time.
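The speed and position update is not spelled out in the text. As an illustration, the sketch below uses the standard binary particle swarm update (Kennedy and Eberhart), in which each bit of a particle marks whether a feature is selected and a sigmoid of the velocity gives the probability of setting that bit. This encoding is an assumption rather than something the source states, and the inertia weight `w`, the acceleration coefficients `c1` and `c2`, and the clamp `v_max` are hypothetical defaults.

```python
import math
import random


def update_particle(position, velocity, personal_best, global_best, rng,
                    w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary-PSO step for a single particle.

    position, personal_best and global_best are 0/1 lists (1 = feature
    selected); velocity is a list of floats of the same length.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Inertia term plus pulls toward the personal and global bests.
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))        # clamp to avoid saturation
        prob = 1.0 / (1.0 + math.exp(-v))     # sigmoid transfer function
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob else 0)
    return new_position, new_velocity
```

Passing in a seeded `random.Random` instance keeps a run reproducible, which helps when comparing feature subsets across iterations.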
The invention also provides a feature extraction system of the boiler soot blower, which comprises:
the initial historical characteristic data set acquisition module is used for acquiring an initial historical characteristic data set of the boiler soot blower;
the target historical characteristic data set generation module is used for carrying out characteristic dimension reduction on the initial historical characteristic data set by adopting a Pearson correlation coefficient to generate a target historical characteristic data set;
the feature deletion set, intermediate feature subset and iteration number generation module is used for carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the iteration times in real time;
the prediction precision calculation module is used for calculating the prediction precision corresponding to the intermediate feature subset by adopting a random forest algorithm;
and the feature set determining module is used for comparing the prediction precision and the iteration times with corresponding preset thresholds respectively and determining a feature set corresponding to the boiler soot blower by combining the feature deletion set.
The invention also provides an electronic device comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the feature extraction method of a boiler soot blower according to any one of the above.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed, implements the feature extraction method of a boiler soot blower according to any one of the above.
From the above technical scheme, the invention has the following advantages:
According to the invention, an initial historical feature data set of the boiler soot blower is acquired, and feature dimension reduction is performed on it by adopting the Pearson correlation coefficient to generate a target historical feature data set. Feature extraction is performed on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, while the number of iterations is counted in real time. The prediction precision corresponding to the intermediate feature subset is calculated by adopting a random forest algorithm, the prediction precision and the iteration count are each compared with their corresponding preset thresholds, and the feature set corresponding to the boiler soot blower is determined in combination with the feature deletion set. This solves the technical problem that existing feature extraction methods for boiler soot blowers do not use an optimization algorithm to obtain an optimal feature subset and do not remove the redundant information in high-dimensional features, so that the quality of the extracted features is poor. Correlation analysis with the Pearson correlation coefficient removes a large share of the highly correlated feature parameters. Features are then extracted with the particle swarm algorithm, and their importance is scored with the random forest algorithm. The existing feature data set is thereby adjusted dynamically, which improves the quality of feature extraction for coal-fired power plant boiler soot blowers, allows the required feature set to be extracted effectively, reduces the training time of the boiler soot blower model, and improves its prediction accuracy.
Drawings
In order to illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of steps of a method for extracting features of a soot blower of a boiler according to a first embodiment of the present invention;
FIG. 2 is a flow chart of steps of a method for extracting features of a soot blower of a boiler according to a second embodiment of the present invention;
FIG. 3 is a block flow diagram of a method for extracting features of a soot blower of a boiler according to a second embodiment of the present invention;
FIG. 4 is a block diagram of a feature extraction system for a soot blower of a boiler in accordance with a third embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a feature extraction method, a system, equipment and a medium of a boiler soot blower, which are used for solving the technical problems that the existing feature extraction method of the boiler soot blower does not combine an optimization algorithm to obtain an optimal feature subset, and redundant information of high-dimensional features is not removed, so that the quality of the extracted features is poor.
In order to make the objects, features and advantages of the present invention more comprehensible, the technical solutions in the embodiments of the present invention are described in detail below with reference to the accompanying drawings, and it is apparent that the embodiments described below are only some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to FIG. 1, FIG. 1 is a flowchart of the steps of a method for extracting features of a boiler soot blower according to a first embodiment of the present invention.
The first embodiment of the invention provides a feature extraction method of a boiler soot blower, which comprises the following steps:
Step 101, acquiring an initial historical feature data set of the boiler soot blower.
In an embodiment of the invention, the initial historical feature data set of the boiler soot blower is obtained by collecting 15000 samples from a Distributed Control System (DCS). The initial historical feature data set comprises 44 features such as unit load, coal supply quantity, feedwater flow, exhaust gas temperature, EF2-layer auxiliary air at corner 1, and F-layer fuel air at corner 1. The controllable variables are: coal supply quantity C, feedwater flow F, primary air volume (A1, A2, A3), secondary air volume (S1-S22) and primary air pressure (W1, W2). The state variables are: unit load L, oxygen content at the economizer outlet (O1-O7), flue gas oxygen content (S) and flue gas temperature (T1-T6). The output variable is the furnace outlet NOx emission (NX).
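As a quick consistency check on the variable groups listed above, the following sketch builds column names from the symbols in the text and counts them. The names are hypothetical labels derived from those symbols, not actual DCS tag names, and the sketch assumes that the 44 features are the controllable and state variables, with NX treated separately as the output.

```python
def build_feature_names():
    """Return (controllable, state) name lists from the listed symbols."""
    controllable = (
        ["C", "F"]                          # coal supply, feedwater flow
        + [f"A{i}" for i in range(1, 4)]    # primary air volume A1-A3
        + [f"S{i}" for i in range(1, 23)]   # secondary air volume S1-S22
        + [f"W{i}" for i in range(1, 3)]    # primary air pressure W1-W2
    )
    state = (
        ["L"]                               # unit load
        + [f"O{i}" for i in range(1, 8)]    # economizer outlet oxygen O1-O7
        + ["S"]                             # flue gas oxygen content
        + [f"T{i}" for i in range(1, 7)]    # flue gas temperature T1-T6
    )
    return controllable, state


controllable, state = build_feature_names()
TARGET = "NX"  # furnace outlet NOx emission (output variable)
```

Under that reading the groups give 29 controllable and 15 state variables, which matches the stated total of 44 input features.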
Step 102, performing feature dimension reduction on the initial historical feature data set by adopting the Pearson correlation coefficient to generate a target historical feature data set.
In the embodiment of the invention, the Pearson correlation coefficient is calculated between the initial historical feature data. Whether the absolute value of each correlation coefficient falls within a preset correlation coefficient interval is then judged. If so, one of the initial historical feature data corresponding to that correlation coefficient is selected as target historical feature data according to a preset feature extraction standard. If not, all initial historical feature data corresponding to that correlation coefficient are taken as target historical feature data. Finally, the target historical feature data set is constructed from all the target historical feature data.
Step 103, carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the iteration times in real time.
In the embodiment of the invention, population initialization is performed on the target historical feature data set to generate an initial feature subset. The fitness corresponding to each feature in the initial feature subset is calculated with a preset fitness formula, and a feature deletion set is constructed from all features whose fitness does not meet the preset fitness threshold. The features in the feature deletion set are then removed from the initial feature subset to generate an intermediate feature subset, and the number of iterations is counted in real time.
Step 104, calculating the prediction precision corresponding to the intermediate feature subset by adopting a random forest algorithm.
In the embodiment of the invention, the intermediate feature subset is encoded to generate a feature vector subset, and a plurality of feature vectors are selected from the feature vector subset to construct a plurality of training sets and test sets. A random forest model is constructed from all the training sets and test sets, and the feature vector subset is input into the random forest model for data evaluation, generating the prediction precision corresponding to the intermediate feature subset.
Step 105, respectively comparing the prediction precision and the iteration times with corresponding preset thresholds, and determining a feature set corresponding to the boiler soot blower by combining the feature deletion set.
The preset thresholds comprise a preset scoring standard, an iteration number threshold and a fitness threshold. The preset scoring standard refers to the R² scoring standard. The iteration number threshold is the limit on the number of iterative updates performed by the particle swarm algorithm. The fitness threshold is the value that the fitness of each particle in the target feature subset needs to meet.
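For reference, the R² score (coefficient of determination) used as the scoring standard can be computed as in the sketch below. The function name `r2_score` and its arguments are illustrative; the source does not state which implementation is used.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant target")
    return 1 - ss_res / ss_tot
```

A perfect prediction scores 1, predicting the mean of the targets scores 0, and a model worse than the mean scores below 0, so a preset scoring standard would normally be a value close to 1.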
In the embodiment of the invention, whether the prediction precision meets the preset scoring standard is judged. If so, the intermediate feature subset is taken as the target feature subset. If not, the target feature subset is constructed from the intermediate feature subset and the feature deletion set. Whether the iteration count meets the iteration number threshold is then judged. If so, the target feature subset is taken as the feature set corresponding to the boiler soot blower. If not, the feature set corresponding to the boiler soot blower is determined based on the subset fitness corresponding to the target feature subset and the fitness threshold.
In the embodiment of the invention, an initial historical feature data set of the boiler soot blower is acquired, and feature dimension reduction is performed on it by adopting the Pearson correlation coefficient to generate a target historical feature data set. Feature extraction is performed on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, while the number of iterations is counted in real time. The prediction precision corresponding to the intermediate feature subset is calculated by adopting a random forest algorithm, the prediction precision and the iteration count are each compared with their corresponding preset thresholds, and the feature set corresponding to the boiler soot blower is determined in combination with the feature deletion set. This solves the technical problem that existing feature extraction methods for boiler soot blowers do not use an optimization algorithm to obtain an optimal feature subset and do not remove the redundant information in high-dimensional features, so that the quality of the extracted features is poor. Correlation analysis with the Pearson correlation coefficient removes a large share of the highly correlated feature parameters. A bidirectional feature selection method based on PSO-RF is then adopted: the Particle Swarm Optimization (PSO) algorithm has good global search capability and is used to search the feature space for an optimal feature subset, while the Random Forest (RF) algorithm is used to evaluate the quality of each feature subset.
The existing feature data set is thereby adjusted dynamically, which improves the quality of feature extraction for coal-fired power plant boiler soot blowers, allows the required feature set to be extracted effectively, reduces the training time of the boiler soot blower model, and improves its prediction accuracy.
Referring to FIG. 2, FIG. 2 is a flowchart of the steps of a method for extracting features of a boiler soot blower according to a second embodiment of the present invention.
The feature extraction method of a boiler soot blower provided by the second embodiment of the invention comprises the following steps:
Step 201, acquiring an initial historical feature data set of the boiler soot blower.
In the embodiment of the present invention, the implementation process of step 201 is similar to that of step 101, and will not be repeated here.
Step 202, performing feature dimension reduction on the initial historical feature data set by adopting the Pearson correlation coefficient to generate a target historical feature data set.
Further, the initial historical feature data set includes a plurality of initial historical feature data, and step 202 may include the following sub-steps S11-S15:
S11, calculating the Pearson correlation coefficient between each pair of initial historical feature data.
S12, judging whether the absolute value of the correlation coefficient falls within a preset correlation coefficient interval.
S13, if so, selecting one of the initial historical feature data corresponding to the correlation coefficient as target historical feature data according to a preset feature extraction standard.
S14, if not, taking all initial historical feature data corresponding to the correlation coefficient as target historical feature data.
S15, constructing a target historical feature data set by adopting all target historical feature data.
The preset correlation coefficient interval is the interval in which the correlation between initial historical feature data is very strong. By absolute value, a correlation coefficient of 0.0 to 0.2 indicates very weak correlation, 0.2 to 0.4 weak correlation, 0.4 to 0.6 moderate correlation, 0.6 to 0.8 strong correlation, and 0.8 to 1.0 very strong correlation.
The preset feature extraction standard is a rule, set according to actual needs, for choosing one of the two initial historical feature data corresponding to a correlation coefficient whose absolute value falls within the preset correlation coefficient interval. For example, it may be set to retain whichever of the two initial historical feature data has the smaller overall correlation coefficient.
In the embodiment of the invention, Pearson correlation analysis is first used to calculate the degree of correlation between the initial historical feature data, and initial historical feature data with high correlation coefficients are screened out, yielding a target historical feature data set with lower mutual correlation. 15000 samples are collected from a Distributed Control System (DCS). The initial historical feature data set includes 44 feature data such as unit load, coal feed rate, feedwater flow, exhaust gas temperature, EF2-layer auxiliary air at corner 1, and F-layer fuel air at corner 1. The controllable variables are: coal feed rate (C), feedwater flow (F), primary air flow (A1, A2, A3), secondary air flow (S1-S22), and primary air pressure (W1, W2). The state variables are: unit load (L), oxygen content at the economizer outlet (O1-O7), oxygen content of the flue gas (S), and flue gas temperature (T1-T6). The output variable is the NOx emission at the furnace outlet (NX). A heat map is used to show the strength of correlation between the initial historical feature data. The correlation coefficient represents the degree of correlation between features; its range is [-1, 1], and its sign indicates the direction of the correlation. The larger the absolute value, the stronger the correlation.
Feature dimension reduction on the given initial historical feature data set shows that S1 and S2, S7 and S6, S13 and S14, S17 and S19, and T4, T5 and T6 are very strongly correlated. One variable is retained from each group of very strongly correlated variables, so S1, S7, S13, S17 and T4 are finally retained, and S2, S6, S14, S19, T5 and T6 are removed.
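The correlation-based reduction of sub-steps S11-S15 can be illustrated with a minimal Python sketch. It assumes the data are held as a dictionary of feature name to sample list, takes 0.8 as the lower bound of the very strong interval, and interprets the "overall correlation coefficient" of a feature as the sum of its absolute correlations with all other features. The function names `pearson` and `reduce_features` and that interpretation are illustrative assumptions, not taken from the source.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: correlation undefined, treated as 0 here
    return cov / (sx * sy)

def reduce_features(data, threshold=0.8):
    """data maps feature name -> list of samples.
    Returns the names kept after dropping one feature of every pair
    whose absolute correlation is at least `threshold`."""
    names = list(data)
    corr = {(a, b): abs(pearson(data[a], data[b]))
            for a in names for b in names if a != b}
    # Assumed meaning of "overall correlation": sum of |r| with all other features.
    overall = {a: sum(corr[(a, b)] for b in names if b != a) for a in names}
    kept = set(names)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in kept and b in kept and corr[(a, b)] >= threshold:
                # Retain the feature with the smaller overall correlation.
                kept.discard(a if overall[a] > overall[b] else b)
    return [n for n in names if n in kept]
```

With three columns where two are nearly proportional, the sketch keeps one of the two proportional columns plus the unrelated column.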
Step 203, carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the number of iterations in real time.
Further, step 203 may comprise the following sub-steps S21-S24:
S21, initializing the population of the target historical feature data set to generate an initial feature subset.
S22, calculating the fitness corresponding to each feature in the initial feature subset by adopting a preset fitness formula.
The preset fitness formula is as follows:
wherein fit is the fitness, err is the error rate, dimension is the feature number corresponding to the initial feature subset, and D is the total feature number corresponding to the target historical feature data set.
S23, constructing a feature deletion set by adopting features with all fitness not meeting the corresponding preset fitness threshold.
S24, removing the features corresponding to the feature deletion set in the initial feature subset, generating an intermediate feature subset and counting the iteration times in real time.
The preset fitness threshold value refers to a critical value for screening the optimal features.
In the embodiment of the invention, in the particle swarm algorithm each dimension of a particle can take only the value 0 or 1, where 1 indicates that the feature in that dimension is selected in the current iteration and 0 indicates that it is discarded. The feature selection result of the current iteration is thus obtained for the whole particle population. At the start of the particle swarm algorithm, an initial value in the interval [0, 1] is assigned to every particle. For convenience of statistics and observation, an initial value greater than 0.5 is set to 1 and an initial value less than or equal to 0.5 is set to 0. The result is recorded in x(i, j), where i denotes the particle in the i-th row and j denotes the j-th feature dimension. In this initialization the selected features account for about 50% of the original features, which meets general feature selection requirements. The specific formula is as follows:

$$x(i,j)=\begin{cases}1, & r(i,j) > 0.5\\ 0, & r(i,j) \le 0.5\end{cases}$$

where r(i, j) denotes the initial value drawn from [0, 1] for the j-th dimension of the i-th particle.
In practice, feature selection must consider both the dimension of the final feature subset and the accuracy of sample classification, and these two objectives jointly determine how good a particle's selection is. In the fitness function, accuracy is therefore given a weight of 0.8 and feature dimension a weight of 0.2, as shown in the preset fitness formula.
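The 0.8 / 0.2 weighting can be made concrete with a short sketch. The weights and the symbols err, dimension and D come from the text; the way they are combined below, a weighted sum of accuracy (1 - err) and the fraction of features removed (1 - dimension / D), is an assumption, since the formula itself is not reproduced in this text. The function name `fitness` is hypothetical.

```python
def fitness(err, dimension, total, w_acc=0.8, w_dim=0.2):
    """Assumed fitness: higher is better.
    err       -- error rate of the model on the candidate subset
    dimension -- number of features in the candidate subset
    total     -- total number of features D in the target data set
    """
    if total <= 0 or not 0 <= dimension <= total:
        raise ValueError("dimension must lie in [0, total] with total > 0")
    # Weighted sum of accuracy and of the share of features removed (assumed form).
    return w_acc * (1.0 - err) + w_dim * (1.0 - dimension / total)
```

Under this assumed form, two subsets with the same error rate are ranked by size, so the smaller subset receives the higher fitness.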
To prevent particles from becoming trapped in a local optimum and biasing the result, random perturbation is applied to the particles during selection. Specifically, values of x(i, j) are changed, with the number of changed entries accounting for 5% of the whole matrix. In the iterative process, a random perturbation variable randper, which takes the value 0 or 1, and a perturbation counter randcount, whose initial value is 0, are set. If the generated randper is 1 and the ratio of randcount to the total number of iterations does not exceed 30%, random perturbation is performed; otherwise it is not. The specific modification formula is as follows:
Therefore, population initialization is performed on the target historical feature data set to generate an initial feature subset, and the fitness corresponding to each feature in the initial feature subset is calculated using the preset fitness formula. A feature deletion set is constructed from all features whose fitness does not meet the corresponding preset fitness threshold. The features in the feature deletion set are removed from the initial feature subset to generate an intermediate feature subset, and the number of iterations is counted in real time. In other words, the particle swarm algorithm starts searching from the full feature set and deletes the least important features from the current feature subset each time in search of a better feature subset, yielding the intermediate feature subset.
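The binary initialization and the random perturbation described above can be sketched as follows. The 0.5 cut-off, the 5% share of changed entries, the names randper and randcount and the 30% limit come from the text; treating the change to x(i, j) as a bit flip is an assumption, because the modification formula is not reproduced here. The function names are hypothetical.

```python
import random

def init_population(n_particles, n_features, rng):
    """Draw each entry from [0, 1) and binarize: > 0.5 -> 1 (selected), else 0."""
    return [[1 if rng.random() > 0.5 else 0 for _ in range(n_features)]
            for _ in range(n_particles)]

def maybe_perturb(x, rng, randcount, total_iters,
                  flip_fraction=0.05, max_ratio=0.30):
    """Possibly perturb the 0/1 matrix x in place; return the updated counter."""
    randper = rng.randint(0, 1)  # perturbation flag, 0 or 1
    if randper == 1 and randcount / total_iters <= max_ratio:
        cells = [(i, j) for i in range(len(x)) for j in range(len(x[0]))]
        n_flip = max(1, round(flip_fraction * len(cells)))
        for i, j in rng.sample(cells, n_flip):
            x[i][j] = 1 - x[i][j]  # assumed modification: flip the bit
        randcount += 1
    return randcount
```

Passing a seeded `random.Random` instance as `rng` keeps a run reproducible.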
Step 204, the intermediate feature subset is encoded to generate a feature vector subset.
In an embodiment of the invention, each intermediate feature in the intermediate feature subset is encoded and converted into a feature vector that can be used in a machine learning algorithm. And constructing a feature vector subset by adopting all feature vectors.
Step 205, respectively selecting a plurality of feature vectors in the feature vector subsets, and constructing a plurality of training sets and testing sets.
In the embodiment of the invention, a portion of the feature vectors is extracted from the feature vector subset according to a preset extraction requirement to construct a plurality of training sets, and a portion of the features is randomly selected as candidate features according to a preset selection requirement to construct a plurality of test sets.
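One common way to build several training and test sets for a random forest is bootstrap sampling with out-of-bag rows used for testing. The sketch below assumes that convention; the text does not state which extraction and selection requirements are actually used, and the function name `make_train_test_sets` is hypothetical.

```python
import random

def make_train_test_sets(n_rows, n_sets, rng):
    """Return a list of (train_idx, test_idx) index pairs.
    train_idx is a bootstrap sample drawn with replacement;
    test_idx holds the rows that were never drawn (out-of-bag rows)."""
    sets = []
    for _ in range(n_sets):
        train_idx = [rng.randrange(n_rows) for _ in range(n_rows)]
        drawn = set(train_idx)
        test_idx = [i for i in range(n_rows) if i not in drawn]
        sets.append((train_idx, test_idx))
    return sets
```

Each pair can then be used to train and test one decision tree, so no test row is ever seen by the tree it evaluates.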
Step 206, constructing a model by adopting all training sets and test sets to generate a random forest model.
Further, step 206 may include the following sub-steps S31-S35:
S31, carrying out decision tree training on each training set by adopting a recursive splitting method, and generating the decision tree corresponding to each training set.
S32, testing each decision tree on its corresponding test set to generate prediction data corresponding to the training set.
S33, judging whether all the prediction data meet a preset accuracy threshold.
S34, if so, constructing a random forest model by adopting all decision trees.
S35, if not, jumping back to the step of respectively selecting a plurality of feature vectors in the feature vector subset and constructing a plurality of training sets and test sets, until all the prediction data meet the preset accuracy threshold.
In the embodiment of the invention, a decision tree is constructed for each sampled training set and test set. A decision tree is typically built by recursive splitting: starting from the root node, the optimal feature is selected recursively for splitting until the number of data samples on a leaf node reaches a preset minimum or the depth reaches a preset maximum. Each decision tree is tested on its corresponding test set to generate prediction data corresponding to the training set. When the prediction data of all decision trees meet the preset accuracy threshold, all decision trees are used to construct the random forest model. Otherwise, the process jumps back to the step of respectively selecting a plurality of feature vectors in the feature vector subset and constructing a plurality of training sets and test sets, until all prediction data meet the preset accuracy threshold.
Step 207, inputting the feature vector subset into the random forest model for data evaluation, and generating the prediction precision corresponding to the intermediate feature subset.
In the embodiment of the invention, the feature vector subset is input into the random forest model for data evaluation. For each feature vector in the feature vector subset, the prediction results of the individual decision trees are combined by voting to obtain the prediction precision corresponding to the intermediate feature subset, and the quality of the intermediate feature subset is judged by this prediction precision.
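The voting step can be sketched in a few lines. Here each decision tree is represented by a hypothetical callable that maps a sample to a label, which is an assumption made only to keep the example self-contained. For a continuous target such as NOx emissions, a random forest would average the tree outputs rather than take a majority vote.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Majority vote over the per-tree predictions for one sample."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

def prediction_accuracy(trees, samples, labels):
    """Fraction of samples whose voted prediction equals the true label."""
    hits = sum(forest_predict(trees, s) == y for s, y in zip(samples, labels))
    return hits / len(samples)
```

The resulting accuracy is the quantity that would be compared with the preset scoring standard in the next step.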
Step 208, comparing the prediction precision and the number of iterations with their corresponding preset thresholds respectively, and determining the feature set corresponding to the boiler soot blower in combination with the feature deletion set.
Further, the preset thresholds include a preset scoring standard, an iteration number threshold and a fitness threshold, and step 208 may include the following sub-steps S41-S46:
S41, judging whether the prediction precision meets the preset scoring standard.
S42, if so, taking the intermediate feature subset as a target feature subset.
S43, if not, constructing a target feature subset by adopting the intermediate feature subset and the feature deletion set.
S44, judging whether the number of iterations meets the iteration number threshold.
S45, if so, taking the target feature subset as the feature set corresponding to the boiler soot blower.
S46, if not, determining the feature set corresponding to the boiler soot blower according to the subset fitness corresponding to the target feature subset and the fitness threshold.
In the embodiment of the invention, the R² scoring standard is adopted: the R² score between the predicted values and the true values corresponding to the intermediate feature subset is calculated, and the curve of the R² score obtained by the PSO-RF feature selection algorithm against the number of iterations can be plotted to observe the convergence and stability of the algorithm. When the prediction precision meets the preset scoring standard, the intermediate feature subset is taken as the target feature subset. Otherwise, the most recently deleted features are restored, that is, the target feature subset is constructed from the intermediate feature subset and the feature deletion set. It is then judged whether the number of iterations has reached the iteration number threshold; if so, the target feature subset is taken as the feature set corresponding to the boiler soot blower. Otherwise, the feature set corresponding to the boiler soot blower is determined based on the subset fitness corresponding to the target feature subset and the fitness threshold. While the importance of the features is guaranteed, adding the dimension of the current feature subset as an evaluation index reduces feature fluctuation and ensures that the selected optimal feature subset has little redundancy without losing prediction precision.
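The R² score and the keep-or-restore decision of sub-steps S41-S43 can be sketched as below. The R² definition is the standard coefficient of determination; the parameter `min_score`, standing in for the preset scoring standard, and both function names are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot

def choose_target_subset(intermediate, deleted, score, min_score):
    """S41-S43: keep the intermediate subset if the score meets the standard,
    otherwise restore the features that were just deleted."""
    if score >= min_score:
        return list(intermediate)
    return list(intermediate) + [f for f in deleted if f not in intermediate]
```

Recording the score returned by `r2_score` at every iteration gives the convergence curve mentioned above.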
Further, step S46 may include the following sub-steps S461-S464:
S461, judging whether the subset fitness corresponding to the target feature subset meets the fitness threshold.
S462, if so, taking the target feature subset as the feature set corresponding to the boiler soot blower.
S463, if not, updating the velocity and the position of each target feature in the target feature subset to generate a particle feature set.
S464, taking the particle feature set as the target historical feature data set, and jumping back to the step of carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the number of iterations in real time.
Subset fitness is the collective term for the fitness values corresponding to the target features in the target feature subset.
In the embodiment of the invention, when the subset fitness corresponding to the target feature subset meets the fitness threshold, the target feature subset is used as the feature set corresponding to the boiler soot blower. Otherwise, the following updating formula is adopted to update the speed and the position of each target feature in the target feature subset, and a particle feature set is generated. And taking the particle characteristic set as a target historical characteristic data set, jumping to execute the steps of carrying out characteristic extraction on the target historical characteristic data set by adopting a particle swarm algorithm, generating a characteristic deleting set and an intermediate characteristic subset, and counting the iteration times in real time.
The particle swarm optimization algorithm is a global random search algorithm based on swarm intelligence, proposed by simulating the migration and flocking behavior of birds during foraging. Building on observations of the collective behavior of animal groups, the algorithm uses the information shared by individuals in the group so that the motion of the whole group evolves from disorder to order in the solution space of the problem, thereby obtaining an optimal solution. In the particle swarm algorithm, each bird is abstracted as a particle with its own position and velocity. These two parameters determine the particle's motion: the position determines the distance from the optimal value, and the velocity determines the magnitude of the change in each iteration. For an N-dimensional optimization problem with a swarm of size M, the position of particle i (0 < i ≤ M) in the N-dimensional space is expressed as a vector X_i, and its flying velocity is expressed as a vector V_i. Each particle has a fitness value determined by the objective function and knows both its present position and the best position it has found so far (pbest), i.e. the individual position vector. This can be regarded as the particle's own flight experience. In addition, each particle knows the best position found so far by all particles in the whole population (gbest), i.e. the population position vector, which is the best value among all pbest. This can be regarded as the experience of the particle's companions. Each particle determines its next movement from its own experience and the best experience of its companions.
The particles update their velocity and position according to the following formulas, that is, the velocity and position of each target feature in the target feature subset are updated by the following formulas to generate a particle feature set:

$$V_i(t+1)=\omega V_i(t)+c_1\,\mathrm{rand}()\,\bigl(pbest_i(t)-X_i(t)\bigr)+c_2\,\mathrm{rand}()\,\bigl(gbest_i-X_i(t)\bigr)$$

$$X_i(t+1)=X_i(t)+V_i(t+1)$$

where ω represents the inertia factor, which reflects the movement habit of the particle, i.e. its tendency to maintain its previous velocity. When ω is larger, the global search capability of the algorithm is stronger and its local search capability is weaker; when ω is smaller, the global search capability is weaker and the local search capability is stronger. c_1 and c_2 are learning factors, also called acceleration constants, which represent the tendency of a particle to learn from its own best experience and from the best of all particles; adjusting c_1 and c_2 balances the exploration and exploitation capabilities of the algorithm. rand() represents a random number between 0 and 1, drawn separately for each of the two terms. V_i(t+1) represents the target velocity vector after the target feature is updated, and V_i(t) represents the initial velocity vector corresponding to the i-th target feature at the current time. X_i(t+1) represents the target individual position after the target feature is updated, and X_i(t) represents the initial individual position corresponding to the i-th target feature at the current time. pbest_i(t) represents the individual position vector corresponding to the i-th target feature at the current time, and gbest_i represents the population position vector corresponding to the initial particle population at the current time.
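A single particle update can be sketched in Python, assuming the standard PSO velocity rule in which the new velocity is the inertia-weighted old velocity plus two randomly scaled pulls toward pbest and gbest, followed by the position update X_i(t+1) = X_i(t) + V_i(t+1). The default values w = 0.7 and c1 = c2 = 1.5 are illustrative assumptions, not values given in the text, and the function name `pso_step` is hypothetical.

```python
import random

def pso_step(x, v, pbest, gbest, rng, w=0.7, c1=1.5, c2=1.5):
    """One velocity and position update for a single particle.
    x, v, pbest, gbest are equal-length lists of floats."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()  # two independent draws in [0, 1)
        vel = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vel)
        new_x.append(xi + vel)  # position moves by the new velocity
    return new_x, new_v
```

For binary feature masks, the continuous position would still have to be mapped back to 0 or 1, for example with the 0.5 cut-off used at initialization; that mapping is not shown here.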
In the embodiment of the invention, as shown in fig. 3, an initial historical feature data set of the boiler soot blower is acquired, and the Pearson correlation coefficient is used to perform feature dimension reduction on the initial historical feature data set to generate a target historical feature data set. Feature extraction is performed on the target historical feature data set using the particle swarm algorithm: the population is initialized to generate an initial feature subset, the fitness corresponding to each feature in the initial feature subset is calculated, a feature deletion set and an intermediate feature subset are generated, and the number of iterations is counted in real time. The intermediate feature subset is encoded to generate a feature vector subset, and a plurality of feature vectors in the feature vector subset are selected to construct a plurality of training sets and test sets. All training sets and test sets are used to construct a random forest model. The feature vector subset is input into the random forest model for data evaluation to generate the prediction precision corresponding to the intermediate feature subset. The prediction precision and the number of iterations are compared with their corresponding preset thresholds, and the feature set corresponding to the boiler soot blower is determined in combination with the feature deletion set. Correlation analysis is performed using the Pearson correlation coefficient, so that feature parameters with large correlation coefficients are removed. Features are then extracted using the particle swarm algorithm, and the importance of the features is scored using the random forest algorithm.
The particle swarm algorithm starts searching from the full feature set, deletes the least important feature from the current feature subset each time, and searches for a better feature subset. The RF algorithm is used to calculate the prediction accuracy of the current feature subset; if the accuracy decreases, the feature that was just deleted is restored. This cycle repeats until the set number of iterations is reached or the fitness value of the particles reaches a preset threshold. While the importance of the features is guaranteed, adding the dimension of the current feature subset as an evaluation index reduces feature fluctuation and ensures that the selected optimal feature subset has little redundancy without losing prediction precision.
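The delete, evaluate and restore cycle can be illustrated with a simplified greedy stand-in. The sketch omits the particle swarm search and the fitness threshold check, and it relies on two hypothetical callables: `importance(subset)`, returning a score per feature, and `evaluate(subset)`, returning a prediction accuracy. The `protected` set, which stops a restored feature from being retried, is an added assumption.

```python
def backward_select(features, importance, evaluate, max_iters):
    """Greedy backward elimination with restore-on-degradation.
    Returns the selected feature list and its accuracy."""
    current = list(features)
    best = evaluate(current)
    protected = set()  # features whose removal was tried and reverted
    for _ in range(max_iters):
        candidates = [f for f in current if f not in protected]
        if len(current) <= 1 or not candidates:
            break
        scores = importance(current)
        worst = min(candidates, key=lambda f: scores[f])
        trial = [f for f in current if f != worst]
        acc = evaluate(trial)
        if acc < best:
            protected.add(worst)  # accuracy dropped: restore the feature
        else:
            current, best = trial, acc  # deletion accepted
    return current, best
```

In a fuller implementation, `evaluate` would train and score a random forest on the candidate subset, and the stopping rule would also test the particle fitness against its threshold.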
Referring to fig. 4, fig. 4 is a block diagram of a feature extraction system of a soot blower of a boiler according to a third embodiment of the present invention.
The embodiment of the invention provides a feature extraction system of a boiler soot blower, which comprises the following components:
an initial historical feature data set acquisition module 401 for acquiring an initial historical feature data set of the boiler soot blower.
The target historical feature data set generating module 402 is configured to perform feature dimension reduction on the initial historical feature data set by using pearson correlation coefficients, so as to generate a target historical feature data set.
The feature deletion set, intermediate feature subset and iteration number generation module 403 is configured to perform feature extraction on the target historical feature data set by using a particle swarm algorithm, generate a feature deletion set and an intermediate feature subset, and count the number of iterations in real time.
The prediction precision calculation module 404 is configured to calculate the prediction precision corresponding to the intermediate feature subset by using a random forest algorithm.
The feature set determining module 405 is configured to compare the prediction accuracy and the iteration number with corresponding preset thresholds, and determine a feature set corresponding to the boiler soot blower by combining the feature deletion set.
Optionally, the initial historical feature data set includes a plurality of initial historical feature data, and the target historical feature data set generation module 402 includes:
The correlation coefficient calculation module is used for calculating the correlation coefficients between the initial historical feature data by adopting the Pearson correlation coefficient.
The correlation coefficient judging module is used for judging whether the absolute value of the correlation coefficient falls within the preset correlation coefficient interval.
The first target historical feature data determining module is used for, if so, selecting one of the initial historical feature data corresponding to the correlation coefficient as target historical feature data according to a preset feature extraction standard.
The second target historical feature data determining module is used for, if not, taking all initial historical feature data corresponding to the correlation coefficient as target historical feature data.
The target historical feature data set generation sub-module is used for constructing a target historical feature data set by adopting all target historical feature data.
Optionally, the feature deletion set, intermediate feature subset and iteration number generation module 403 includes:
The initial feature subset generation module is used for carrying out population initialization on the target historical feature data set to generate an initial feature subset.
The fitness calculation module is used for calculating the fitness corresponding to each feature in the initial feature subset by adopting a preset fitness formula.
The preset fitness formula is as follows:
wherein fit is the fitness, err is the error rate, dimension is the feature number corresponding to the initial feature subset, and D is the total feature number corresponding to the target historical feature data set.
The feature deletion set construction module is used for constructing a feature deletion set from all features whose fitness does not meet the preset fitness threshold.
The intermediate feature subset and iteration number generation module is used for removing the features corresponding to the feature deletion set from the initial feature subset, generating an intermediate feature subset and counting the number of iterations in real time.
Optionally, the prediction accuracy calculation module 404 includes:
The feature vector subset generating module is used for encoding the intermediate feature subset to generate a feature vector subset.
The training set and test set construction module is used for respectively selecting a plurality of feature vectors in the feature vector subset to construct a plurality of training sets and test sets.
The random forest model generation module is used for carrying out model construction by adopting all training sets and test sets to generate a random forest model.
The prediction precision calculation sub-module is used for inputting the feature vector subset into the random forest model for data evaluation and generating the prediction precision corresponding to the intermediate feature subset.
Optionally, the random forest model generation module may perform the following steps:
carrying out decision tree training on each training set by adopting a recursive splitting method to generate the decision tree corresponding to each training set;
testing each decision tree on its corresponding test set to generate prediction data corresponding to the training set;
judging whether all the prediction data meet a preset accuracy threshold;
if so, constructing a random forest model by adopting all decision trees;
if not, jumping back to the step of respectively selecting a plurality of feature vectors in the feature vector subset and constructing a plurality of training sets and test sets, until all the prediction data meet the preset accuracy threshold.
Optionally, the preset thresholds include a preset scoring standard, an iteration number threshold and a fitness threshold, and the feature set determining module 405 includes:
The prediction precision judging module is used for judging whether the prediction precision meets the preset scoring standard.
The first target feature subset determining module is used for, if so, taking the intermediate feature subset as the target feature subset.
The second target feature subset determining module is used for, if not, constructing the target feature subset by adopting the intermediate feature subset and the feature deletion set.
The iteration number judging module is used for judging whether the number of iterations meets the iteration number threshold.
The first feature set determining sub-module is used for, if so, taking the target feature subset as the feature set corresponding to the boiler soot blower.
The second feature set determining sub-module is used for, if not, determining the feature set corresponding to the boiler soot blower according to the subset fitness corresponding to the target feature subset and the fitness threshold.
Optionally, the second feature set determining sub-module may perform the following steps:
judging whether the subset fitness corresponding to the target feature subset meets the fitness threshold;
if so, taking the target feature subset as the feature set corresponding to the boiler soot blower;
if not, updating the velocity and the position of each target feature in the target feature subset to generate a particle feature set;
taking the particle feature set as the target historical feature data set, and jumping back to the step of carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the number of iterations in real time.
The embodiment of the invention also provides electronic equipment, which comprises: a memory and a processor, the memory storing a computer program; the computer program, when executed by a processor, causes the processor to perform the method of feature extraction of a boiler soot blower of any of the embodiments described above.
The memory may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. The memory has memory space for program code to perform any of the method steps described above. For example, the memory space for the program code may include individual program code for implementing the various steps in the above method, respectively. The program code can be read from or written to one or more computer program products. These computer program products comprise a program code carrier such as a hard disk, a Compact Disc (CD), a memory card or a floppy disk. The program code may be compressed, for example, in a suitable form. These codes, when executed by a computing processing device, cause the computing processing device to perform the steps in the method of feature extraction of a boiler soot blower described above.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, implements the method for extracting the characteristics of the boiler soot blower according to any one of the above embodiments.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A feature extraction method for a boiler soot blower, comprising:
acquiring an initial historical characteristic data set of a boiler soot blower;
performing feature dimension reduction on the initial historical feature data set by adopting a Pearson correlation coefficient to generate a target historical feature data set;
carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the iteration times in real time;
calculating the prediction precision corresponding to the intermediate feature subset by adopting a random forest algorithm;
and comparing the prediction precision and the iteration times with corresponding preset thresholds respectively, and determining a feature set corresponding to the boiler soot blower by combining the feature deletion set.
2. The method of feature extraction of a boiler soot blower of claim 1, wherein said initial historical feature data set comprises a plurality of initial historical feature data; the step of performing feature dimension reduction on the initial historical feature data set by adopting the Pearson correlation coefficient to generate a target historical feature data set comprises the following steps:
respectively calculating the correlation coefficients between the initial historical feature data by adopting the Pearson correlation coefficient;
judging whether the absolute value of each correlation coefficient falls within a preset correlation coefficient interval;
if yes, selecting one of the plurality of initial historical feature data corresponding to the correlation coefficient as target historical feature data according to a preset feature extraction standard;
if not, taking all initial historical feature data corresponding to the correlation coefficient as target historical feature data;
and constructing a target historical feature data set by adopting all the target historical feature data.
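The dimension reduction step of claim 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the sample columns, the 0.9 cutoff standing in for the "preset correlation coefficient interval", and the "keep the first feature seen" selection rule standing in for the "preset feature extraction standard" are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return cov / (sx * sy)

def reduce_features(columns, threshold=0.9):
    """Keep one feature out of every highly correlated group.

    columns: dict mapping feature name -> list of values.
    threshold: hypothetical lower bound of the preset interval;
    |r| >= threshold marks a pair of features as redundant.
    """
    kept = []
    for name in columns:  # dict order serves as the selection standard here
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical soot blower signals, for illustration only.
data = {
    "steam_flow": [1.0, 2.0, 3.0, 4.0, 5.0],
    "steam_flow_copy": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated
    "flue_temp": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = reduce_features(data)
```

Here `steam_flow_copy` is dropped because it is perfectly correlated with `steam_flow`, while `flue_temp` (correlation of about -0.3 with `steam_flow`) is kept.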
3. The method for extracting features of a boiler soot blower according to claim 1, wherein said step of performing feature extraction on said target historical feature data set using a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the number of iterations in real time comprises:
performing population initialization on the target historical feature data set to generate an initial feature subset;
calculating the fitness corresponding to each feature in the initial feature subset by adopting a preset fitness formula;
the preset fitness formula is as follows:
wherein fit is the fitness, err is the error rate, dimension is the feature number corresponding to the initial feature subset, and D is the total feature number corresponding to the target historical feature data set;
constructing a feature deletion set from all the features whose fitness does not meet the corresponding preset fitness threshold;
and removing the features in the feature deletion set from the initial feature subset to generate an intermediate feature subset, and counting the number of iterations in real time.
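The fitness formula referenced in claim 3 is not reproduced in this text. Purely as an illustration, the sketch below assumes a weighted sum that is common in wrapper-style feature selection, fit = alpha * err + (1 - alpha) * dimension / D. The weight `alpha`, the function names, the sample error rates, the threshold value, and the rule that a fitness above the threshold sends a feature to the deletion set are all assumptions, not taken from the patent.

```python
def fitness(err, dimension, total, alpha=0.9):
    """Assumed fitness: weighted sum of error rate and relative subset size.

    err: error rate in [0, 1]; dimension: number of features in the subset;
    total: D, the number of features in the target data set;
    alpha: hypothetical trade-off weight (not given in the source).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * err + (1.0 - alpha) * dimension / total

def split_by_fitness(scores, threshold):
    """Partition features into (deletion set, intermediate subset).

    Lower fitness is treated as better, so a score above the threshold
    "does not meet" it (an assumed reading of the claim).
    """
    deleted = {f for f, s in scores.items() if s > threshold}
    intermediate = [f for f in scores if f not in deleted]
    return deleted, intermediate

# Hypothetical per-feature error rates for a 4-feature subset out of D = 10.
scores = {
    "a": fitness(0.10, 4, 10),
    "b": fitness(0.40, 4, 10),
    "c": fitness(0.15, 4, 10),
}
deleted, intermediate = split_by_fitness(scores, threshold=0.3)
```

With these sample values only feature `b` exceeds the threshold, so it forms the deletion set and `a` and `c` remain in the intermediate subset.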
4. The method for extracting features of a boiler soot blower according to claim 1, wherein said step of calculating the prediction accuracy corresponding to said intermediate feature subset using a random forest algorithm comprises:
encoding the intermediate feature subset to generate a feature vector subset;
respectively selecting a plurality of feature vectors from the feature vector subset to construct a plurality of training sets and test sets;
performing model construction by adopting all the training sets and the test sets to generate a random forest model;
and inputting the feature vector subset into the random forest model for data evaluation, and generating the prediction precision corresponding to the intermediate feature subset.
5. The method of feature extraction for a boiler soot blower of claim 4, wherein said step of modeling using all of said training set and said test set to generate a random forest model comprises:
respectively carrying out decision tree training on each training set by adopting a recursive splitting method to generate a decision tree corresponding to the training set;
testing each decision tree on its corresponding test set to generate prediction data corresponding to the training set;
judging whether all the prediction data meet a preset accuracy threshold;
if yes, constructing a random forest model by adopting all the decision trees;
if not, jumping back to the step of respectively selecting a plurality of feature vectors from the feature vector subset to construct a plurality of training sets and test sets, until all the prediction data meet the preset accuracy threshold.
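The flow of claims 4 and 5 (draw training and test sets, train one tree per training set, test it, resample if it misses the accuracy threshold, then score the whole forest) can be sketched in plain Python. This is a simplified stand-in, not the patented model: each "tree" is a one-level decision stump rather than a recursively split tree, the test set is taken to be the rows left out of the bootstrap sample, and `min_acc`, `max_tries`, the function names, and the toy data are assumptions.

```python
import random
from collections import Counter

def train_stump(rows, labels, feats):
    """One-level decision tree: best (feature, threshold) by training error."""
    best = None
    for f in feats:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def stump_predict(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def build_forest(rows, labels, n_trees=15, min_acc=0.8, max_tries=50, seed=0):
    """Train one stump per bootstrap sample, resampling when the stump
    misses the (hypothetical) accuracy threshold on its held-out rows."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    k = max(1, int(d ** 0.5))  # number of features considered per tree
    forest = []
    for _ in range(n_trees):
        for _ in range(max_tries):
            train = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
            in_train = set(train)
            test = [i for i in range(n) if i not in in_train]
            stump = train_stump([rows[i] for i in train],
                                [labels[i] for i in train],
                                rng.sample(range(d), k))
            hits = sum(stump_predict(stump, rows[i]) == labels[i] for i in test)
            if not test or hits / len(test) >= min_acc:
                break  # tree accepted
        forest.append(stump)  # after max_tries the last stump is kept anyway
    return forest

def forest_accuracy(forest, rows, labels):
    """Majority-vote accuracy of the forest on the given rows."""
    correct = 0
    for row, y in zip(rows, labels):
        votes = Counter(stump_predict(s, row) for s in forest)
        correct += votes.most_common(1)[0][0] == y
    return correct / len(rows)

# Toy, cleanly separable data (hypothetical), two features per row.
rows = [[0.0, 10.0]] * 10 + [[1.0, 20.0]] * 10
labels = [0] * 10 + [1] * 10
forest = build_forest(rows, labels)
prediction_precision = forest_accuracy(forest, rows, labels)
```

A production implementation would more likely rely on an established library's random forest; the stump-based version is used here only to keep the example self-contained.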
6. The method for extracting features of a boiler soot blower according to claim 1, wherein said preset threshold values include preset scoring criteria, iteration number threshold values and fitness threshold values; the step of comparing the prediction precision and the iteration times with corresponding preset thresholds respectively and determining the feature set corresponding to the boiler soot blower by combining the feature deletion set comprises the following steps:
judging whether the prediction precision meets the preset scoring standard or not;
if yes, the intermediate feature subset is used as a target feature subset;
if not, constructing a target feature subset by adopting the intermediate feature subset and the feature deletion set;
judging whether the iteration times meet the iteration times threshold;
if yes, the target feature subset is used as a feature set corresponding to the boiler soot blower;
if not, determining a feature set corresponding to the boiler soot blower according to the subset fitness corresponding to the target feature subset and the fitness threshold.
7. The method of feature extraction for a boiler soot blower of claim 6, wherein said step of determining a feature set for said boiler soot blower based on a subset fitness for said target feature subset and said fitness threshold comprises:
judging whether the subset fitness corresponding to the target feature subset meets the fitness threshold value or not;
if yes, the target feature subset is used as a feature set corresponding to the boiler soot blower;
if not, updating the velocity and the position of each target feature in the target feature subset to generate a particle feature set;
and taking the particle feature set as the target historical feature data set, and jumping back to the step of performing feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the number of iterations in real time.
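The decision flow of claims 6 and 7 can be sketched as below. The three threshold values, the direction of each comparison (higher precision is better, lower fitness is better), and the function names are assumptions. The velocity and position update uses the standard binary particle swarm rule with a sigmoid transfer function, which is one common choice for feature selection and is not confirmed by the source; `w`, `c1`, `c2`, and `v_max` are hypothetical parameters.

```python
import math
import random

def decide(intermediate, deleted, precision, iterations, subset_fitness,
           score_min=0.9, max_iter=100, fitness_max=0.2):
    """Return (target_subset, done) following the claimed decision flow.

    All three thresholds are hypothetical placeholder values.
    """
    if precision >= score_min:
        target = list(intermediate)
    else:
        # Precision too low: restore the features that were deleted.
        target = list(intermediate) + sorted(deleted)
    if iterations >= max_iter:
        return target, True   # iteration budget reached
    if subset_fitness <= fitness_max:
        return target, True   # subset fitness is good enough
    return target, False      # caller updates the particles and iterates

def update_particle(position, velocity, personal_best, global_best, rng,
                    w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary-PSO step: bit i of position marks feature i as selected."""
    new_x, new_v = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))        # clamp the velocity
        prob = 1.0 / (1.0 + math.exp(-v))     # sigmoid transfer function
        new_v.append(v)
        new_x.append(1 if rng.random() < prob else 0)
    return new_x, new_v

rng = random.Random(0)
target, done = decide(["a", "c"], {"b"}, precision=0.5,
                      iterations=3, subset_fitness=0.5)
position, velocity = update_particle([1, 1, 1], [0.0, 0.0, 0.0],
                                     [1, 0, 1], [1, 1, 0], rng)
```

In this example the precision falls below the assumed scoring standard, so the deleted feature `b` is restored, and because neither stopping condition holds, `done` is false and the particle is updated for another iteration.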
8. A feature extraction system for a boiler soot blower, comprising:
the initial historical feature data set acquisition module is used for acquiring an initial historical feature data set of the boiler soot blower;
the target historical feature data set generation module is used for carrying out feature dimension reduction on the initial historical feature data set by adopting a Pearson correlation coefficient to generate a target historical feature data set;
the generation module for the feature deletion set, the intermediate feature subset and the iteration count is used for carrying out feature extraction on the target historical feature data set by adopting a particle swarm algorithm, generating a feature deletion set and an intermediate feature subset, and counting the number of iterations in real time;
the prediction precision calculation module is used for calculating the prediction precision corresponding to the intermediate feature subset by adopting a random forest algorithm;
and the feature set determining module is used for comparing the prediction precision and the iteration times with corresponding preset thresholds respectively and determining a feature set corresponding to the boiler soot blower by combining the feature deletion set.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program that, when executed by the processor, causes the processor to perform the steps of the method of feature extraction of a boiler soot blower as claimed in any one of claims 1-7.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed, implements the method of feature extraction of a boiler soot blower according to any one of claims 1-7.
CN202310418102.4A 2023-04-18 2023-04-18 Feature extraction method, system, equipment and medium of boiler soot blower Pending CN116561554A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310418102.4A CN116561554A (en) 2023-04-18 2023-04-18 Feature extraction method, system, equipment and medium of boiler soot blower

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310418102.4A CN116561554A (en) 2023-04-18 2023-04-18 Feature extraction method, system, equipment and medium of boiler soot blower

Publications (1)

Publication Number Publication Date
CN116561554A true CN116561554A (en) 2023-08-08

Family

ID=87490770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310418102.4A Pending CN116561554A (en) 2023-04-18 2023-04-18 Feature extraction method, system, equipment and medium of boiler soot blower

Country Status (1)

Country Link
CN (1) CN116561554A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination