CN114464266A - Pulverized coal boiler NOx emission prediction method and device based on improved SSA-GPR - Google Patents

Info

Publication number: CN114464266A
Authority: CN (China)
Prior art keywords: data, value, formula, finder, boiler
Legal status: Granted, active (as listed; an assumption, not a legal conclusion)
Application number: CN202210102566.XA
Other languages: Chinese (zh)
Other versions: CN114464266B (en)
Inventors: 周欣欣, 赵政, 李茂源, 薛青常, 张丹楠, 郭树强, 霍光
Current Assignee: Northeast Electric Power University (as listed; may be inaccurate)
Original Assignee: Northeast Dianli University
Application filed by: Northeast Dianli University
Priority: CN202210102566.XA (granted and published as CN114464266B)

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00: Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N3/006: Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00: Computing arrangements based on specific mathematical models
    • G06N7/08: Computing arrangements based on specific mathematical models using chaos models or non-linear system models
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70: Machine learning, data mining or chemometrics
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E: REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00: Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70: Smart grids as climate change mitigation technology in the energy generation sector
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention discloses a method and a device for predicting the NOx (nitrogen oxide) emissions of a pulverized coal boiler based on improved SSA-GPR (sparrow search algorithm combined with Gaussian process regression). The prediction method comprises the following steps: (1) collecting historical characteristic parameter data of the pulverized coal boiler; (2) normalizing the data; (3) reducing the dimensionality of the data; (4) dividing the data set into a training set and a verification set; (5) optimizing the hyper-parameters of the Gaussian process with an improved SSA algorithm to obtain the optimized Gaussian process; (6) inputting the training set into the model for training; and (7) inputting the verification set into the model for verification, to obtain the final pulverized coal boiler NOx emission prediction model based on the improved SSA-GPR. The method and device improve the accuracy of NOx emission prediction for pulverized coal boilers, provide technical support for reducing NOx emissions during subsequent operation of power station boilers, and are of practical engineering significance for energy saving and emission reduction in power stations and for meeting national environmental protection policies.

Description

Pulverized coal boiler NOx emission prediction method and device based on improved SSA-GPR
Technical Field
The invention relates to the field of boiler control, in particular to a method for predicting the NOx (nitrogen oxide) emissions of a pulverized coal boiler.
Background
With rapid economic development, the coal consumption of thermal power plants has increased year by year, and NOx emissions have risen with it. NOx readily forms acid rain and photochemical smog, which pollute the environment and cause economic losses. The boiler is one of the three core devices of a power plant and the main source of harmful gases. Predicting the boiler's nitrogen oxide emissions, and using the prediction for subsequent optimization control, can therefore greatly reduce the pollution produced during combustion.
Traditional methods for predicting nitrogen oxide emissions select operating characteristics by human experience and build models from knowledge of the combustion mechanism and engineering experience. However, a boiler system involves several disciplines, such as combustion, heat transfer and fluid dynamics; the combustion process is complex, and the parameters are numerous and mutually coupled. A mechanism model is therefore very difficult to establish and rarely yields a high-precision prediction, which in turn impairs subsequent optimization control.
In recent years, the rapid development of artificial intelligence has brought prediction technology into a new stage. Artificial-intelligence-based prediction of nitrogen oxide emissions does not require an accurate physical model relating boiler characteristics to emissions, and so copes well with the complexity of the boiler combustion process. Further research on artificial-intelligence-based prediction of boiler nitrogen oxide emissions, aimed at improving prediction accuracy, is therefore of practical engineering significance for subsequent optimization control and for reducing environmental pollution.
Disclosure of Invention
The invention provides a boiler NOx (nitrogen oxide) emission prediction method and device based on improved SSA-GPR. Features are selected by RReliefF and Pearson correlation analysis, and an improved sparrow search algorithm (SSA) is combined with Gaussian process regression (GPR), which improves the accuracy of nitrogen oxide emission prediction and overcomes shortcomings of the Gaussian process in regression prediction.
To achieve this purpose, the invention provides the following technical scheme:
A boiler nitrogen oxide emission prediction method based on improved SSA-GPR comprises the following specific steps:
Step 1000: acquire historical characteristic parameter data and NOx emission data of a coal-fired boiler from the power plant DCS to form a first data set. The first data set is a two-dimensional matrix X of n rows and m columns: each row is one of the n collected samples, and the columns hold the m-1 features of each sample together with its NOx emission. The n × m values form the matrix X:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}$$
where $x_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$) is the value of the j-th feature of the i-th sample;
Step 2000: normalize the data in the first data set of step 1000 to form a second data set D. Min-Max normalization is used, with the formula:
$$y = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}} \tag{1}$$
In formula (1), MaxValue is the maximum value of the sample data; MinValue is the minimum value of the sample data; x is the raw sample data; y is the normalized data;
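As an illustration of formula (1), the following minimal Python sketch scales each column of a small matrix to [0, 1]. Applying the formula column by column, and mapping a constant column to 0.0, are assumptions made for the example; the function names are hypothetical and do not come from the patent.

```python
def min_max_normalize(column):
    """Scale a list of numbers to [0, 1] using formula (1)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: formula (1) is undefined for a constant column,
        # so every entry is mapped to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def normalize_matrix(X):
    """Normalize each column of an n-by-m matrix given as a list of rows."""
    columns = [min_max_normalize(list(col)) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]
```

Normalizing per column keeps features with large physical ranges (for example air flow) from dominating features with small ranges when distances are computed in later steps.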
Step 3000: reduce the dimensionality of the second data set of step 2000 using the RReliefF algorithm and Pearson correlation analysis to form a third data set, comprising steps 3100-3700:
Step 3100: for the second data set of step 2000, use the RReliefF algorithm to calculate the relative distance between each feature parameter and NOx, and weight each feature by that distance, comprising steps 3110-3130:
Step 3110: according to formula (2), calculate the probability that the value of feature A differs between nearest samples:
$$P_{difA} = P(\text{dif value}(A) \mid \text{nearest samples}) \tag{2}$$
In formula (2), $P_{difA}$ is the probability that the value of feature A differs between nearest samples; the dif function calculates the distance between two samples and is used to find the nearest neighbours; value(A) denotes the value of feature A; and nearest samples are the samples closest to each other in relative distance in the sample space;
Step 3120: according to formula (3), calculate the probability that the NOx emission differs between nearest samples:
$$P_{difC} = P(\text{dif NOx} \mid \text{nearest samples}) \tag{3}$$
In formula (3), $P_{difC}$ is the probability that the NOx emission differs between nearest samples;
Step 3130: from the conditional probability, obtain formula (4) to calculate the weight of each characteristic parameter of the boiler:
$$W[A] = \frac{P_{difC|difA}\,P_{difA}}{P_{difC}} - \frac{\left(1 - P_{difC|difA}\right)P_{difA}}{1 - P_{difC}} \tag{4}$$
In formula (4), $W[A]$ is the weight of each characteristic parameter of the boiler, and $P_{difC|difA}$ is the probability that the NOx emission differs between nearest samples whose feature values differ;
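The weighting of steps 3110-3130 can be sketched as follows. This is a simplified illustration of the published RReliefF estimator, not necessarily the exact procedure of the patent: it assumes data already scaled to [0, 1], uses every sample as a query point, uses Manhattan distance, and gives the k nearest neighbours equal influence. The function name and the parameter k are hypothetical.

```python
def rrelieff_weights(X, y, k=3):
    """Estimate a weight W[A] per feature in the spirit of formula (4).

    X is a list of rows scaled to [0, 1]; y is the target scaled to [0, 1].
    """
    n, m = len(X), len(X[0])
    n_dc = 0.0            # accumulates "different NOx" (P_difC up to a factor n)
    n_da = [0.0] * m      # accumulates "different feature value" (P_difA)
    n_dcda = [0.0] * m    # accumulates "different NOx and different feature"
    for i in range(n):
        # Rank the other samples by Manhattan distance in feature space.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sum(abs(X[i][a] - X[j][a]) for a in range(m)),
        )
        for j in others[:k]:
            d_c = abs(y[i] - y[j]) / k      # equal influence 1/k per neighbour
            n_dc += d_c
            for a in range(m):
                d_a = abs(X[i][a] - X[j][a])
                n_da[a] += d_a / k
                n_dcda[a] += d_c * d_a
    if n_dc <= 0.0 or n_dc >= n:
        raise ValueError("target differences between neighbours are degenerate")
    # Bayes form of formula (4), written with the accumulated counts.
    return [
        n_dcda[a] / n_dc - (n_da[a] - n_dcda[a]) / (n - n_dc)
        for a in range(m)
    ]
```

A feature whose differences track the differences in NOx emission receives a positive weight, while a feature that never differs between neighbours receives a weight of zero.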
Step 3200: calculate the Pearson correlation coefficient between features according to formula (5):
$$r_{ij} = \frac{\sum_{k=1}^{n}\left(x_{ki} - \bar{x}_i\right)\left(x_{kj} - \bar{x}_j\right)}{\sqrt{\sum_{k=1}^{n}\left(x_{ki} - \bar{x}_i\right)^2}\,\sqrt{\sum_{k=1}^{n}\left(x_{kj} - \bar{x}_j\right)^2}} \tag{5}$$
In formula (5), i is the i-th column feature, j is the j-th column feature, $\bar{x}_i$ is the sample mean of feature i, $\bar{x}_j$ is the sample mean of feature j, and n is the number of samples;
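A direct transcription of the Pearson coefficient into Python is shown below as a minimal sketch. The function name is hypothetical, and raising an error for a constant column is a choice made for the example, since the coefficient is undefined in that case.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length feature columns."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    spread_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    spread_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if spread_u == 0.0 or spread_v == 0.0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (spread_u * spread_v)
```

Two columns that are exact multiples of each other give a coefficient of 1 or -1, which is the kind of redundancy the later thresholding step is meant to remove.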
Step 3300: following the idea of Bootstrap random sampling, draw K sample subsets from the second data set D of step 2000;
Step 3400: apply the RReliefF algorithm to each of the K sample subsets, rank the features by weight, and delete the features whose weight is below a first threshold, giving K different reduced subsets;
Step 3500: for each reduced subset, use Pearson correlation analysis to calculate the correlation coefficient between every pair of features, and take its absolute value;
Step 3600: compare each absolute value with a preset second threshold; if it exceeds the second threshold, delete the later-ranked feature of the pair in the feature ranking of step 3500, giving K training subsets. This step removes redundant data;
Step 3700: aggregate the K results and output the ranking that occurs most often, which yields the features with the greatest influence on NOx emission and thus the third data set;
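Steps 3300-3700 can be sketched as a bootstrap-and-vote loop. The sketch below is illustrative only: the feature-weighting function is passed in as `weight_fn` (any RReliefF implementation could be supplied), and the names `select_features`, `k_subsets`, `w_min` and `r_max`, as well as the default threshold values, are assumptions that do not appear in the patent.

```python
import math
import random
from collections import Counter


def pearson(u, v):
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return cov / den if den > 0 else 0.0


def select_features(X, y, weight_fn, k_subsets=20, w_min=0.1, r_max=0.9, seed=0):
    """Return the feature ranking that wins the vote over K bootstrap subsets."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    votes = Counter()
    for _ in range(k_subsets):
        # Step 3300: bootstrap sample (drawn with replacement).
        idx = [rng.randrange(n) for _ in range(n)]
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        # Step 3400: rank by weight, drop features below the first threshold.
        w = weight_fn(Xs, ys)
        ranked = sorted((a for a in range(m) if w[a] >= w_min), key=lambda a: -w[a])
        # Steps 3500-3600: drop the later feature of any highly correlated pair.
        kept = []
        for a in ranked:
            col_a = [row[a] for row in Xs]
            if all(abs(pearson(col_a, [row[b] for row in Xs])) <= r_max for b in kept):
                kept.append(a)
        votes[tuple(kept)] += 1
    # Step 3700: output the ranking that occurs most often.
    return list(votes.most_common(1)[0][0])
```

Voting over several bootstrap subsets makes the selected feature list less sensitive to any single unlucky sample of the data.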
Step 4000: divide the third data set of step 3000 into a training set and a verification set;
Step 5000: optimize the hyper-parameters of the Gaussian process with the improved SSA algorithm to obtain the optimized Gaussian process, comprising steps 5100-5800:
Step 5100: take the training set and the verification set of step 4000 as the training set and the test set of the Gaussian process model;
Step 5200: determine the hyper-parameters $l$ and $\sigma_f^2$ to be optimized, where $l$ is the characteristic length scale and $\sigma_f^2$ is the signal variance, and set the population size N, the finder proportion PD and the maximum number of iterations iterm of the improved SSA;
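To show where a length scale and a signal variance enter a Gaussian process, the sketch below evaluates a squared-exponential covariance function. The choice of this kernel is an assumption: the text names the two hyper-parameters but does not state the kernel, and the function name is hypothetical.

```python
import math


def se_kernel(x1, x2, length_scale, signal_variance):
    """Squared-exponential covariance between two feature vectors.

    Assumed form: k(x1, x2) = signal_variance * exp(-|x1 - x2|^2 / (2 * l^2)).
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return signal_variance * math.exp(-sq_dist / (2.0 * length_scale ** 2))
```

Under this assumption, a larger length scale makes distant operating points more strongly correlated, while the signal variance sets the overall amplitude of the modelled NOx variation.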
Step 5300: initialize the population with the infinite-folding Sin chaotic map to increase population diversity. N solutions are generated according to formula (6), each solution being a two-dimensional vector $(l, \sigma_f^2)$:
$$X_n = \sin(\delta / x_n), \quad n = 0, 1, \ldots, N \tag{6}$$
In formula (6), $x_n$ is the n-th initial individual, $X_n$ is the individual obtained by chaotic mapping of the n-th initial individual, $\delta \in (0, 4]$, and $-1 \le x_n \le 1$ with $x_n \ne 0$;
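A minimal sketch of the chaotic initialization follows. The mapping sin(δ/x) is taken from formula (6); drawing the initial x uniformly at random, the default δ = 2.0, and the linear rescaling from [-1, 1] to search bounds for each hyper-parameter are assumptions made for illustration, and the names are hypothetical.

```python
import math
import random


def sin_chaos_population(n_pop, bounds, delta=2.0, seed=0):
    """Generate n_pop candidate solutions, one coordinate per (lo, hi) bound."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_pop):
        individual = []
        for lo, hi in bounds:
            x = 0.0
            while x == 0.0:                  # formula (6) requires x != 0
                x = rng.uniform(-1.0, 1.0)
            chaotic = math.sin(delta / x)    # formula (6), value in [-1, 1]
            # Assumption: rescale linearly from [-1, 1] to [lo, hi].
            individual.append(lo + (chaotic + 1.0) / 2.0 * (hi - lo))
        population.append(individual)
    return population
```

For example, `sin_chaos_population(30, [(0.01, 10.0), (0.01, 5.0)])` would give 30 candidate pairs of length scale and signal variance within the stated (hypothetical) bounds.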
Step 5400: select the individuals with the best fitness values as finders according to the finder proportion PD, and update each finder's position according to formula (7):
$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iterm}\right), & R_2 < ST \\ X_{i,j}^{t} + Q, & R_2 \ge ST \end{cases} \tag{7}$$
In formula (7), $X_{i,j}^{t}$ is the position of the i-th finder in the j-th dimension at generation t, t is the current iteration number, iterm is the maximum number of iterations, $\alpha \in (0, 1]$ is a random number, Q is a random number following a normal distribution, $R_2 \in (0, 1]$ is the early-warning value, and $ST \in [0.5, 1]$ is the safety value;
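The finder update can be sketched as below, assuming the rule of the standard sparrow search algorithm: contract the position when the early-warning value is below the safety value, otherwise take a normally distributed step. Whether the patent's formula (7) matches this form exactly is an assumption, and the function and parameter names are hypothetical.

```python
import math
import random


def update_finder(position, rank, max_iter, st=0.8, rng=random):
    """Return the new position of the finder with 1-based index `rank`."""
    r2 = 1.0 - rng.random()                  # early-warning value in (0, 1]
    if r2 < st:
        alpha = 1.0 - rng.random()           # random number in (0, 1]
        factor = math.exp(-rank / (alpha * max_iter))
        return [x * factor for x in position]
    q = rng.gauss(0.0, 1.0)                  # one normal draw shared by all dimensions
    return [x + q for x in position]
```

In the safe branch every coordinate shrinks towards zero, which narrows the search; in the alarm branch the whole position is shifted by the same random amount.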
Step 5500: according to formulas (8) and (9), improve the way a follower selects a finder by sensitivity-pheromone matching, comprising steps 5510-5530:
Step 5510: calculate the pheromone value of the i-th finder by formula (8). The pheromone is a value proportional to the finder's fitness value and is used to mark the finder:
$$P(i) = \frac{f(i) - f_{min}}{f_{max} - f_{min}} \tag{8}$$
In formula (8), $P(i)$ is the pheromone of the i-th finder, i denotes the i-th finder, $f(i)$ is the current fitness value of the i-th finder, $f_{min}$ is the smallest finder fitness value, and $f_{max}$ is the largest finder fitness value;
Step 5520: calculate the sensitivity of the j-th follower. Each follower has a sensitivity to pheromones that varies during optimization, and matching sensitivity to pheromone widens the range of finders a follower may select. The sensitivity is given by formula (9):
$$S(j) = S_{min} + \Delta S_j \tag{9}$$
In formula (9), $\Delta S_j = (S_{max} - S_{min}) \cdot Rand(0, 1)$, $S_{max} = P(i)_{max}$ and $S_{min} = P(i)_{min}$, where $P(i)_{max}$ is the largest current pheromone value, $P(i)_{min}$ is the smallest current pheromone value, $S_{max}$ is the largest current sensitivity, and $S_{min}$ is the smallest current sensitivity;
Step 5530: find the finder i that matches the sensitivity of the j-th follower: randomly find an i satisfying $P(i) = S(j)$;
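The sensitivity-pheromone matching of steps 5510-5530 can be sketched as follows. Two assumptions are made and should be read as such: the pheromone is taken to be the min-max scaled fitness of the finders, and, because a continuous random sensitivity will almost never equal a pheromone value exactly, the follower is matched to the finder whose pheromone is nearest to its sensitivity. All names are hypothetical.

```python
import random


def pheromones(fitness):
    """Pheromone of each finder, assumed to be min-max scaled fitness."""
    f_min, f_max = min(fitness), max(fitness)
    if f_max == f_min:
        return [1.0] * len(fitness)          # all finders equally marked
    return [(f - f_min) / (f_max - f_min) for f in fitness]


def match_finder(pher, rng=random):
    """Draw a follower sensitivity by formula (9) and pick a matching finder."""
    s_min, s_max = min(pher), max(pher)
    sensitivity = s_min + (s_max - s_min) * rng.random()    # formula (9)
    # Assumption: "matching" means the nearest pheromone value.
    best_gap = min(abs(p - sensitivity) for p in pher)
    candidates = [i for i, p in enumerate(pher) if abs(p - sensitivity) == best_gap]
    return rng.choice(candidates), sensitivity
```

Because the sensitivity is drawn afresh for each follower, different followers are steered towards different finders rather than all crowding around the single best one.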
Step 5600: update each follower's position according to formula (10), subject to the finder selection established for the followers in step 5500:
[Formula (10): follower position update (equation image not reproduced)]
In formula (10), $X_{i,j}^{t}$ is the position of the i-th follower in the j-th dimension at generation t, t is the current iteration number, $X_{P}^{t+1}$ is the best position occupied by the finders at generation t+1, $X_{worst}^{t}$ is the current global worst position, d is the dimension, Q is a random number following the standard normal distribution, and $a \in (-1, 1)$;
Step 5700: update the positions of the boundary individuals according to formula (11):
$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,j}^{t} + K \cdot \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon}, & f_i = f_g \end{cases} \tag{11}$$
In formula (11), $X_{i,j}^{t}$ is the position of the i-th boundary individual in the j-th dimension at generation t, t is the current iteration number, $X_{best}^{t}$ is the current global best position, and $X_{worst}^{t}$ is the current global worst position; $\beta$ is a random number following a normal distribution; $K \in [-1, 1]$ is a random number; $f_i$ is the fitness value of the current sparrow individual; $f_g$ and $f_w$ are the current global best and worst fitness values, respectively; $\varepsilon$ is a very small constant that prevents the denominator from being zero; K also indicates the direction in which the sparrow moves and serves as a step-length control parameter;
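The boundary-individual update can be sketched as below, assuming the rule of the standard sparrow search algorithm for a minimization problem: an individual worse than the global best jumps to a point around the best position, while the best individual itself takes a step scaled by its distance from the worst position. Whether the patent's formula (11) matches this form exactly is an assumption, and the names are hypothetical.

```python
import random


def update_boundary(position, best, worst, f_i, f_best, f_worst,
                    eps=1e-50, rng=random):
    """Return the new position of a boundary (danger-aware) individual."""
    if f_i > f_best:
        beta = rng.gauss(0.0, 1.0)           # normally distributed step factor
        return [b + beta * abs(x - b) for x, b in zip(position, best)]
    k = rng.uniform(-1.0, 1.0)               # direction and step-length control
    return [
        x + k * abs(x - w) / ((f_i - f_worst) + eps)
        for x, w in zip(position, worst)
    ]
```

The small constant `eps` only matters when the individual's fitness equals the worst fitness, where it keeps the division defined.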
Step 5800: judge whether a sufficiently good fitness value or the maximum number of iterations has been reached. If so, terminate and output the best solution $(l, \sigma_f^2)$, which gives the optimized Gaussian process hyper-parameters; otherwise, increase the iteration count by 1 and return to step 5400 to continue the search;
Step 6000: input the training set of step 4000 into the optimized Gaussian process model of step 5000 for model training;
Step 7000: input the verification set of step 4000 into the prediction model trained in step 6000 for model verification, taking as the verification standard that the mean absolute error of formula (12) does not exceed $e^{-10}$:
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right| \tag{12}$$
In formula (12), MAE is the mean absolute error, n is the number of samples in the verification set, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value;
When the minimum error or the maximum number of training iterations of the Gaussian process is reached, the final pulverized coal boiler NOx emission prediction model based on the improved SSA-GPR is obtained.
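The verification criterion of step 7000 can be sketched in a few lines. The function names are hypothetical, and reading the threshold "e-10" as the exponential $e^{-10}$ (about 4.5e-5) is an interpretation of the source text rather than something it states unambiguously.

```python
import math


def mean_absolute_error(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def passes_verification(actual, predicted, threshold=math.exp(-10)):
    """True if the MAE does not exceed the (assumed) threshold e^-10."""
    return mean_absolute_error(actual, predicted) <= threshold
```

The same MAE function could also serve as the fitness value minimized by the SSA when it searches for hyper-parameters, although the text does not name the fitness function explicitly.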
A device based on the improved SSA-GPR method for predicting the NOx emissions of a pulverized coal boiler, characterized by comprising:
a data acquisition module: acquires historical characteristic parameter data and NOx emission data of the coal-fired boiler from the power plant DCS through step 1000 to obtain the first data set;
a data processing module: preprocesses the first data set through steps 2000 and 3000, obtaining the second data set by data normalization and the third data set by dimensionality reduction;
a training module: establishes the NOx emission prediction model based on the improved SSA-GPR, trains the improved SSA-GPR model on the training set obtained from the third data set in step 4000, and verifies the accuracy of the prediction model on the verification set of step 4000;
a prediction module: preprocesses real-time measurement data from the operating boiler to obtain a data sample, inputs the data sample into the trained NOx emission prediction model based on the improved SSA-GPR, and obtains the NOx emission prediction result.
Compared with the prior art, the invention has the following beneficial effects:
The invention optimizes the hyper-parameters of the GPR with a sparrow search algorithm based on an infinite-folding Sin chaos strategy and a pheromone-sensitivity strategy. This addresses the slow convergence and low prediction accuracy caused by weak global search for the optimal GPR hyper-parameters, provides a new artificial-intelligence-based method for predicting the NOx emissions of boiler combustion, and helps reduce environmental pollution and economic loss.
Drawings
FIG. 1 is a flow chart of the method for predicting the NOx (nitrogen oxide) emissions of a pulverized coal boiler based on improved SSA-GPR.
Detailed Description
In order that the above aspects of the present invention may be more clearly understood, the present invention will now be described in further detail with reference to the accompanying drawings. It should be noted that the specific implementation described herein is only for explaining the present application and is not used to limit the present application.
FIG. 1 is a flow chart of a method for predicting NOx (nitrogen oxide) emission of an improved SSA-GPR pulverized coal boiler, which comprises the following specific steps:
step 1000: acquiring characteristic parameter historical data and NOx emission of a coal-fired boiler from a power plant DCS to form a first data set, wherein the first data set is a two-dimensional matrix X formed by n rows and m columns of data, n sample data collected by rows in the matrix are listed as m-1 characteristics related to each sample and the NOx emission, and the n multiplied by m data form the matrix X:
Figure BDA0003492931110000071
wherein x isij(i 1,2, …, n, j 1,2, …, m) is the value of the j-th feature of the ith sample;
step 2000: and normalizing the data in the first data set in the step 1000 by a normalization method to form a second data set D, wherein the data normalization method adopts a Min-Max normalization method, and the normalization formula is as follows:
Figure BDA0003492931110000072
in formula (1), MaxValue represents the maximum value of sample data; MinValue represents the minimum value of the sample data; x represents sample raw data; y represents the data after normalization;
step 3000: performing dimensionality reduction on the data in the second data set in the step 2000 by using an RReliefF algorithm and Pearson correlation analysis to form a third data set, and further comprising the steps 3100-3700:
step 3100: calculating relative distances between the respective feature parameters and NOx by an RReliefF algorithm with respect to the second data set described in step 2000, and weighting each feature by the relative distances, further comprising steps 3110-3130:
step 3110: according to the formula (2), calculating the probability that the characteristic values A in the similar samples are different:
PdifAsample of P (difvalue (a) i (2))
In the formula (2), PdifAIndicating the probability of the eigenvalues a being different in close samples, the dif function is used to calculate the distance between two samples and find the nearest neighbor, value (a) indicatesFeature a, a sample that is close indicates that the two samples are closest in relative distance in sample space;
step 3120: according to the formula (3), calculating the probability that the NOx emission amount in the similar samples is different:
PdifCp (approximate sample of difNOx) (3)
In formula (3), PdifCRepresenting the probability of different NOx emission in similar samples;
step 3130: and (3) obtaining formula (4) according to the conditional probability to calculate the weight of each characteristic parameter of the boiler:
Figure BDA0003492931110000081
in formula (4), W [ A ]]Representing the weight, P, of each characteristic parameter of the boilerdifC|difARepresenting the probability of different NOx emission amounts in similar samples with different characteristic values;
step 3200: the Pearson correlation coefficient between the features is calculated according to equation (5):
Figure BDA0003492931110000082
in formula (5), i is the ith column characteristic, j is the jth column characteristic,
Figure BDA0003492931110000083
is the mean of the samples of the feature i,
Figure BDA0003492931110000084
is the sample mean value of the characteristic j, and n is the number of samples;
step 3300: based on the Bootstrap random sampling idea, K sample subsets are extracted from the second data set D in step 2000
Figure BDA0003492931110000091
Step 3400: using RReliefF algorithm pair
Figure BDA0003492931110000092
The features of (1) are sorted according to weight, and features smaller than a first threshold are deleted to obtain K different subsets
Figure BDA0003492931110000093
Step 3500: to pair
Figure BDA0003492931110000094
Using Pearson correlation analysis to calculate Pearson correlation coefficients between every two characteristics, and taking an absolute value;
step 3600: according to a second threshold value which is set in advance, if the second threshold value is larger than the second threshold value, deleting the next characteristic in the characteristic sequence of the step 3500 to obtain K training subsets
Figure BDA0003492931110000095
By this step, redundant data is removed;
step 3700: summarizing the obtained results, outputting the sequencing result with the most occurrence times, obtaining a plurality of characteristics which have the greatest influence on the NOx emission, and obtaining a third data set;
step 4000: dividing the third data set of step 3000 into a training set and a validation set;
step 5000: the improved SSA algorithm is adopted to optimize the hyperparameter of the Gaussian process to obtain the optimized Gaussian process, and the method further comprises the following steps of 5100-5800:
step 5100: selecting the training set and the verification set of the step 4000 as a training set and a test set of the Gaussian process model;
step 5200: determining the hyper-parameter l,
Figure BDA0003492931110000096
Wherein the hyper-parameter l represents the characteristic length scale, the hyper-parameter
Figure BDA0003492931110000097
Representing signalsVariance, setting the population quantity N of the improved SSA, the discoverer quantity proportion PD and the maximum iteration number iterm;
step 5300: the population is initialized by infinite folding Sin chaos to increase population diversity, N solutions are generated according to formula (6), and each solution vector corresponds to a two-dimensional vector
Figure BDA0003492931110000098
Xn=sin(δ/xn),n=0,1...,N (6)
In the formula (6), xnDenotes the nth initial individual, XnRepresents the individuals after the nth initial individual chaos mapping, and delta epsilon (0, 4)];-1≤xnX is less than or equal to 1n≠0;
Step 5400: selecting the finder with the excellent fitness value according to the ratio PD of the number of the finders, and updating the individual position of the finder according to a formula (7):
Figure BDA0003492931110000101
in the formula (7), the first and second groups,
Figure BDA0003492931110000102
representing the position information of the ith finder in the jth dimension of the tth generation, t representing the current iteration number, iterm representing the maximum iteration number, and alpha being (0, 1)]Is a random number, Q is a random number following a normal distribution, R2∈(0,1]Represents an early warning value, ST ∈ [0.5,1 [ ]]Represents a security value;
step 5500: according to the formula (8) (9), the sensitivity-pheromone matching mode is adopted to improve the mode that the follower selects the finder, and the mode that the follower selects the finder further comprises the steps 5510-5530:
step 5510: calculating the pheromone value of the ith finder by the pheromone calculation method shown in formula (8), wherein the pheromone is a value which has a proportional relation with the adaptability value of the finder and is used for marking the finder;
Figure BDA0003492931110000103
in formula (8), p (i) is the pheromone of the ith finder, i represents the ith finder, f (i) is the current fitness value of the ith finder, fminIs the finder fitness value with the smallest value, fmaxIs the finder fitness value with the largest value;
step 5520: calculating the sensitivity of the jth follower, wherein each follower has sensitivity to pheromones, the sensitivity is different in the optimization process, the selection range is expanded by adopting a sensitivity-pheromone matching mode, and the sensitivity calculation is shown as a formula (9):
S(j)=Smin+△Sj (9)
in formula (9), Δ Sj=(Smax-Smin)·Rand(0,1),Smax=P(i)max,Smin=P(i)min,P(i)maxIs the pheromone with the largest current value, P (i)minIs the pheromone with the smallest current value, SmaxIs the sensitivity, S, at which the current value is the maximumminIs the sensitivity at which the current value is the minimum;
step 5530: finding the finder i matched with the sensitivity of the jth follower: randomly finding out j, and satisfying P (i) ═ S (j);
step 5600: the follower individual positions are updated according to the constraints set forth for the followers in step 5500 using equation (10):
Figure BDA0003492931110000111
in the formula (10), the first and second groups,
Figure BDA0003492931110000112
indicating the position information of the ith follower in the jth dimension of the tth generation, t representing the current iteration number,
Figure BDA0003492931110000113
is the optimal position occupied by the discoverer of the t +1 generation,
Figure BDA0003492931110000114
then represents the current global worst finder position, d is the dimension, Q is a random number satisfying the standard normal distribution, a ∈ (-1, 1);
step 5700: updating the boundary individual positions according to equation (11):
Figure BDA0003492931110000115
in the formula (11), the reaction mixture,
Figure BDA0003492931110000116
representing the position information of the ith boundary individual of the tth generation in the jth dimension, wherein t represents the current iteration number,
Figure BDA0003492931110000117
is the current global optimum position; β is a random number that follows a normal distribution; k ∈ [ -1,1]Is a random number, fiIs the fitness value of the current sparrow individual; f. ofgAnd fwThe current global best and worst fitness values, respectively; ε is the minimum constant to avoid a denominator of zero; k represents the moving direction of the sparrows and is also a step length control parameter;
step 5800: judging whether a sufficiently good fitness value or the maximum number of iterations has been reached; if so, terminating the program and outputting the optimal solution [Figure BDA0003492931110000118], thereby obtaining the optimized Gaussian process hyper-parameters; otherwise, increasing the iteration count by 1 and jumping to step 5400 to continue the search;
step 6000: inputting the training set of the step 4000 into the optimized Gaussian process model of the step 5000 for model training;
step 7000: inputting the verification set of step 4000 into the prediction model trained in step 6000 for model verification, taking as the validation standard that the mean absolute error calculated according to formula (12) does not exceed $e^{-10}$:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{12}$$
in formula (12), MAE is the mean absolute error, n is the number of samples in the verification set, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value;
and when the minimum error or the maximum number of training iterations of the Gaussian process is reached, the final pulverized coal boiler NOx emission prediction model based on the improved SSA-GPR is obtained.
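The validation criterion of step 7000 can be sketched as follows: compute the mean absolute error over the verification set and compare it with the threshold $e^{-10}$. The function names and the sample values are hypothetical.

```python
import math

def mean_absolute_error(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)

THRESHOLD = math.exp(-10)  # e^-10, roughly 4.54e-5

def passes_validation(actual, predicted, threshold=THRESHOLD):
    """True when the model meets the stated validation standard."""
    return mean_absolute_error(actual, predicted) <= threshold

print(mean_absolute_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))
print(passes_validation([1.0, 2.0], [1.0, 2.0]))
```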
A device based on the improved SSA-GPR pulverized coal boiler NOx emission prediction method, characterized by comprising:
a data acquisition module: acquiring characteristic parameter historical data and NOx emission of the coal-fired boiler from a power plant DCS through step 1000 to obtain a first data set;
a data processing module: preprocessing the first data set through steps 2000 and 3000, specifically obtaining the second data set by data normalization and the third data set by dimension reduction;
a training module: establishing the boiler NOx emission prediction model based on the improved SSA-GPR, training the model on the training set obtained from the third data set in step 4000, and verifying the accuracy of the model on the verification set of step 4000;
a prediction module: preprocessing real-time detection data in the operation of the boiler to obtain a data sample, inputting the data sample into a trained NOx emission prediction model based on the improved SSA-GPR, and finally obtaining a NOx emission prediction result.
The above description is only an example of the present invention, and does not limit the scope of the present invention, and it is obvious to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, or improvement made without departing from the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (2)

1. A pulverized coal boiler NOx (nitrogen oxide) emission prediction method based on improved SSA-GPR, characterized by comprising the following steps:
step 1000: acquiring characteristic parameter historical data and NOx emission of a coal-fired boiler from a power plant DCS to form a first data set, wherein the first data set is a two-dimensional matrix X with n rows and m columns: each row is one of the n collected samples, and the columns are the m-1 features associated with each sample together with the NOx emission, so that the n × m values form the matrix X:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}$$
wherein $x_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$) is the value of the j-th feature of the i-th sample;
step 2000: normalizing the data in the first data set of step 1000 to form a second data set D, using the Min-Max normalization method with the following normalization formula:
$$y = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}} \tag{1}$$
in formula (1), MaxValue represents the maximum value of sample data; MinValue represents the minimum value of sample data; x represents sample raw data; y represents the data after normalization;
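A minimal sketch of the Min-Max normalization of step 2000 follows. It assumes that MinValue and MaxValue are taken per column (per feature), which the text does not state explicitly, and it maps a constant column to zeros to avoid dividing by zero. The function names are hypothetical.

```python
def min_max_normalize(column):
    """Scale one feature column to [0, 1] with y = (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # assumption: constant column maps to 0
    return [(x - lo) / (hi - lo) for x in column]

def normalize_matrix(X):
    """Normalize every column of a row-major matrix (list of sample rows)."""
    columns = [min_max_normalize(list(col)) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]

print(min_max_normalize([10.0, 20.0, 30.0]))
print(normalize_matrix([[1.0, 100.0], [3.0, 300.0]]))
```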
step 3000: performing dimensionality reduction on the data in the second data set in the step 2000 by using an RReliefF algorithm and Pearson correlation analysis to form a third data set, and further comprising the steps 3100-3700:
step 3100: calculating relative distances between the respective feature parameters and NOx by an RReliefF algorithm with respect to the second data set described in step 2000, and weighting each feature by the relative distances, further comprising steps 3110-3130:
step 3110: according to the formula (2), calculating the probability that the characteristic values A in the similar samples are different:
$$P_{difA} = P(\text{dif value}(A) \mid \text{nearest samples}) \tag{2}$$
in formula (2), $P_{difA}$ represents the probability that the values of feature A differ between similar samples; the dif function calculates the distance between two samples and is used to find the nearest neighbouring samples; value(A) represents feature A; and nearest samples are samples that are closest to each other in relative distance in the sample space;
step 3120: according to formula (3), calculating the probability that the NOx emission differs between similar samples:
$$P_{difC} = P(\text{dif NOx} \mid \text{nearest samples}) \tag{3}$$
in formula (3), $P_{difC}$ represents the probability that the NOx emission differs between similar samples;
step 3130: obtaining formula (4) from the conditional probabilities to calculate the weight of each characteristic parameter of the boiler:
$$W[A] = \frac{P_{difC|difA}\,P_{difA}}{P_{difC}} - \frac{\left(1 - P_{difC|difA}\right)P_{difA}}{1 - P_{difC}} \tag{4}$$
in formula (4), $W[A]$ represents the weight of each characteristic parameter of the boiler, and $P_{difC|difA}$ represents the probability that the NOx emission differs between similar samples whose values of feature A differ;
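The weighting of steps 3110 to 3130 can be sketched with the usual RReliefF estimator, in which the three probabilities are approximated by accumulated differences over nearest neighbours. This is a simplified illustration, not the patented implementation: it assumes features and target already scaled to [0, 1], visits every sample instead of a random subset, weights the k neighbours uniformly, and uses Manhattan distance. The function name and the sample data are hypothetical.

```python
def rrelieff_weights(X, y, k=3):
    """Estimate one RReliefF weight per feature column of X for target y."""
    n, n_feat = len(X), len(X[0])
    n_dc = 0.0                 # accumulated target differences
    n_da = [0.0] * n_feat      # accumulated feature differences
    n_dc_da = [0.0] * n_feat   # accumulated joint differences
    for i in range(n):
        # k nearest neighbours of sample i in feature space
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sum(abs(a - b) for a, b in zip(X[i], X[j])),
        )[:k]
        for j in neighbours:
            w = 1.0 / len(neighbours)  # uniform neighbour weighting
            d_target = abs(y[i] - y[j])
            n_dc += d_target * w
            for a in range(n_feat):
                d_feat = abs(X[i][a] - X[j][a])
                n_da[a] += d_feat * w
                n_dc_da[a] += d_target * d_feat * w
    if n_dc == 0.0 or n_dc == n:
        raise ValueError("target differences are degenerate")
    return [n_dc_da[a] / n_dc - (n_da[a] - n_dc_da[a]) / (n - n_dc)
            for a in range(n_feat)]

# Feature 0 tracks the target, feature 1 is constant and carries no information.
X = [[0.0, 0.5], [0.1, 0.5], [0.5, 0.5], [1.0, 0.5]]
y = [0.0, 0.1, 0.5, 1.0]
weights = rrelieff_weights(X, y, k=1)
print(weights)
```

In this toy data the informative feature receives a positive weight and the constant feature a weight of zero, which is the ordering that the later threshold step relies on.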
step 3200: the Pearson correlation coefficient between the features is calculated according to equation (5):
$$r_{ij} = \frac{\sum_{k=1}^{n}\left(x_{ki} - \bar{x}_i\right)\left(x_{kj} - \bar{x}_j\right)}{\sqrt{\sum_{k=1}^{n}\left(x_{ki} - \bar{x}_i\right)^2}\,\sqrt{\sum_{k=1}^{n}\left(x_{kj} - \bar{x}_j\right)^2}} \tag{5}$$
in formula (5), i denotes the i-th column feature, j denotes the j-th column feature, $\bar{x}_i$ is the sample mean of feature i, $\bar{x}_j$ is the sample mean of feature j, and n is the number of samples;
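The Pearson correlation coefficient of step 3200 can be computed directly from its definition, as in this minimal sketch. Returning 0.0 for a constant column is an assumption made here to avoid a division by zero; the function name is hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equally long feature columns."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0.0 or var_v == 0.0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / math.sqrt(var_u * var_v)

print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```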
step 3300: based on the Bootstrap random sampling idea, extracting K sample subsets [Figure FDA0003492931100000025] from the second data set D of step 2000;
step 3400: using the RReliefF algorithm to rank the features of [Figure FDA0003492931100000026] by weight, and deleting the features whose weight is smaller than a first threshold to obtain K different subsets [Figure FDA0003492931100000027];
step 3500: for [Figure FDA0003492931100000028], using Pearson correlation analysis to calculate the Pearson correlation coefficient between every two features and taking its absolute value;
step 3600: according to a preset second threshold, if the absolute correlation coefficient of two features is larger than the second threshold, deleting the later of the two features in the feature ordering of step 3500 to obtain K training subsets [Figure FDA0003492931100000029], whereby redundant data is removed;
step 3700: aggregating the K results and outputting the ranking result that occurs most often, thereby obtaining the features that have the greatest influence on the NOx emission and forming the third data set;
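The resampling, redundancy removal and voting of steps 3300, 3600 and 3700 can be sketched as below. The helpers are hypothetical and deliberately generic: `ranked` is a list of feature indices already ordered by descending weight, and `corr` is any function that returns the absolute correlation of two features. The correlation values in the usage lines are made up for illustration.

```python
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Step 3300: draw n sample indices with replacement."""
    return [rng.randrange(n) for _ in range(n)]

def drop_redundant(ranked, corr, threshold):
    """Step 3600: keep a feature only if it is not too correlated with a kept one."""
    kept = []
    for f in ranked:  # higher-weight features are visited first
        if all(corr(f, g) <= threshold for g in kept):
            kept.append(f)
    return kept

def most_frequent_result(results):
    """Step 3700: return the feature list that occurs most often."""
    counts = Counter(tuple(r) for r in results)
    return list(counts.most_common(1)[0][0])

table = {frozenset((0, 1)): 0.95, frozenset((0, 2)): 0.10, frozenset((1, 2)): 0.20}
corr = lambda a, b: table[frozenset((a, b))]
selected = drop_redundant([0, 1, 2], corr, threshold=0.8)
print(selected)
print(most_frequent_result([[0, 2], [0, 2], [1, 2]]))
```

Visiting features in weight order means that, of two strongly correlated features, the one with the lower weight is the one removed.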
step 4000: dividing the third data set of step 3000 into a training set and a validation set;
step 5000: the improved SSA algorithm is adopted to optimize the hyperparameter of the Gaussian process to obtain the optimized Gaussian process, and the method further comprises the following steps of 5100-5800:
step 5100: selecting the training set and the verification set of the step 4000 as a training set and a test set of the Gaussian process model;
step 5200: determining the hyper-parameters l and [Figure FDA0003492931100000031], wherein the hyper-parameter l represents the characteristic length scale and the hyper-parameter [Figure FDA0003492931100000032] represents the signal variance, and setting the population size N, the finder proportion PD and the maximum number of iterations iterm of the improved SSA;
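To illustrate what the two hyper-parameters of step 5200 control, the sketch below assumes a squared-exponential covariance function, a common Gaussian process kernel that has exactly a length scale and a signal variance. The kernel actually used is not stated in this part of the text, so this choice and the function name are assumptions.

```python
import math

def se_kernel(x1, x2, length_scale, signal_variance):
    """Assumed squared-exponential kernel: s * exp(-|x1 - x2|^2 / (2 * l^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return signal_variance * math.exp(-sq_dist / (2.0 * length_scale ** 2))

# A larger length scale keeps distant inputs more strongly correlated.
print(se_kernel([0.0], [1.0], length_scale=0.5, signal_variance=1.0))
print(se_kernel([0.0], [1.0], length_scale=2.0, signal_variance=1.0))
```

Under this assumption, the sparrow search looks for the pair of values that gives the Gaussian process the best fitness on the training data.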
step 5300: initializing the population with the infinite-folding Sin chaotic map to increase population diversity, generating N solutions according to formula (6), each solution being a two-dimensional vector [Figure FDA0003492931100000033]:
$$X_n = \sin(\delta / x_n), \quad n = 0, 1, \ldots, N \tag{6}$$
in formula (6), $x_n$ denotes the n-th initial individual, $X_n$ denotes the individual obtained by chaotic mapping of the n-th initial individual, $\delta \in (0, 4]$, $-1 \le x_n \le 1$ and $x_n \ne 0$;
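The chaotic initialization of step 5300 can be sketched as follows. The sketch makes several assumptions that the text does not spell out: the map is applied iteratively starting from a seed value `x0`, each chaotic value in [-1, 1] is rescaled linearly into the search bounds of a hyper-parameter, and the bounds themselves are placeholders. The function names are hypothetical.

```python
import math

def sin_chaos_sequence(length, delta=2.0, x0=0.7):
    """Iterate x <- sin(delta / x); every value lies in [-1, 1]."""
    seq, x = [], x0
    for _ in range(length):
        x = math.sin(delta / x)
        if x == 0.0:  # the map is undefined at zero, so nudge away from it
            x = 1e-6
        seq.append(x)
    return seq

def init_population(n, bounds, delta=2.0, x0=0.7):
    """Build n individuals, one coordinate per (low, high) pair in bounds."""
    dim = len(bounds)
    chaos = sin_chaos_sequence(n * dim, delta, x0)
    population = []
    for i in range(n):
        individual = []
        for j, (low, high) in enumerate(bounds):
            c = chaos[i * dim + j]
            individual.append(low + (c + 1.0) / 2.0 * (high - low))
        population.append(individual)
    return population

# Two-dimensional individuals: (length scale, signal variance), placeholder bounds.
pop = init_population(5, [(0.01, 10.0), (0.01, 10.0)])
print(pop)
```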
step 5400: selecting the individuals with the best fitness values as finders according to the finder proportion PD, and updating the finder positions according to formula (7):
Figure FDA0003492931100000034
in formula (7), [Figure FDA0003492931100000035] denotes the position of the i-th finder in the j-th dimension at generation t; t is the current iteration number; iterm is the maximum number of iterations; $\alpha \in (0, 1]$ is a random number; Q is a random number that follows a normal distribution; $R_2 \in (0, 1]$ is the early-warning value; and $ST \in [0.5, 1]$ is the safety value;
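The finder update of step 5400 can be sketched as below. The image of formula (7) is not reproduced in this text, so the sketch assumes the rule of the standard sparrow search algorithm, which uses the same symbols: when the early-warning value is below the safety value the finder contracts its position by an exponential factor, otherwise it takes a normally distributed jump. The function name and sample values are hypothetical.

```python
import math
import random

def update_finder(x, i, iter_max, r2, st, rng):
    """Return the new position of finder i (assumed standard SSA rule)."""
    if r2 < st:
        alpha = 1.0 - rng.random()  # random number in (0, 1]
        factor = math.exp(-i / (alpha * iter_max))
        return [v * factor for v in x]  # safe: wide, gradually shrinking search
    q = rng.gauss(0.0, 1.0)  # danger detected: random normal jump
    return [v + q for v in x]

rng = random.Random(0)
print(update_finder([2.0, -4.0], i=1, iter_max=100, r2=0.3, st=0.8, rng=rng))
print(update_finder([2.0, -4.0], i=1, iter_max=100, r2=0.9, st=0.8, rng=rng))
```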
step 5500: according to formulas (8) and (9), using sensitivity-pheromone matching to improve the way a follower selects a finder, further comprising steps 5510-5530:
step 5510: calculating the pheromone value of the i-th finder by the pheromone calculation method shown in formula (8), wherein the pheromone is a value that is proportional to the fitness value of the finder and is used to mark the finder;
Figure FDA0003492931100000041
in formula (8), $P(i)$ is the pheromone of the i-th finder, $f(i)$ is the current fitness value of the i-th finder, $f_{\min}$ is the smallest finder fitness value and $f_{\max}$ is the largest finder fitness value;
step 5520: calculating the sensitivity of the jth follower, wherein each follower has sensitivity to pheromones, the sensitivity is different in the optimization process, the selection range is expanded by adopting a sensitivity-pheromone matching mode, and the sensitivity calculation is shown as a formula (9):
$$S(j) = S_{\min} + \Delta S_j \tag{9}$$
in formula (9), $\Delta S_j = (S_{\max} - S_{\min}) \cdot \mathrm{Rand}(0,1)$, $S_{\max} = P(i)_{\max}$ and $S_{\min} = P(i)_{\min}$, where $P(i)_{\max}$ is the largest current pheromone value, $P(i)_{\min}$ is the smallest current pheromone value, $S_{\max}$ is the maximum sensitivity and $S_{\min}$ is the minimum sensitivity;
step 5530: finding the finder i that matches the sensitivity of the j-th follower: randomly selecting a finder i that satisfies $P(i) = S(j)$;
step 5600: the follower individual positions are updated according to the constraints set forth for the followers in step 5500 using equation (10):
Figure FDA0003492931100000042
in formula (10), [Figure FDA0003492931100000043] denotes the position of the i-th follower in the j-th dimension at generation t, where t is the current iteration number; [Figure FDA0003492931100000044] is the optimal position occupied by the finders at generation t+1; [Figure FDA0003492931100000045] is the current global worst finder position; d is the dimension; Q is a random number drawn from the standard normal distribution; and $a \in (-1, 1)$;
step 5700: updating the boundary individual positions according to equation (11):
Figure FDA0003492931100000046
in formula (11), [Figure FDA0003492931100000051] denotes the position of the i-th boundary individual in the j-th dimension at generation t, where t is the current iteration number; [Figure FDA0003492931100000052] is the current global optimal position; β is a random number that follows a normal distribution; $K \in [-1, 1]$ is a random number that represents the moving direction of the sparrow and also serves as a step-length control parameter; $f_i$ is the fitness value of the current sparrow individual; $f_g$ and $f_w$ are the current global best and worst fitness values, respectively; and ε is a very small constant that keeps the denominator from being zero;
step 5800: judging whether a sufficiently good fitness value or the maximum number of iterations has been reached; if so, terminating the program and outputting the optimal solution [Figure FDA0003492931100000053], thereby obtaining the optimized Gaussian process hyper-parameters; otherwise, increasing the iteration count by 1 and jumping to step 5400 to continue the search;
step 6000: inputting the training set of the step 4000 into the optimized Gaussian process model of the step 5000 for model training;
step 7000: inputting the verification set of step 4000 into the prediction model trained in step 6000 for model verification, taking as the validation standard that the mean absolute error calculated according to formula (12) does not exceed $e^{-10}$:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{12}$$
in formula (12), MAE is the mean absolute error, n is the number of samples in the verification set, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value;
and when the minimum error or the maximum number of training iterations of the Gaussian process is reached, the final pulverized coal boiler NOx emission prediction model based on the improved SSA-GPR is obtained.
2. A device based on the improved SSA-GPR pulverized coal boiler NOx emission prediction method according to claim 1, characterized in that the device comprises:
a data acquisition module: used for collecting the characteristic parameter historical data and NOx emission of the coal-fired boiler as in step 1000;
a data processing module: used for preprocessing the first data set, implementing the data normalization of step 2000 and the data dimension reduction of step 3000;
a training module: used for establishing the boiler NOx emission prediction model based on the improved SSA-GPR, training the model on the training set obtained from the third data set in step 4000, and verifying the accuracy of the model on the verification set of step 4000;
a prediction module: preprocessing real-time detection data in the operation of the boiler to obtain a data sample, inputting the data sample into a trained NOx emission prediction model based on the improved SSA-GPR, and finally obtaining a NOx emission prediction result.
CN202210102566.XA 2022-01-27 2022-01-27 Pulverized coal boiler NOx emission prediction method and device based on improved SSA-GPR Active CN114464266B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210102566.XA CN114464266B (en) 2022-01-27 2022-01-27 Pulverized coal boiler NOx emission prediction method and device based on improved SSA-GPR

Publications (2)

Publication Number Publication Date
CN114464266A true CN114464266A (en) 2022-05-10
CN114464266B CN114464266B (en) 2022-08-02

Family

ID=81411742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210102566.XA Active CN114464266B (en) 2022-01-27 2022-01-27 Pulverized coal boiler NOx emission prediction method and device based on improved SSA-GPR

Country Status (1)

Country Link
CN (1) CN114464266B (en)

Also Published As

Publication number Publication date
CN114464266B (en) 2022-08-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant