CN112420132A - Product quality optimization control method in gasoline catalytic cracking process - Google Patents


Info

Publication number
CN112420132A
Authority
CN
China
Prior art keywords
data
variables
octane number
variable
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011180154.5A
Other languages
Chinese (zh)
Inventor
毛永芳
黄秋娟
柴毅
张镠
曾建学
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202011180154.5A
Publication of CN112420132A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/10 Analysis or design of chemical reactions, syntheses or processes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Abstract

A product quality optimization control method for the gasoline catalytic cracking process comprises: collecting historical data from M runs of the gasoline catalytic cracking process to obtain raw data containing M samples; performing K-means clustering on the M raw samples to obtain K data sets, preprocessing each data set with the same data cleaning method, and constructing an optimization control model from the processed data; and collecting data on the gasoline to be catalytically cracked as verification data, matching the verification data to a data set of the optimization control model, obtaining from the model the process parameters to be optimized, adjusting those process parameters, and carrying out gasoline catalytic cracking to obtain the final optimized product.

Description

Product quality optimization control method in gasoline catalytic cracking process
Technical Field
The invention relates to the field of gasoline refining, in particular to a product quality optimization control method in a gasoline catalytic cracking process.
Background
About 70 percent of the crude oil imported by China is intermediate-base or naphthenic-base crude, which is characterized by high sulfur content. Gasoline quality strongly affects vehicle performance and emissions, so producing clean gasoline with low sulfur content is critical to reducing atmospheric pollution. Octane number is the most important index of gasoline combustion performance and serves as the commercial grade of gasoline. In the prior art, the operations used to desulfurize catalytic cracking gasoline and reduce its olefin content lower the octane number, which degrades gasoline quality. Although the whole refining operation is controllable, the prior art performs desulfurization and olefin reduction according to fixed specifications, and the adjustment target is merely to stay within the safe range of the current operation. The target product can be obtained this way, but the octane number loss of the product fluctuates widely, which greatly affects gasoline quality. In addition, although conventional octane number measurement is accurate, it is time-consuming and cumbersome, which directly hampers enterprises' control of gasoline quality. According to the relevant literature and a large amount of measured enterprise data, each unit of octane number lost corresponds to a loss of about 150 yuan/ton. Taking a catalytic cracking gasoline refining unit with a capacity of 1 million tons/year as an example, reducing the octane number loss by 0.3 units would yield an economic benefit of about 45 million yuan per year.
In the catalytic cracking desulfurization process, it is very important to keep the octane number as high as possible while ensuring that the sulfur content of the finished product stays below a specified limit. However, because of the complexity of the refining process and the diversity of equipment, the relevant influencing variables are high-dimensional, strongly coupled, nonlinear, and distributed in clusters. The catalytic cracking process also has a large time lag, so the process parameters cannot be adjusted in time according to the quality of the output product to optimize product quality.
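The benefit estimate in the Background follows from a single multiplication. As a worked check, assuming the figure of 150 yuan per ton per octane unit scales linearly and applies to the full annual throughput of 1 million tons:

```latex
\[
150\ \frac{\text{yuan}}{\text{ton}\cdot\text{unit}}
\times 0.3\ \text{unit}
\times 1{,}000{,}000\ \frac{\text{ton}}{\text{year}}
= 45{,}000{,}000\ \frac{\text{yuan}}{\text{year}}
\]
```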
Disclosure of Invention
The invention aims to provide a product quality optimization control method for the gasoline catalytic cracking process.
This aim is achieved by the following technical scheme:
1) Collecting raw data: collect historical data from M runs of the gasoline catalytic cracking process to obtain raw data $T = \{(X_1, Y_1, S_1), (X_2, Y_2, S_2), \ldots, (X_M, Y_M, S_M)\}$ containing M samples, where $X_i = (x_1, x_2, \ldots, x_N)$, $i = 1, \ldots, M$, comprises $N_1$ characteristic variables describing feedstock properties and $N_2$ characteristic variables describing process operating parameters; $Y_i$ is the measured octane number loss and $S_i$ is the measured sulfur content of the product;
2) Data processing: perform K-means clustering on the M raw samples in T collected in step 1) to obtain K data sets, preprocess each data set with the same data cleaning method, and then randomly divide each data set into a training set and a test set at a ratio of 9:1;
3) Constructing the optimization control model: for the training set of a single data set, construct Q octane number loss prediction models and Q sulfur content prediction models based on recursive feature elimination and random forests; feed the test set data into the Q octane number loss prediction models and the Q sulfur content prediction models and compute the loss function of each model; select the best octane number loss prediction model and the best sulfur content prediction model for the data set according to the loss function; and obtain the octane number loss and sulfur content prediction models of all K data sets in the same way, which together form the optimization control model;
4) Product quality optimization control: collect verification data Z for the gasoline to be catalytically cracked; match Z to the data set of the optimization control model obtained in step 3) whose cluster center is at the minimum Euclidean distance; take the characteristic variables in Z that are operating parameters as decision variables and obtain their optimal solution through the optimization control model; then adjust the process parameters of the gasoline catalytic cracking process according to that optimal solution and carry out gasoline catalytic cracking to obtain the final optimized product.
Further, the specific steps of data processing in step 2) are as follows:
2-1) Partitioning the raw data: randomly select K samples from the raw data T collected in step 1) as the initial mean vectors $\{\mu_1, \mu_2, \ldots, \mu_K\}$; compute the distance between each sample $X_i$ and each mean vector $\mu_j$, and assign $X_i$ to the data set $C_j$ of the nearest $\mu_j$; compute the new mean vector $\mu'_j$ of each data set $C_j$; if $\mu_j \neq \mu'_j$, assign $\mu'_j$ to $\mu_j$ and iterate; if $\mu_j = \mu'_j$, output the data sets $C = \{C_1, C_2, \ldots, C_K\}$;
2-2) Data preprocessing: process the samples $t_i$ of each data set obtained in step 2-1) longitudinally to remove abnormal samples, and process the variables $x_L$ transversely to remove variables with a high anomaly rate, obtaining the preprocessed data sets $D_j = \{(X_j, Y_j, S_j)\}$, $j = 1, 2, \ldots, K$;
2-3) Data splitting: randomly shuffle the data of each preprocessed data set $D_j = \{(X_j, Y_j, S_j)\}$, $j = 1, 2, \ldots, K$, from step 2-2), then use 90% of the samples as the training set and the remaining 10% as the test set, obtaining training set data $D_{tr} = \{X_{tr}, Y_{tr}, S_{tr}\}$ and test set data $D_{te} = \{X_{te}, Y_{te}, S_{te}\}$;
where $X_{tr} = \{(X_{11}, X_{12}), (X_{21}, X_{22}), \ldots, (X_{K1}, X_{K2})\}$,
Figure BDA0002749943440000021
and $X_{j1}$, $X_{j2}$ denote the characteristic variables that are feedstock properties and operating parameters, respectively.
Further, the specific steps of partitioning the raw data in step 2-1) are as follows:
2-1-1) Randomly select K samples from the raw data T collected in step 1) as the initial mean vectors $\{\mu_1, \mu_2, \ldots, \mu_K\}$;
2-1-2) Compute the distance between each sample $X_i$ ($i = 1, \ldots, M$) in the raw data T and each mean vector $\mu_j$ ($j = 1, \ldots, K$):
$$d_{ij} = \lVert X_i - \mu_j \rVert_2 \tag{1}$$
and assign $X_i$ to the data set $C_j$ corresponding to the nearest $\mu_j$;
2-1-3) Compute the new mean vector $\mu'_j$ of each data set $C_j$:
$$\mu'_j = \frac{1}{|C_j|} \sum_{X \in C_j} X \tag{2}$$
If $\mu_j \neq \mu'_j$, assign $\mu'_j$ to $\mu_j$ and return to step 2-1-2) to iterate;
if $\mu_j = \mu'_j$, output the data sets $C = \{C_1, C_2, \ldots, C_K\}$, where
$$C_j = \{t_1, t_2, \ldots, t_{m_j}\}$$
$m_j$ is the number of samples in the j-th data set, and each sample $t_i$ ($i = 1, 2, \ldots, m_j$) contains N characteristic variables $x_L$ ($L = 1, 2, \ldots, N$).
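Steps 2-1-1) to 2-1-3) amount to the standard K-means iteration. The following is a minimal NumPy sketch under stated assumptions, not the patent's implementation: the function name `kmeans`, the iteration cap `max_iter`, the fixed seed, the rule of keeping the old mean when a cluster is empty, and the toy data are all illustrative. Convergence is tested with `np.allclose` rather than exact equality of the mean vectors.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain K-means following steps 2-1-1) to 2-1-3)."""
    rng = np.random.default_rng(seed)
    # 2-1-1) pick k distinct samples as the initial mean vectors
    mu = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # 2-1-2) Euclidean distance d_ij between every sample and every mean
        d = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # 2-1-3) recompute each mean; keep the old one if a cluster is empty
        new_mu = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else mu[j]
            for j in range(k)
        ])
        if np.allclose(new_mu, mu):  # mu_j equals mu'_j: stop iterating
            break
        mu = new_mu
    return labels, mu

# Toy data (hypothetical): two well-separated groups of 2-D samples.
X = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
labels, centers = kmeans(X, k=2)
```

The returned `centers` play the role of the cluster centers that step 4) later uses to match new verification data to a data set.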
Further, the specific steps of preprocessing the data in the step 2-2) are as follows:
2-2-1) Process the samples $t_i$ of a single data set longitudinally: compute the arithmetic mean $\bar{t}$ of the samples and the residual $v_i$ of each sample, and compute the standard error $\sigma$ according to the Bessel formula:
$$\bar{t} = \frac{1}{m_j} \sum_{i=1}^{m_j} t_i \tag{3}$$
$$v_i = t_i - \bar{t} \tag{4}$$
$$\sigma = \sqrt{\frac{1}{m_j - 1} \sum_{i=1}^{m_j} v_i^2} \tag{5}$$
If the residual $v_b$ of a measured value $t_b$ satisfies
$$|v_b| = |t_b - \bar{t}| > 3\sigma$$
then $t_b$ is considered a bad value containing a gross error, and the corresponding entry of an all-zero mark matrix of the same shape as the data is set to 1;
the anomaly rate $p_{ei}$ of each variable and the anomaly rate $p_{xe}$ of each sample are then computed from the mark matrix:
Figure BDA0002749943440000037
Figure BDA0002749943440000038
In formula (7), P is the number of variables $x_L$ in a sample whose sampled values are abnormal;
a threshold is set for the sample anomaly rate $p_{xe}$: samples that exceed the threshold are removed directly, and the abnormal values in samples that do not exceed it are replaced by mean imputation;
2-2-2) Process the variables $x_L$ transversely: use the Pearson correlation coefficient $\rho$ to measure the correlation between each variable $x_L$ ($L = 1, 2, \ldots, N$) and the key target variable under study, y or s, computed as follows:
$$\rho_{xy} = \frac{\operatorname{cov}(x_L, y)}{\sigma_{x_L}\,\sigma_y} \tag{8}$$
$$\rho_{xs} = \frac{\operatorname{cov}(x_L, s)}{\sigma_{x_L}\,\sigma_s} \tag{9}$$
Thresholds are set simultaneously on the correlation coefficients $\rho_{xy}$, $\rho_{xs}$ and on the variable anomaly rate $p_{ei}$, and variables with a high anomaly rate are eliminated according to these thresholds;
2-2-3) Normalize the data processed in steps 2-2-1) and 2-2-2):
Figure BDA0002749943440000043
obtaining the preprocessed data sets $D_j = \{(X_j, Y_j, S_j)\}$, $j = 1, 2, \ldots, K$.
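The longitudinal cleaning of step 2-2-1) can be sketched in NumPy as follows. This is an illustration under assumptions, not the patent's code: the function name `three_sigma_clean` and the threshold value `sample_thresh` are hypothetical, and the two anomaly rates are computed as simple fractions of flagged entries, which is one plausible reading of formulas (6) and (7).

```python
import numpy as np

def three_sigma_clean(T, sample_thresh=0.5):
    """Flag |v| > 3*sigma per variable, drop bad samples, impute the rest."""
    T = np.asarray(T, dtype=float)        # rows = samples, columns = variables
    mean = T.mean(axis=0)                 # arithmetic mean of each variable
    resid = T - mean                      # residual v_i of each measurement
    sigma = T.std(axis=0, ddof=1)         # Bessel-corrected standard error
    flags = np.abs(resid) > 3.0 * sigma   # mark matrix: True = bad value
    p_var = flags.mean(axis=0)            # assumed form of p_ei (per variable)
    p_sample = flags.mean(axis=1)         # assumed form of p_xe (per sample)
    keep = p_sample <= sample_thresh      # drop samples above the threshold
    T_kept, f_kept = T[keep], flags[keep]
    # Mean imputation: column mean over the unflagged values of kept samples.
    col_mean = np.nanmean(np.where(f_kept, np.nan, T_kept), axis=0)
    cleaned = np.where(f_kept, col_mean, T_kept)
    return cleaned, p_var, keep

# Toy data (hypothetical): 20 samples, 2 variables, one gross error.
T = np.ones((20, 2))
T[:, 1] = 2.0
T[7, 0] = 100.0
cleaned, p_var, keep = three_sigma_clean(T)
```

Note that with the Bessel-corrected estimate a single outlier can only exceed $3\sigma$ when the data set has at least 11 samples, which is why the toy data uses 20.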
Further, the specific steps of constructing the optimization control model in step 3) are as follows:
3-1) Use the N characteristic variables $X_{tr}$ and the octane number loss $Y_{tr}$ in the training set data to construct a random-forest octane number loss prediction model f, and record the characteristic variable combination of the model; use the test set data $X_{te}$ and $Y_{te}$ to compute the loss function of the model, the root mean square error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{11}$$
In formula (11), n is the number of test samples, $y_i$ is the true octane number loss of the i-th sample, and $\hat{y}_i$ is the corresponding model estimate;
3-2) Compute in turn the importance of the N characteristic variables in the training set data, where the characteristic variables $X_{tr}$ and the octane number loss $Y_{tr}$ of the training set obey a Gaussian distribution:
Figure BDA0002749943440000046
In formula (12), $\tau = (\tau_1, \tau_2, \ldots, \tau_N)^T$, $\tau_i = C(X_i, Y)$, and $C = [C(X_i, X_j)]$ is the covariance matrix of $X_{tr}$;
the importance of the i-th characteristic variable is:
Figure BDA0002749943440000047
In formula (13), $\alpha_i = [C^{-1}\tau]_i$ and $V(\cdot)$ denotes the variance function;
3-3) Sort the N characteristic variables by importance $I(x_i)$ and delete the c least important ones to obtain a new combination of N' characteristic variables, where $N' = N - c$;
3-4) Return the new combination of N' characteristic variables to step 3-1) and repeat steps 3-1) to 3-3) until the number of characteristic variables N is 0, obtaining Q octane number loss prediction models $F = (f_1, f_2, \ldots, f_Q)$ and Q characteristic variable combinations $v = (v_1, v_2, \ldots, v_Q)$;
3-5) Compare the root mean square errors of the Q prediction models and select the model with the smallest root mean square error as the octane number loss prediction model of the data set, together with its characteristic variable combination; within that combination, the variables that are feedstock properties are denoted $v_{y1}$ and the variables that are operating parameters are denoted $v_{y2}$;
3-6) Repeat steps 3-1) to 3-5) for each of the K data sets to obtain the octane number loss prediction models of the K data sets and their corresponding characteristic variable combinations;
3-7) Using the training set data $D_{tr} = \{X_{tr}, S_{tr}\}$ and the test set data $D_{te} = \{X_{te}, S_{te}\}$, obtain the product sulfur content prediction models $S_j$ of the K data sets by the method of steps 3-1) to 3-6), together with the corresponding feedstock property variables $v_{s1}$ and operating parameter variables $v_{s2}$.
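The loop of steps 3-1) to 3-5) can be sketched with scikit-learn's `RandomForestRegressor`. This is a sketch under assumptions rather than the patent's method: the forest's impurity-based `feature_importances_` is swapped in for the Gaussian-based importance of formula (13), and the function name `rfe_random_forest`, the parameter `drop` (the patent's c), the forest size, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rfe_random_forest(X_tr, y_tr, X_te, y_te, drop=1, seed=0):
    """Return (rmse, feature_indices, model) of the best model on the RFE path."""
    features = list(range(X_tr.shape[1]))
    best = None
    while features:
        # 3-1) fit a random forest on the current feature combination
        model = RandomForestRegressor(n_estimators=50, random_state=seed)
        model.fit(X_tr[:, features], y_tr)
        pred = model.predict(X_te[:, features])
        rmse = float(np.sqrt(np.mean((y_te - pred) ** 2)))  # test-set RMSE
        # 3-5) remember the combination with the smallest RMSE so far
        if best is None or rmse < best[0]:
            best = (rmse, list(features), model)
        # 3-2), 3-3) rank features and delete the `drop` least important ones
        # (assumption: impurity-based importances stand in for I(x_i))
        order = np.argsort(model.feature_importances_)  # ascending
        to_drop = {features[i] for i in order[:drop]}
        features = [f for f in features if f not in to_drop]
    return best

# Synthetic data (hypothetical): only feature 0 drives the target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = 3.0 * X[:, 0] + 0.05 * rng.normal(size=200)
best_rmse, best_features, best_model = rfe_random_forest(
    X[:180], y[:180], X[180:], y[180:]
)
```

Running the same function once per data set with the octane number loss as the target, and once with the sulfur content as the target, would give the 2K models described in steps 3-6) and 3-7).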
Further, the specific steps of performing product quality optimization control on the gasoline catalytic cracking process by using the octane number loss prediction model and the sulfur content prediction model in the step 4) are as follows:
4-1) Collect data Z for the gasoline to be catalytically cracked and compute the Euclidean distance $d_j$ ($j = 1, 2, \ldots, K$) between the verification data Z and each cluster center; following the minimum Euclidean distance principle, match Z to the corresponding data set of the optimization control model constructed in step 3), and obtain the corresponding octane number loss prediction model $f(Z_1, Z_2)$ and sulfur content prediction model $S(Z_{S1}, Z_{S2})$, where $Z_1$ and $Z_2$ are the characteristic variables of the octane number loss prediction model that are feedstock properties and operating parameters, respectively, and $Z_{S1}$ and $Z_{S2}$ are the characteristic variables of the product sulfur content prediction model that are feedstock properties and operating parameters, respectively;
4-2) Based on the octane number loss prediction model $f(Z_1, Z_2)$ and the sulfur content prediction model $S(Z_{S1}, Z_{S2})$ from step 4-1), hold the feedstock property variables $Z_1$ and $Z_{S1}$ fixed, take the operating parameter variables $Z_2$ and $Z_{S2}$ as decision variables, and optimize the octane number loss prediction model and the product sulfur content prediction model simultaneously; the objective function of this multi-objective optimization problem is:
Figure BDA0002749943440000051
In formula (14), each set $\Delta$ represents the adjustable range of an operating parameter. A genetic algorithm is used to find the global optimal solutions $Z_2^*$ and $Z_{S2}^*$. The decision variables for optimizing the quality of gasoline catalytic cracking products are the union of the operating parameter variables $Z_2$ of the octane number loss prediction model and the operating parameter variables $Z_{S2}$ of the product sulfur content prediction model, so the optimal solution of the decision variables is $Z^* = Z_2^* \cup Z_{S2}^*$;
4-3) Adjusting process parameters: adjust the process parameters of the gasoline catalytic cracking process according to the optimized solution $Z^*$ obtained in step 4-2), and carry out gasoline catalytic cracking to obtain the final optimized product.
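Step 4-2) searches the adjustable ranges of the operating parameters with a genetic algorithm. The sketch below shows one simple real-coded variant (binary tournament selection, arithmetic crossover, Gaussian mutation, elitism) and is not taken from the patent. Several things are assumptions made only for illustration: the two objectives are combined by minimizing octane loss with a penalty when sulfur exceeds a cap `SULFUR_CAP`, the surrogate functions `octane_loss` and `sulfur` are toy stand-ins for the trained models, and a single vector `z` represents the union of the operating parameter variables.

```python
import numpy as np

def genetic_minimize(cost, lower, upper, pop=40, gens=60, seed=0):
    """Minimize cost(z) over the box [lower, upper] with a simple real-coded GA."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    P = rng.uniform(lower, upper, size=(pop, lower.size))
    for _ in range(gens):
        fit = np.array([cost(z) for z in P])
        elite = P[fit.argmin()].copy()
        # binary tournament selection
        a, b = rng.integers(pop, size=(2, pop))
        parents = np.where((fit[a] < fit[b])[:, None], P[a], P[b])
        # arithmetic crossover with a randomly permuted mate
        mates = parents[rng.permutation(pop)]
        w = rng.uniform(size=(pop, 1))
        children = w * parents + (1.0 - w) * mates
        # Gaussian mutation, then clip back into the adjustable range
        children += rng.normal(0.0, 0.05, children.shape) * (upper - lower)
        P = np.clip(children, lower, upper)
        P[0] = elite  # elitism: the best individual always survives
    fit = np.array([cost(z) for z in P])
    return P[fit.argmin()], float(fit.min())

# Toy surrogates (hypothetical) standing in for f(Z1, Z2) and S(ZS1, ZS2)
# with the feedstock property variables already fixed.
def octane_loss(z):
    return 1.0 + (z[0] - 0.3) ** 2 + (z[1] - 0.7) ** 2

def sulfur(z):
    return 5.0 - 4.0 * z[0]

SULFUR_CAP = 4.0  # assumed product limit, not a value from the patent

def cost(z):
    return octane_loss(z) + 1e3 * max(0.0, sulfur(z) - SULFUR_CAP)

z_star, best_cost = genetic_minimize(cost, lower=[0.0, 0.0], upper=[1.0, 1.0])
```

A penalty formulation is only one way to handle two objectives; a Pareto-based method such as NSGA-II would be an alternative if the trade-off curve itself is of interest.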
By adopting the above technical scheme, the invention has the following advantages:
1. Unlike traditional data association or mechanism modeling, the method mines the factors most relevant to the octane number of the finished product from a large amount of data, which minimizes the influence of data noise on model construction and prediction;
2. Data that are distributed in clusters with a complex overall distribution are clustered first, which reduces the complexity of the data distribution; a prediction model is then built for each cluster, which eases the fitting burden on each prediction model;
3. Product quality can be predicted promptly from the prediction models, which overcomes the delay of the production process; the process parameters that achieve the optimization target are solved from the prediction results with a multi-objective optimization algorithm, so the process parameters can be adjusted in time to optimize product quality.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof.
Drawings
The drawings of the present invention are described below.
FIG. 1 is a flow chart of the present invention.
Detailed Description
The invention is further illustrated by the following figures and examples.
A method for optimizing and controlling the quality of products in the gasoline catalytic cracking process comprises the following steps:
1) Collecting raw data: collect historical data from M runs of the gasoline catalytic cracking process to obtain raw data $T = \{(X_1, Y_1, S_1), (X_2, Y_2, S_2), \ldots, (X_M, Y_M, S_M)\}$ containing M samples, where $X_i = (x_1, x_2, \ldots, x_N)$, $i = 1, \ldots, M$, comprises $N_1$ characteristic variables describing feedstock properties and $N_2$ characteristic variables describing process operating parameters; $Y_i$ is the measured octane number loss and $S_i$ is the measured sulfur content of the product;
2) Data processing: perform K-means clustering on the M raw samples in T collected in step 1) to obtain K data sets, preprocess each data set with the same data cleaning method, and then randomly divide each data set into a training set and a test set at a ratio of 9:1; the specific steps are as follows:
2-1) Partitioning the raw data: randomly select K samples from the raw data T collected in step 1) as the initial mean vectors $\{\mu_1, \mu_2, \ldots, \mu_K\}$; compute the distance between each sample $X_i$ and each mean vector $\mu_j$, and assign $X_i$ to the data set $C_j$ of the nearest $\mu_j$; compute the new mean vector $\mu'_j$ of each data set $C_j$; if $\mu_j \neq \mu'_j$, assign $\mu'_j$ to $\mu_j$ and iterate; if $\mu_j = \mu'_j$, output the data sets $C = \{C_1, C_2, \ldots, C_K\}$; the specific steps are as follows:
2-1-1) Randomly select K samples from the raw data T collected in step 1) as the initial mean vectors $\{\mu_1, \mu_2, \ldots, \mu_K\}$;
2-1-2) Compute the distance between each sample $X_i$ ($i = 1, \ldots, M$) in the raw data T and each mean vector $\mu_j$ ($j = 1, \ldots, K$):
$$d_{ij} = \lVert X_i - \mu_j \rVert_2 \tag{15}$$
and assign $X_i$ to the data set $C_j$ corresponding to the nearest $\mu_j$;
2-1-3) Compute the new mean vector $\mu'_j$ of each data set $C_j$:
$$\mu'_j = \frac{1}{|C_j|} \sum_{X \in C_j} X \tag{16}$$
If $\mu_j \neq \mu'_j$, assign $\mu'_j$ to $\mu_j$ and return to step 2-1-2) to iterate;
if $\mu_j = \mu'_j$, output the data sets $C = \{C_1, C_2, \ldots, C_K\}$, where
$$C_j = \{t_1, t_2, \ldots, t_{m_j}\}$$
$m_j$ is the number of samples in the j-th data set, and each sample $t_i$ ($i = 1, 2, \ldots, m_j$) contains N characteristic variables $x_L$ ($L = 1, 2, \ldots, N$).
2-2) Data preprocessing: process the samples $t_i$ of each data set obtained in step 2-1) longitudinally to remove abnormal samples, and process the variables $x_L$ transversely to remove variables with a high anomaly rate, obtaining the preprocessed data sets $D_j = \{(X_j, Y_j, S_j)\}$, $j = 1, 2, \ldots, K$; the specific steps are as follows:
2-2-1) Process the samples $t_i$ of a single data set longitudinally: compute the arithmetic mean $\bar{t}$ of the samples and the residual $v_i$ of each sample, and compute the standard error $\sigma$ according to the Bessel formula:
$$\bar{t} = \frac{1}{m_j} \sum_{i=1}^{m_j} t_i \tag{17}$$
$$v_i = t_i - \bar{t} \tag{18}$$
$$\sigma = \sqrt{\frac{1}{m_j - 1} \sum_{i=1}^{m_j} v_i^2} \tag{19}$$
If the residual $v_b$ of a measured value $t_b$ satisfies
$$|v_b| = |t_b - \bar{t}| > 3\sigma$$
then $t_b$ is considered a bad value containing a gross error, and the corresponding entry of an all-zero mark matrix of the same shape as the data is set to 1;
the anomaly rate $p_{ei}$ of each variable and the anomaly rate $p_{xe}$ of each sample are then computed from the mark matrix:
Figure BDA0002749943440000083
Figure BDA0002749943440000084
In formula (21), P is the number of variables $x_L$ in a sample whose sampled values are abnormal;
a threshold is set for the sample anomaly rate $p_{xe}$: samples that exceed the threshold are removed directly, and the abnormal values in samples that do not exceed it are replaced by mean imputation;
2-2-2) Process the variables $x_L$ transversely: use the Pearson correlation coefficient $\rho$ to measure the correlation between each variable $x_L$ ($L = 1, 2, \ldots, N$) and the key target variable under study, y or s, computed as follows:
$$\rho_{xy} = \frac{\operatorname{cov}(x_L, y)}{\sigma_{x_L}\,\sigma_y} \tag{22}$$
$$\rho_{xs} = \frac{\operatorname{cov}(x_L, s)}{\sigma_{x_L}\,\sigma_s} \tag{23}$$
Thresholds are set simultaneously on the correlation coefficients $\rho_{xy}$, $\rho_{xs}$ and on the variable anomaly rate $p_{ei}$, and variables with a high anomaly rate are eliminated according to these thresholds;
2-2-3) Normalize the data processed in steps 2-2-1) and 2-2-2):
Figure BDA0002749943440000087
obtaining the preprocessed data sets $D_j = \{(X_j, Y_j, S_j)\}$, $j = 1, 2, \ldots, K$.
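The transverse screening of step 2-2-2) can be illustrated with a short NumPy sketch. The rule used here, keeping a variable when it correlates with at least one target and its anomaly rate is low, is an assumed way of combining the thresholds, since the text does not spell the rule out; the function name `screen_variables` and the threshold values `rho_min` and `p_max` are hypothetical.

```python
import numpy as np

def screen_variables(X, y, s, p_var, rho_min=0.1, p_max=0.2):
    """Keep variables that are rarely anomalous and correlate with y or s."""
    def pearson(a, b):
        a, b = a - a.mean(), b - b.mean()
        denom = np.sqrt((a @ a) * (b @ b))
        # a constant variable has zero variance; treat it as uncorrelated
        return 0.0 if denom == 0 else float(a @ b / denom)

    n_vars = X.shape[1]
    rho_xy = np.array([pearson(X[:, L], y) for L in range(n_vars)])
    rho_xs = np.array([pearson(X[:, L], s) for L in range(n_vars)])
    relevant = (np.abs(rho_xy) >= rho_min) | (np.abs(rho_xs) >= rho_min)
    keep = relevant & (np.asarray(p_var) <= p_max)
    return keep, rho_xy, rho_xs

# Toy data (hypothetical): variable 1 is constant, variable 2 is often anomalous.
t = np.arange(10.0)
X = np.column_stack([t, np.full(10, 3.0), -t])
y, s = 2.0 * t, t + 1.0
p_var = [0.0, 0.0, 0.5]
keep, rho_xy, rho_xs = screen_variables(X, y, s, p_var)
```

Here only the first variable survives: the second carries no correlation with either target, and the third is removed for its high anomaly rate despite being perfectly (negatively) correlated.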
2-3) Data splitting: randomly shuffle the data of each preprocessed data set $D_j = \{(X_j, Y_j, S_j)\}$, $j = 1, 2, \ldots, K$, from step 2-2), then use 90% of the samples as the training set and the remaining 10% as the test set, obtaining training set data $D_{tr} = \{X_{tr}, Y_{tr}, S_{tr}\}$ and test set data $D_{te} = \{X_{te}, Y_{te}, S_{te}\}$;
where $X_{tr} = \{(X_{11}, X_{12}), (X_{21}, X_{22}), \ldots, (X_{K1}, X_{K2})\}$,
Figure BDA0002749943440000088
and $X_{j1}$, $X_{j2}$ denote the characteristic variables that are feedstock properties and operating parameters, respectively; the test set data are organized in the same way.
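The 9:1 random split of step 2-3) is straightforward; a minimal sketch follows, with the function name `split_9_to_1`, the fixed seed, and the toy arrays as illustrative assumptions.

```python
import numpy as np

def split_9_to_1(X, Y, S, seed=0):
    """Shuffle one data set and split it 90% / 10% into training and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # random ordering of the samples
    n_tr = int(round(0.9 * len(X)))        # size of the training set
    tr, te = idx[:n_tr], idx[n_tr:]
    return (X[tr], Y[tr], S[tr]), (X[te], Y[te], S[te])

# Toy data (hypothetical): 50 samples whose single feature is the sample id.
X = np.arange(50).reshape(50, 1)
Y = np.arange(50) * 0.1
S = np.arange(50) * 2.0
train, test = split_9_to_1(X, Y, S)
```

The same index arrays are applied to X, Y, and S so that each sample keeps its own octane number loss and sulfur content after shuffling.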
3) Constructing the optimization control model: for the training set of a single data set, construct Q octane number loss prediction models and Q sulfur content prediction models based on recursive feature elimination and random forests; feed the test set data into the Q octane number loss prediction models and the Q sulfur content prediction models and compute the loss function of each model; select the best octane number loss prediction model and the best sulfur content prediction model for the data set according to the loss function; and obtain the octane number loss and sulfur content prediction models of all K data sets in the same way, which together form the optimization control model; the specific steps are as follows:
3-1) Use the N characteristic variables $X_{tr}$ and the octane number loss $Y_{tr}$ in the training set data to construct a random-forest octane number loss prediction model f, and record the characteristic variable combination of the model; use the test set data $X_{te}$ and $Y_{te}$ to compute the loss function of the model, the root mean square error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{25}$$
In formula (25), n is the number of test samples, $y_i$ is the true octane number loss of the i-th sample, and $\hat{y}_i$ is the corresponding model estimate;
3-2) Compute in turn the importance of the N characteristic variables in the training set data, where the characteristic variables $X_{tr}$ and the octane number loss $Y_{tr}$ of the training set obey a Gaussian distribution:
Figure BDA0002749943440000093
In formula (26), $\tau = (\tau_1, \tau_2, \ldots, \tau_N)^T$, $\tau_i = C(X_i, Y)$, and $C = [C(X_i, X_j)]$ is the covariance matrix of $X_{tr}$;
the importance of the i-th characteristic variable is:
Figure BDA0002749943440000094
In formula (27), $\alpha_i = [C^{-1}\tau]_i$ and $V(\cdot)$ denotes the variance function;
3-3) Sort the N characteristic variables by importance $I(x_i)$ and delete the c least important ones to obtain a new combination of N' characteristic variables, where $N' = N - c$;
3-4) Return the new combination of N' characteristic variables to step 3-1) and repeat steps 3-1) to 3-3) until the number of characteristic variables N is 0, obtaining Q octane number loss prediction models $F = (f_1, f_2, \ldots, f_Q)$ and Q characteristic variable combinations $v = (v_1, v_2, \ldots, v_Q)$;
3-5) Compare the root mean square errors of the Q prediction models and select the model with the smallest root mean square error as the octane number loss prediction model of the data set, together with its characteristic variable combination; within that combination, the variables that are feedstock properties are denoted $v_{y1}$ and the variables that are operating parameters are denoted $v_{y2}$;
3-6) Repeat steps 3-1) to 3-5) for each of the K data sets to obtain the octane number loss prediction models of the K data sets and their corresponding characteristic variable combinations;
3-7) Using the training set data $D_{tr} = \{X_{tr}, S_{tr}\}$ and the test set data $D_{te} = \{X_{te}, S_{te}\}$, obtain the product sulfur content prediction models $S_j$ of the K data sets by the method of steps 3-1) to 3-6), together with the corresponding feedstock property variables $v_{s1}$ and operating parameter variables $v_{s2}$.
4) Product quality optimization control: collect verification data Z for the gasoline to be catalytically cracked; match Z to the data set of the optimization control model obtained in step 3) whose cluster center is at the minimum Euclidean distance; take the characteristic variables in Z that are operating parameters as decision variables and obtain their optimal solution through the optimization control model; then adjust the process parameters of the gasoline catalytic cracking process according to that optimal solution and carry out gasoline catalytic cracking to obtain the final optimized product; the specific steps are as follows:
4-1) Collect data Z for the gasoline to be catalytically cracked and compute the Euclidean distance $d_j$ ($j = 1, 2, \ldots, K$) between the verification data Z and each cluster center; following the minimum Euclidean distance principle, match Z to the corresponding data set of the optimization control model constructed in step 3), and obtain the corresponding octane number loss prediction model $f(Z_1, Z_2)$ and sulfur content prediction model $S(Z_{S1}, Z_{S2})$, where $Z_1$ and $Z_2$ are the characteristic variables of the octane number loss prediction model that are feedstock properties and operating parameters, respectively, and $Z_{S1}$ and $Z_{S2}$ are the characteristic variables of the product sulfur content prediction model that are feedstock properties and operating parameters, respectively;
4-2) predicting model f (Z) according to octane number loss in step 4-1)1,Z2) And sulfur content prediction model S (Z)S1,ZS2) Fixing the characteristic variable Z belonging to the nature of the feedstock1And ZS1Unchanged as a characteristic variable Z belonging to the operating parameter2And ZS2Simultaneously optimizing an octane number loss prediction model and a finished product sulfur content prediction model for decision variables, wherein an optimization objective function of the multi-objective optimization problem is as follows:
Figure BDA0002749943440000101
In formula (28), each set $\Delta$ represents the adjustable range of the corresponding operating parameter; a genetic algorithm is used to solve for the global optimal solutions
Figure BDA0002749943440000102
And
Figure BDA0002749943440000103
The decision variables for optimizing the quality of the gasoline catalytic cracking products are the union of the characteristic variables $Z_2$ belonging to the operating parameters in the octane number loss prediction model and the characteristic variables $Z_{S2}$ belonging to the operating parameters in the product sulfur content prediction model, i.e. the optimal solution of the decision variables is
Figure BDA0002749943440000111
4-3) adjusting the process parameters: according to the optimal solution $Z^*$ obtained in step 4-2), adjusting the process parameters of the gasoline catalytic cracking process, and performing gasoline catalytic cracking to obtain the final optimized product.
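The matching in step 4-1) and the union of decision variables in step 4-2) can be sketched as follows. This is a minimal illustration, not the patented implementation: the cluster centers, the sample `z`, and the parameter names (`reactor_temp`, `pressure`, `h2_ratio`) are hypothetical, and the sketch assumes that Z has already been preprocessed in the same way as the training data.

```python
import math

def match_cluster(z, centers):
    """Return the index j of the cluster center nearest to z (Euclidean distance)."""
    distances = [math.dist(z, c) for c in centers]  # d_j for j = 1..k
    return min(range(len(centers)), key=distances.__getitem__)

def decision_variables(octane_ops, sulfur_ops):
    """Union of the operating parameters used by the two models, keeping first-seen order."""
    return list(dict.fromkeys(list(octane_ops) + list(sulfur_ops)))

# Hypothetical cluster centers and verification sample, for illustration only.
centers = [(0.0, 0.0), (10.0, 10.0), (5.0, -5.0)]
z = (9.0, 8.0)
j = match_cluster(z, centers)  # index of the data set whose models will be used

# Hypothetical operating parameters selected by each model.
ops = decision_variables(["reactor_temp", "pressure"], ["pressure", "h2_ratio"])
```

The index `j` selects which pair of models $f$ and $S$ is optimized; `ops` lists the variables the optimizer is allowed to change.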
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (6)

1. A method for optimizing and controlling the quality of products in the gasoline catalytic cracking process is characterized by comprising the following specific steps:
1) collecting original data: collecting historical data of M runs of the gasoline catalytic cracking process to obtain original data $T=\{(X_1,Y_1,S_1),(X_2,Y_2,S_2),\ldots,(X_M,Y_M,S_M)\}$ containing M samples, where $X_i=(x_1,x_2,\ldots,x_N)$, $i=1,\ldots,M$, comprises the characteristic variables of $N_1$ material parameters and the characteristic variables of $N_2$ process operating parameters in the gasoline catalytic cracking process; $Y_i$ is the measured octane number loss value, and $S_i$ is the measured sulfur content of the product;
2) data processing: performing K-means clustering on the M original data T acquired in step 1) to obtain K data sets, preprocessing the data of each data set with the same data cleaning method to obtain processed data, and then randomly dividing each data set into a training set and a test set at a ratio of 9:1;
3) constructing an optimization control model: for the training set of a single data set, constructing Q octane number loss prediction models and Q sulfur content prediction models based on a recursive characteristic variable elimination algorithm and a random forest; applying the Q octane number loss prediction models and the Q sulfur content prediction models to the test set data and calculating the loss function of each model; selecting the optimal octane number loss prediction model and the optimal sulfur content prediction model of the data set according to the loss functions; and obtaining the octane number loss prediction models and the sulfur content prediction models of the K data sets by the same method, thereby obtaining the optimization control model;
4) optimizing and controlling the product quality: acquiring verification data Z of the gasoline to be catalytically cracked; assigning the verification data to the data set of the optimization control model obtained in step 3) whose cluster center is nearest in Euclidean distance; taking the characteristic variables in the verification data Z that belong to the operating parameters as decision variables, and obtaining the optimal solution of the decision variables through the optimization control model; and adjusting the process parameters of the gasoline catalytic cracking process according to the optimal solution of the decision variables, and performing gasoline catalytic cracking to obtain the final optimized product.
2. The method for optimizing and controlling the product quality of the gasoline catalytic cracking process as claimed in claim 1, wherein the data processing in the step 2) comprises the following steps:
2-1) partitioning the original data: randomly selecting K samples from the original data T collected in step 1) as the initial mean vectors $\{\mu_1,\mu_2,\ldots,\mu_k\}$; calculating the distance between each sample $X_i$ and each mean vector $\mu_j$, and assigning $X_i$ to the data set $C_j$ corresponding to the nearest $\mu_j$; computing the new mean vector $\mu'_j$ of the data set $C_j$; if $\mu_j\neq\mu'_j$, assigning $\mu'_j$ to $\mu_j$ and iteratively updating the mean vectors; if $\mu_j=\mu'_j$, outputting the data sets $C=\{C_1,C_2,\ldots,C_K\}$;
2-2) data preprocessing: performing longitudinal processing on the samples $t_i$ of each single data set obtained in step 2-1) to remove abnormal values from the samples, and performing transverse processing on the variables $x_L$ to remove variables with a high abnormal occurrence rate, obtaining the preprocessed data sets $D_j\{(X_j,Y_j,S_j)\}$ ($j=1,2,\ldots,k$);
2-3) data splitting: randomly shuffling the data of each single data set $D_j\{(X_j,Y_j,S_j)\}$ ($j=1,2,\ldots,k$) preprocessed in step 2-2), and taking 90% of the samples as the training set and the remaining 10% of the samples as the test set, obtaining the training set data $D_{tr}\{X_{tr},Y_{tr},S_{tr}\}$ and the test set data $D_{te}\{X_{te},Y_{te},S_{te}\}$;
where $X_{tr}\{(X_{11},X_{12}),(X_{21},X_{22}),\ldots,(X_{k1},X_{k2})\}$,
Figure FDA0002749943430000021
$X_{j1}$ and $X_{j2}$ respectively denote the variables among the characteristic variables that belong to the material properties and to the operating parameters.
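The 9:1 random split of step 2-3) can be sketched as below. This is an illustrative sketch only: the function name `split_train_test`, the fixed seed, and the toy `dataset` of (X, Y, S) tuples are assumptions introduced here, not part of the claimed method.

```python
import random

def split_train_test(samples, test_fraction=0.1, seed=0):
    """Shuffle the samples and hold out `test_fraction` of them as the test set."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(samples)       # copy, so the caller's data is not reordered
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical preprocessed data set D_j: one (X, Y, S) tuple per sample.
dataset = [(i, i * 0.1, i * 0.01) for i in range(50)]
train, test = split_train_test(dataset)
```

The split is applied to each of the K data sets separately, so every cluster gets its own training and test sets.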
3. The method for optimizing and controlling the product quality of the gasoline catalytic cracking process as claimed in claim 2, wherein the step 2-1) comprises the following steps:
2-1-1) randomly selecting K samples from the original data T collected in step 1) as the initial mean vectors $\{\mu_1,\mu_2,\ldots,\mu_k\}$;
2-1-2) calculating the distance between each sample $X_i$ ($i=1,\ldots,M$) in the original data T and each mean vector $\mu_j$ ($j=1,\ldots,k$):
$d_{ij}=\lVert X_i-\mu_j\rVert_2$ (1)
and assigning $X_i$ to the data set $C_j$ corresponding to the nearest $\mu_j$;
2-1-3) computing the new mean vector $\mu'_j$ of the data set $C_j$:
Figure FDA0002749943430000022
if $\mu_j\neq\mu'_j$, assigning $\mu'_j$ to $\mu_j$ and returning to step 2-1-2) to iteratively update the mean vectors;
if $\mu_j=\mu'_j$, outputting the data sets $C=\{C_1,C_2,\ldots,C_K\}$, where
Figure FDA0002749943430000023
$m_j$ is the number of samples in the j-th data set, and each sample $t_i$ ($i=1,2,\ldots,m_j$) contains N characteristic variables $x_L$ ($L=1,2,\ldots,N$).
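Steps 2-1-1) to 2-1-3) describe the usual K-means iteration. A minimal sketch in plain Python is shown below, assuming that the new mean vector of formula (2) is the arithmetic mean of the samples in each data set (the formula itself is an image in this text). The seed, the iteration cap `max_iter`, and the rule of keeping the old center for an empty data set are assumptions added for the sketch.

```python
import math
import random

def kmeans(samples, k, seed=0, max_iter=100):
    """Partition `samples` (tuples of floats) into k data sets by K-means."""
    rng = random.Random(seed)
    centers = [tuple(s) for s in rng.sample(samples, k)]  # 2-1-1) initial mean vectors
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # 2-1-2) assign each sample to the data set of its nearest mean vector
        clusters = [[] for _ in range(k)]
        for x in samples:
            j = min(range(k), key=lambda c: math.dist(x, centers[c]))
            clusters[j].append(x)
        # 2-1-3) recompute each mean vector as the mean of its members
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(centers[j])  # assumption: keep an empty set's old center
        if new_centers == centers:  # mean vectors unchanged: converged
            break
        centers = new_centers
    return centers, clusters

# Toy data with two well-separated groups, for illustration only.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, clusters = kmeans(points, k=2)
```

The returned `centers` are the cluster centers later used in step 4-1) to match new verification data to a data set.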
4. The method for optimizing and controlling the product quality of the gasoline catalytic cracking process as claimed in claim 2, wherein the step 2-2) of preprocessing the data comprises the following steps:
2-2-1) performing longitudinal processing on the samples $t_i$ of a single data set: calculating the arithmetic mean $\bar{t}$ of the samples of the single data set and the residual $v_i$ of each sample, and calculating the standard error $\sigma$ according to the Bessel formula:
Figure FDA0002749943430000024
Figure FDA0002749943430000025
Figure FDA0002749943430000026
if the residual $v_b$ of a measured value $t_b$ satisfies
Figure FDA0002749943430000027
then $t_b$ is regarded as a bad value containing a gross error, and the corresponding position in an all-zero marker matrix of the same shape is set to 1;
the abnormal occurrence rate $p_{ei}$ of each sample variable and the abnormal measurement rate $p_{xe}$ of each sample are then calculated from the marker matrix:
Figure FDA0002749943430000031
Figure FDA0002749943430000032
In formula (7), P denotes the number of variables $x_L$ in a sample that have abnormal sampled values;
a threshold is set on the sample abnormal measurement rate $p_{xe}$: samples exceeding the threshold are removed directly, and the abnormal data in samples that do not exceed the threshold are replaced by mean interpolation;
2-2-2) performing transverse processing on the variables $x_L$: using the Pearson correlation coefficient $\rho$ to measure the correlation between each variable $x_L$ ($L=1,2,\ldots,N$) and the key target variable y/s to be studied, calculated as follows:
Figure FDA0002749943430000033
Figure FDA0002749943430000034
thresholds are set simultaneously on the correlation coefficients $\rho_{xy}$, $\rho_{xs}$ and on the abnormal occurrence rate $p_{ei}$ of each variable, and variables with a high abnormal occurrence rate are eliminated by means of these thresholds;
2-2-3) normalizing the data processed in step 2-2-1) and step 2-2-2):
Figure FDA0002749943430000035
obtaining the preprocessed data sets $D_j\{(X_j,Y_j,S_j)\}$ ($j=1,2,\ldots,k$).
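The longitudinal cleaning and the normalization of this claim can be sketched for a single variable as follows. Because the formulas are images in this text, the sketch rests on labeled assumptions: the bad-value test is taken to be the common 3σ rule $|v_b|>3\sigma$ with $\sigma$ from the Bessel formula, and formula (10) is taken to be min-max scaling. The function names are hypothetical.

```python
import math

def flag_outliers(values):
    """Mark values whose residual exceeds 3 sigma (assumed bad-value criterion)."""
    n = len(values)
    mean = sum(values) / n
    residuals = [v - mean for v in values]
    # Bessel formula: divide by n - 1 rather than n
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 1))
    return [abs(r) > 3 * sigma for r in residuals]

def impute_mean(values, flags):
    """Replace flagged values by the mean of the unflagged ones (mean interpolation)."""
    good = [v for v, bad in zip(values, flags) if not bad]
    fill = sum(good) / len(good)
    return [fill if bad else v for v, bad in zip(values, flags)]

def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical readings of one variable: 19 normal values and one gross error.
readings = [10.0] * 19 + [100.0]
flags = flag_outliers(readings)
cleaned = impute_mean(readings, flags)
```

Note that with very few samples the 3σ rule cannot flag anything, so the sketch only makes sense for reasonably long series. The per-variable and per-sample abnormal rates would be obtained by counting the `True` entries of the flags along each axis of the marker matrix.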
5. The method for optimizing and controlling the product quality of the gasoline catalytic cracking process according to claim 1, wherein the specific steps of constructing the optimization control model in the step 3) are as follows:
3-1) using the N characteristic variables $X_{tr}$ and the octane number loss $Y_{tr}$ in the training set data, constructing a random-forest-based octane number loss prediction model f and recording the characteristic variable combination $x_{Nj}$ of the model; using the test set data $X_{te}$ and $Y_{te}$, calculating the loss function of the octane number loss prediction model, the root mean square error:
Figure FDA0002749943430000036
In formula (11), $y_i$ is the true value of the octane number loss of the i-th sample, and $\hat{y}_i$ is the corresponding model estimate;
3-2) calculating in turn the importance of the N characteristic variables in the training set data, where the characteristic variables $X_{tr}$ and the octane number loss $Y_{tr}$ of the training set obey a Gaussian distribution:
Figure FDA0002749943430000041
In formula (12), $\tau=(\tau_1,\tau_2,\ldots,\tau_N)^T$, $\tau_i=C(X_i,Y)$, and $C=[C(X_i,X_j)]$, i.e. the covariance matrix of $X_{tr}$;
the importance of the ith feature variable:
Figure FDA0002749943430000042
In formula (13), $\alpha_i=[C^{-1}\tau]_i$, and $V(\cdot)$ denotes the function that computes the variance of a variable;
3-3) sorting the importance $I(x_i)$ of the N characteristic variables, and deleting the c characteristic variables with the lowest importance to obtain a new combination of $N'$ characteristic variables, where $N'=N-c$;
3-4) returning the new characteristic variable combination $N'$ to step 3-1), and repeating steps 3-1) to 3-3) until the number of characteristic variables N is 0, so as to obtain Q octane number loss prediction models $F=(f_1,f_2,\ldots,f_Q)$ and Q characteristic variable combinations $v=(v_1,v_2,\ldots,v_Q)$;
3-5) comparing the root mean square errors of the Q prediction models, selecting the octane number loss prediction model with the smallest root mean square error as the octane number loss prediction model of the data set, and obtaining the characteristic variable combination of that model, in which the variables belonging to the material properties are denoted $v_{y1}$ and the variables belonging to the operating parameters are denoted $v_{y2}$;
3-6) repeating steps 3-1) to 3-5) for the k data sets, so as to obtain the octane number loss prediction models of the k data sets and their corresponding characteristic variable combinations;
3-7) using the training set data $D_{tr}=\{X_{tr},S_{tr}\}$ and the test set data $D_{te}=\{X_{te},S_{te}\}$, obtaining the product sulfur content prediction models $S_j$ of the K data sets according to the methods of steps 3-1) to 3-6), together with the corresponding variables $v_{s1}$ belonging to the feedstock properties and the variables $v_{s2}$ belonging to the operating parameters.
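The elimination loop of steps 3-1) to 3-5) can be sketched as follows. To stay self-contained, this sketch deliberately swaps in stand-ins that are not the claimed ones: a 1-nearest-neighbour regressor replaces the random forest, and the absolute Pearson correlation with the target replaces the importance measure of formula (13). Only the loop structure (fit, score by RMSE, drop the c least important variables, keep the subset with the smallest RMSE) follows the text; all names and the synthetic data are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def rmse(y_true, y_pred):
    """Root mean square error, the loss function used to compare candidate models."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def nn_predict(train_x, train_y, x, features):
    """Stand-in model (NOT a random forest): 1-nearest-neighbour on the chosen features."""
    nearest = min(range(len(train_x)),
                  key=lambda i: sum((train_x[i][f] - x[f]) ** 2 for f in features))
    return train_y[nearest]

def recursive_elimination(train_x, train_y, test_x, test_y, drop=1):
    """Return (best (rmse, features), history of all Q candidate models)."""
    features = list(range(len(train_x[0])))
    history = []
    while features:
        preds = [nn_predict(train_x, train_y, x, features) for x in test_x]
        history.append((rmse(test_y, preds), list(features)))
        # Stand-in importance: |Pearson correlation| between each variable and the target.
        importance = {f: abs(pearson([row[f] for row in train_x], train_y)) for f in features}
        features.sort(key=importance.get, reverse=True)
        features = features[: max(0, len(features) - drop)]  # delete the c least important
    return min(history, key=lambda h: h[0]), history

# Synthetic data: variable 0 drives the target, variables 1 and 2 are noise.
rng = random.Random(0)
rows = [[rng.random(), rng.random(), rng.random()] for _ in range(60)]
target = [2.0 * r[0] + rng.gauss(0.0, 0.05) for r in rows]
best, history = recursive_elimination(rows[:50], target[:50], rows[50:], target[50:])
```

Running the same loop with the sulfur content as the target corresponds to step 3-7), and repeating it per data set corresponds to step 3-6).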
6. The method for optimizing and controlling the product quality in the gasoline catalytic cracking process according to claim 1, wherein the step 4) of performing the product quality optimization control in the gasoline catalytic cracking process by using the octane number loss prediction model and the sulfur content prediction model comprises the following specific steps:
4-1) collecting the data Z of the gasoline to be catalytically cracked, and calculating the Euclidean distance $d_j$ ($j=1,2,\ldots,k$) between the verification data Z and the cluster center of each subset; according to the principle of minimum Euclidean distance from the verification data to the cluster center, matching the verification data Z to the corresponding data set of the optimization control model constructed in step 3), and obtaining the corresponding octane number loss prediction model $f(Z_1,Z_2)$ and sulfur content prediction model $S(Z_{S1},Z_{S2})$, where $Z_1$ denotes the characteristic variables belonging to the feedstock properties in the octane number loss prediction model, $Z_2$ denotes the characteristic variables belonging to the operating parameters in the octane number loss prediction model, $Z_{S1}$ denotes the characteristic variables belonging to the feedstock properties in the product sulfur content prediction model, and $Z_{S2}$ denotes the characteristic variables belonging to the operating parameters in the product sulfur content prediction model;
4-2) for the octane number loss prediction model $f(Z_1,Z_2)$ and the sulfur content prediction model $S(Z_{S1},Z_{S2})$ from step 4-1), keeping the characteristic variables $Z_1$ and $Z_{S1}$ belonging to the feedstock properties fixed, taking the characteristic variables $Z_2$ and $Z_{S2}$ belonging to the operating parameters as decision variables, and optimizing the octane number loss prediction model and the product sulfur content prediction model simultaneously, the objective function of this multi-objective optimization problem being:
Figure FDA0002749943430000051
In formula (14), each set $\Delta$ represents the adjustable range of the corresponding operating parameter, and a genetic algorithm is used to obtain the global optimal solutions
Figure FDA0002749943430000052
And
Figure FDA0002749943430000053
The decision variables for optimizing the quality of the gasoline catalytic cracking products are the union of the characteristic variables $Z_2$ belonging to the operating parameters in the octane number loss prediction model and the characteristic variables $Z_{S2}$ belonging to the operating parameters in the product sulfur content prediction model, i.e. the optimal solution of the decision variables is
Figure FDA0002749943430000054
4-3) adjusting the process parameters: according to the optimal solution $Z^*$ obtained in step 4-2), adjusting the process parameters of the gasoline catalytic cracking process, and performing gasoline catalytic cracking to obtain the final optimized product.
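The genetic-algorithm search of step 4-2) can be illustrated with the small sketch below. It is a sketch under stated assumptions rather than the claimed procedure: the two objectives are combined into one equally weighted sum (the claim does not say how the objectives are traded off), the quadratic functions `octane_loss` and `sulfur` are hypothetical stand-ins for the trained models $f$ and $S$ with the feedstock variables already fixed, and the population size, crossover, and mutation settings are arbitrary.

```python
import random

def genetic_minimize(objective, bounds, pop_size=30, generations=60,
                     mutation_rate=0.2, seed=0):
    """Minimize `objective` over the box `bounds` with a simple elitist GA."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        population.sort(key=objective)
        best_history.append(objective(population[0]))
        elite = population[: pop_size // 2]            # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [(x + y) / 2 for x, y in zip(a, b)]  # arithmetic crossover
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:         # Gaussian mutation, clipped to range
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = max(lo, min(hi, child[i] + step))
            children.append(child)
        population = elite + children
    return min(population, key=objective), best_history

# Hypothetical stand-ins for f and S as functions of two operating parameters.
def octane_loss(z):
    return (z[0] - 0.3) ** 2 + (z[1] - 0.7) ** 2 + 1.0

def sulfur(z):
    return (z[0] - 0.5) ** 2 + 3.0

def combined(z):
    return octane_loss(z) + sulfur(z)  # assumption: equal-weight scalarization

bounds = [(0.0, 1.0), (0.0, 1.0)]      # adjustable ranges (the sets Delta), illustrative
z_best, best_history = genetic_minimize(combined, bounds)
```

Because the best half of each generation is carried over unchanged, the best objective value never gets worse from one generation to the next. A Pareto-based multi-objective GA would be an alternative to the weighted sum used here.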
CN202011180154.5A 2020-10-29 2020-10-29 Product quality optimization control method in gasoline catalytic cracking process Pending CN112420132A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011180154.5A CN112420132A (en) 2020-10-29 2020-10-29 Product quality optimization control method in gasoline catalytic cracking process

Publications (1)

Publication Number Publication Date
CN112420132A true CN112420132A (en) 2021-02-26

Family

ID=74841488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011180154.5A Pending CN112420132A (en) 2020-10-29 2020-10-29 Product quality optimization control method in gasoline catalytic cracking process

Country Status (1)

Country Link
CN (1) CN112420132A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705020A (en) * 2021-09-14 2021-11-26 西南石油大学 Method for calculating octane number loss in gasoline catalytic cracking process

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination