CN104915522B

CN104915522B - Hybrid modeling method and system combining process prior knowledge and a data-driven model

Info

Publication number: CN104915522B
Application number: CN201510376700.5A
Authority: CN
Inventors: 李绍军; 成祥; 杨一航; 许文夕; 郑文静
Original assignee: East China University of Science and Technology
Current assignee: East China University of Science and Technology
Priority date: 2015-07-01
Filing date: 2015-07-01
Publication date: 2019-06-25
Anticipated expiration: 2035-07-01
Also published as: CN104915522A

Abstract

The present invention discloses a hybrid modeling method and system combining process prior knowledge with a data-driven model. The method comprises: selecting a data-driven model and a suitable model structure from known data-driven models, establishing the mathematical expression of the corresponding model, and arranging all model parameters in a fixed order; checking the model against the process prior knowledge to obtain a constraint equation that measures the degree to which the model violates the prior knowledge; comparing the model outputs with the observed values of the samples to establish an optimization objective equation that measures how well the model fits the training samples; combining the constraint equation and the optimization objective equation into a constrained optimization problem and solving for the optimal parameters with a constraint-handling intelligent algorithm; and substituting the optimal parameter solution back into the original model as the parameter solution of the model of step S1, for use in model prediction or model optimization. When a neural network is trained on a small number of data samples, the invention yields a model that conforms better to prior knowledge and avoids over-fitting.

Description

Hybrid modeling method and system combining process prior knowledge and a data-driven model

Technical field

The invention belongs to the field of chemical process modeling and relates to a process modeling method, in particular to a hybrid modeling method combining process prior knowledge with a data-driven model. The invention also relates to a hybrid modeling system combining process prior knowledge with a data-driven model.

Background art

Traditional mechanism-based modeling requires a sufficient understanding of the system being modeled and builds an accurate descriptive model from mass and energy balance equations, reaction kinetics equations, and similar formulas. Actual industrial processes, however, are highly complex and involve many reaction mechanisms that are difficult to characterize accurately, so mechanism-based models often fail to reach the accuracy required of a process system model. Data-driven methods have developed rapidly and are widely used in chemical process modeling and optimization. Their advantage is that they rely only on historical data or real-time field data and need no specific mechanism information. For complex nonlinear processes, data-driven modeling often performs well when sufficient input-output samples are available. Data-driven modeling also has obvious shortcomings, however: the mechanism is expressed only vaguely, extrapolation is poor, and over-fitting is common in small-sample modeling.

Current methods for avoiding over-fitting mainly exploit information in the samples themselves (for example external validation, bootstrap resampling, and sample expansion by noise injection) to limit over-fitting to some degree, but studies show that their effect on small-sample modeling is limited. The process object itself carries a great deal of mechanism information that is difficult to apply, such as first-order information (which shows up as monotonicity), second-order information (which shows up as concavity or convexity), and output-limit information, much of which can be obtained by analyzing the process. Some scholars have also proposed data-driven modeling methods that incorporate the process mechanism (for example surrogate model methods, neural network models with mechanism-based output gain checking, and modeling methods that combine multilayer networks with monotonic neural networks), but these integrated methods still have defects: the model structure is too simple, or the restrictions placed on the modeled object are too demanding.

Traditional modeling methods that combine mechanism information with sample information have many shortcomings in overcoming over-fitting and tend to discard part of the useful information. There is therefore an urgent need for a new modeling method and model that makes combined use of mechanism information and sample information and overcomes the over-fitting problem of small-sample modeling.

Summary of the invention

The technical problem to be solved by the invention is to provide a hybrid modeling method combining process prior knowledge with a data-driven model, which yields a model that conforms better to prior knowledge when a neural network is trained on a small number of data samples, and which avoids over-fitting.

In addition, the invention provides a hybrid modeling system combining process prior knowledge with a data-driven model, which can train a data-driven model that conforms better to prior knowledge when only a small number of data samples are available, avoiding over-fitting and improving model robustness.

To solve the above technical problems, the invention adopts the following technical solution:

A hybrid modeling method combining process prior knowledge with a data-driven model, the method comprising the following steps:

Step S1: select a suitable data-driven model and a suitable model structure from known data-driven models, establish the mathematical expression of the corresponding model, and arrange all model parameters in a fixed order;

Step S2: use a fixed-step scanning method to check the model against the process prior knowledge (mainly output response information, first-order information of the output response (monotonicity), and second-order information (concavity/convexity)), and obtain a constraint equation that measures the degree to which the model violates the process prior knowledge;

Step S3: compare the model outputs with the observed values of the samples, and use the mean squared error formula together with a regularization method to establish an optimization objective equation that measures how well the model fits the training samples;

Step S4: combine the constraint equation and the optimization objective equation into a constrained optimization problem, and solve for the optimal parameters with a constraint-handling intelligent algorithm;

Step S5: substitute the optimal parameter solution back into the original model as the parameter solution of the model of step S1, for use in model prediction or model optimization.

As a preferred solution of the invention, the model expression and parameter arrangement of step S1 are as follows: a data-driven model is selected, typically a BP neural network model, a response surface model, a support vector machine, or the like.

For the BP neural network model, the input-output relation of the model is given by formula (1), where:

par is the model parameter arrangement, a fixed ordering of all weights and thresholds in the BP network model, including all w_ij, T_li, θ_j, and θ_l,

X is the input vector, which can also be written (x₁, x₂, …, x_L),

w_ij is the weight coefficient from the input layer to the hidden layer,

θ_j is the threshold coefficient of each hidden-layer neuron,

T_li is the weight coefficient from the hidden layer to the output layer,

θ_l is the threshold coefficient of the output-layer neuron,

L is the number of input-layer neurons, determined by the input variables of the actual system,

H is the number of hidden-layer neurons, determined by the modeler from experience, by trial and error, or by a similar heuristic,

B is the number of output-layer neurons; since the object studied is a multiple-input single-output object, the number of output-layer neurons is 1,

f_i-h is the hidden-layer activation function. Many activation functions exist: the simple linear function (purelin), sigmoid functions (tansig and logsig), the sine function, the hyperbolic tangent and arctangent, adaptive polynomial functions, and so on. The three most common are tansig, logsig, and purelin; the hidden layer generally uses tansig or logsig,

f_h-o is the output-layer activation function, generally purelin;
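The formula image for (1) is not reproduced in this text. As a minimal sketch consistent with the symbol definitions above, a single-output BP network can be evaluated from a flat parameter vector as shown below. The ordering of par, the addition (rather than subtraction) of the thresholds, and the function names `unpack` and `bp_predict` are assumptions made for illustration; the source only states that the ordering is fixed.

```python
import numpy as np

def unpack(par, L, H):
    """Split the flat vector par into (W, theta_h, T, theta_o).

    Assumed ordering (hypothetical): input-to-hidden weights, hidden
    thresholds, hidden-to-output weights, output threshold.
    """
    par = np.asarray(par, dtype=float)
    assert par.size == L * H + H + H + 1
    W = par[:L * H].reshape(H, L)          # w_ij, one row per hidden neuron
    theta_h = par[L * H:L * H + H]         # hidden-layer thresholds
    T = par[L * H + H:L * H + 2 * H]       # hidden-to-output weights
    theta_o = par[-1]                      # output-layer threshold
    return W, theta_h, T, theta_o

def bp_predict(x, par, L, H, hidden_act=np.tanh):
    """Forward pass with a tansig (tanh) hidden layer and a purelin output."""
    W, theta_h, T, theta_o = unpack(par, L, H)
    hidden = hidden_act(W @ np.asarray(x, dtype=float) + theta_h)
    return float(T @ hidden + theta_o)     # purelin: identity output
```

Keeping all weights and thresholds in one flat vector is what lets a generic optimizer treat the network as a function of par alone.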

Similarly, for the response surface model, the input-output relation of the model is given by formula (2), where:

par is the model parameter combination, comprising b₀, b_i, b_jk, and b_l,

b₀ is the constant term,

b_i is the coefficient of the first-order term of input x_i,

b_jk is the cross-term coefficient of inputs x_j and x_k,

b_l is the coefficient of the second-order term of input x_l,

x_i is the i-th input variable, with L dimensions in total;
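The formula image for (2) is likewise missing from this text. The coefficient definitions above match a standard full quadratic response surface, which can be sketched as follows. The function name `rsm_predict`, the argument layout, and the ordering of cross terms (pairs j < k in lexicographic order) are assumptions for illustration.

```python
import numpy as np
from itertools import combinations

def rsm_predict(x, b0, b_lin, b_int, b_sq):
    """Quadratic response surface:
    b0 + sum_i b_i x_i + sum_{j<k} b_jk x_j x_k + sum_l b_l x_l^2
    """
    x = np.asarray(x, dtype=float)
    pairs = list(combinations(range(x.size), 2))   # (j, k) with j < k
    y = b0 + np.dot(b_lin, x) + np.dot(b_sq, x ** 2)
    y += sum(b * x[j] * x[k] for b, (j, k) in zip(b_int, pairs))
    return float(y)
```

For L inputs this model has 1 + L + L(L-1)/2 + L coefficients, all of which would be placed in par.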

As a preferred solution of the invention, step S2 obtains the constraint equation of the model through the following 3 sub-steps:

Step S2.1, sampling of check points: the whole input space is divided with a fixed step Δ_ij into multiple subspaces, and samples are taken in each subspace for repeated checks of the output-range prior. Likewise, the fixed step Δ_ij is used to divide each interval on which a monotonicity prior or a concavity/convexity prior is known into smaller subintervals;

Step S2.2, prior check at the sampled points: the output range over the whole input space of the object is checked by comparing the model output at each sampled point with the output bounds. Given a known output upper limit y_H and lower limit y_L, the output-range prior is satisfied if and only if the model output lies within the range from y_L to y_H; otherwise the output-range prior is violated in that small region. If all sampled points pass the output-range check, the model is approximately considered not to violate the output-range prior. The monotonicity check within a subinterval only requires repeatedly fixing the values of the other input variables, evaluating the model output at the two endpoints of the subinterval, and comparing the two values; if the comparison agrees with the monotonicity information, the subinterval is approximately considered to satisfy it. For example, on a monotonically increasing interval the output must increase from the lower endpoint to the upper endpoint, and on a monotonically decreasing interval it must decrease. If all subintervals pass the monotonicity check, the model is approximately considered not to violate this monotonicity prior. Similarly, the concavity or convexity of a subinterval can be checked approximately by comparing the output values at its two endpoints and at its midpoint: on a strictly convex interval, and likewise on a strictly concave interval, the three values must satisfy the corresponding strict inequality. The result of each individual check of each prior can be expressed with a sign function, as in formulas (3), (4), and (5): formula (3) checks the output range, formula (4) checks the monotonicity prior, and formula (5) checks the concavity/convexity prior;

where h(i, j, m) is the result of the m-th check of the j-th known prior in the i-th dimension;

If every h(i, j, m) equals 1, the model is approximately considered to comply fully with the corresponding prior; if h(i, j, m) equals −1, the model necessarily violates the corresponding prior knowledge in the corresponding subinterval;
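Formulas (3) to (5) are not reproduced in this text, so the following is only a sketch of how the three sign-valued checks and a fixed-step scan along one input dimension might look. The function names are hypothetical, the monotonicity test is taken as non-strict, and "convex" is used in the usual English sense (midpoint output below the chord); the source's own inequality directions may differ.

```python
import numpy as np

def check_range(y, y_low, y_high):
    """+1 if the output lies inside [y_low, y_high], otherwise -1."""
    return 1 if y_low <= y <= y_high else -1

def check_monotone(y_left, y_right, increasing=True):
    """+1 if the endpoint outputs agree with the monotonicity prior."""
    ok = y_right >= y_left if increasing else y_right <= y_left
    return 1 if ok else -1

def check_convex(y_left, y_mid, y_right, convex=True):
    """+1 if the midpoint output lies on the expected side of the chord."""
    chord_mid = 0.5 * (y_left + y_right)
    ok = y_mid < chord_mid if convex else y_mid > chord_mid
    return 1 if ok else -1

def scan_dimension(model, x_base, dim, lo, hi, step, increasing=True):
    """Monotonicity checks on fixed-step subintervals of [lo, hi] along
    input dimension dim, with the other inputs held at x_base."""
    edges = np.arange(lo, hi + 0.5 * step, step)
    results = []
    for a, b in zip(edges[:-1], edges[1:]):
        xa = np.array(x_base, dtype=float)
        xb = np.array(x_base, dtype=float)
        xa[dim], xb[dim] = a, b
        results.append(check_monotone(model(xa), model(xb), increasing))
    return results
```

For instance, scanning the toy model `lambda x: x[0] ** 2` as "increasing" passes every subinterval of [0, 1] and fails every subinterval of [-1, 0].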

Step S2.3, violation statistics: the proportion viol of checks that returned −1 among all scan checks characterizes the degree to which the model violates the prior mechanism, and is computed by formula (6).

If the system model complies fully with all process priors, viol is 0. A smaller viol means the model violates the known process priors only slightly; the closer viol is to 1, the greater the degree of violation.
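The ratio described in step S2.3 can be sketched in a few lines. The function name `violation_degree` and the flat list of check results are assumptions for illustration; the source indexes the results as h(i, j, m).

```python
def violation_degree(h_values):
    """Fraction of scan checks that returned -1 (0 means no violation found)."""
    h = list(h_values)
    if not h:
        raise ValueError("no checks were performed")
    return sum(1 for v in h if v == -1) / len(h)
```

Because viol is a ratio of counts, it is piecewise constant in the model parameters, which is one reason a derivative-free solver is a natural fit for the constrained problem.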

As a preferred solution of the invention, step S3 obtains the optimization objective function F(par) of the model through the following 2 sub-steps:

Step S3.1: construct the mean squared error term E_D between the output values of the data-driven model and the measured values, see formula (7), where:

the model output for the k-th sample is computed from the model relation: formula (1) if the BP network is selected, formula (2) if the response surface model is selected, and the corresponding model relation if another data-driven modeling method is selected,

y_l,k is the observed value of the k-th sample, taken from the sampled output values of the training samples,

X is the combination of input variables, which can be written (x₁, x₂, …, x_L), where L is the dimension of the input variables;

Step S3.2: add a regularization term to the original mean squared error term, with a suitable regularization coefficient, to construct the optimization objective function (formula (8)). Adding a regularization term to the objective function gives the model a smoother output response curve, so that it fits the training samples well while also having stronger robustness and predictive ability:

F(par) = E_D + ηE_w (8)

E_w is the regularization term, computed as the sum of squares of the unknown parameters of the model (for example, those of the BP network model or of the response surface model),

η is the regularization coefficient, which weighs the smoothness of the model and is generally set to 0.001,

F(par) is the training objective function of the unknown model parameters;
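Formula (8) can be sketched as below. Formula (7) is not reproduced in this text, so taking E_D as the plain mean of squared errors (rather than, say, a sum or a half-mean) is an assumption, as are the function name `objective` and the `predict(x, par)` callback signature.

```python
import numpy as np

def objective(par, predict, X, y, eta=0.001):
    """F(par) = E_D + eta * E_w.

    E_D: mean squared error between model outputs and observations
         (assumed form of formula (7)).
    E_w: sum of squares of all unknown model parameters.
    """
    par = np.asarray(par, dtype=float)
    y_hat = np.array([predict(x, par) for x in X])
    e_d = np.mean((y_hat - np.asarray(y, dtype=float)) ** 2)
    e_w = np.sum(par ** 2)
    return float(e_d + eta * e_w)
```

With η = 0.001 the penalty stays small next to the fit term unless the parameters grow large, which is what discourages sharply oscillating response curves.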

As a preferred solution of the invention, step S4 specifically comprises: combining the constructed constraint function and objective function into a constrained optimization problem, formula (9). This problem is passed to an intelligent optimization algorithm capable of handling constraints, which solves for the optimum of the objective function subject to the prior constraint; the optimal solution is the optimal combination of model parameters:

min F(par) = E_D + ηE_w

s.t. viol = 0 (9)

Deterministic methods for constrained optimization (the feasible direction method, for example) easily become trapped in local optima, so an intelligent algorithm with constraint handling is used to treat the equality constraint, for example the μAEA algorithm with an adaptive constraint-relaxation handling method, which handles equality constraints well;
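The μAEA algorithm is not specified in this text, so the sketch below deliberately swaps in a much simpler stand-in: a mutation-only random search that compares candidates with a feasibility rule (a feasible point beats an infeasible one, lower violation wins among infeasible points, lower F wins among feasible points). It only illustrates how problem (9) can be handed to a derivative-free solver; all names and default values are hypothetical.

```python
import numpy as np

def better(f_new, v_new, f_old, v_old, tol=0.0):
    """Feasibility rule for comparing two candidates (v is the viol value)."""
    new_ok, old_ok = v_new <= tol, v_old <= tol
    if new_ok and old_ok:
        return f_new < f_old          # both feasible: lower objective wins
    if new_ok != old_ok:
        return new_ok                 # feasible beats infeasible
    return v_new < v_old              # both infeasible: lower violation wins

def solve(objective, viol, x0, iters=2000, sigma=0.3, seed=0):
    """Random-mutation search for min F(par) s.t. viol(par) = 0."""
    rng = np.random.default_rng(seed)
    best = np.asarray(x0, dtype=float)
    f_best, v_best = objective(best), viol(best)
    for _ in range(iters):
        cand = best + sigma * rng.normal(size=best.shape)
        f_c, v_c = objective(cand), viol(cand)
        if better(f_c, v_c, f_best, v_best):
            best, f_best, v_best = cand, f_c, v_c
    return best, f_best, v_best
```

In a real application `objective` would be F(par) and `viol` the scan-based violation ratio; a population-based method with constraint relaxation, as the source describes, would be expected to explore the parameter space far better than this single-point search.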

As a preferred solution of the invention, step S5 specifically comprises: substituting the solved parameter combination par into the data-driven model selected in S1 to form a model with determined parameters, using formula (1) if the BP network model was selected and formula (2) if the response surface model was selected. With the parameters determined, new samples can be fed to the model to verify its predictive ability;


A hybrid modeling system combining process prior knowledge with a data-driven model, the system comprising:

a training sample set and prior knowledge acquisition module, which supplies the training samples and prior knowledge used in constructing the objective function during modeling;

a data-driven model selection module, which constructs the mathematical function expression: if the BP network model is selected, suitable activation functions and a suitable number of hidden-layer neurons are chosen to determine the BP network structure; if the response surface model is selected, the expression is assembled according to formula (2);

a constraint function construction module, which counts the frequency of prior violations with the fixed-step checking method and constructs the constraint function used as the constraint of the constrained optimization problem;

an optimization objective function construction module, which uses the training samples to construct a training objective function combining the mean squared error term and the regularization term, used as the objective of the constrained optimization problem;

a constrained optimization problem solving module, which solves for the optimal parameters of the constrained optimization problem with an intelligent optimization algorithm capable of handling constraints;

a parameter back-substitution module, which substitutes the solved high-dimensional parameter vector back into the model according to the parameter arrangement used when the model was constructed, yielding a model with determined structure and parameters. This ends the modeling process; the model can then be used to predict outputs for new samples or to optimize the inputs of the object.

The beneficial effects of the invention are as follows: the proposed hybrid modeling method and system combining process prior knowledge with a data-driven model can obtain a data-driven model that conforms better to prior knowledge when only a small number of data samples are available, and avoids over-fitting.

The invention takes a new perspective, combining process prior information with sample data information to build a hybrid model, which achieves accurate modeling of the object and overcomes over-fitting. While preserving the validity of the model, the method conforms better to the process prior knowledge than traditional data-driven modeling methods (such as the BP neural network model and the response surface model), and has good predictive ability and robustness.

The invention introduces process prior knowledge to achieve more accurate data-driven modeling. Hybrid modeling is a modeling approach that has emerged in recent years and is widely used for prediction and optimization of chemical processes, but how to fuse the two sources of information has remained a major difficulty. Because a data-driven modeling method takes fitting the samples as its goal, it fits samples well and has a fixed input-output relation, from which a parameter optimization function can be constructed. Process mechanism information, meanwhile, guides and discriminates the model response and can be used to construct model constraints. With the objective function and constraint function so constructed, a well-performing constraint-handling intelligent algorithm solves for the model parameters. The invention not only avoids the over-fitting problem of small-sample modeling but also ensures that the model is correct with respect to the mechanism information, and to a certain extent ensures good robustness.

Brief description of the drawings

Fig. 1 compares the model predictions with the actual observed values of the average sol radius for the nanofiltration membrane samples;

Fig. 2 shows the response of the acetylene outlet concentration to the hydrogen feed rate of the first-stage hydrogenation reactor and to the inlet-outlet temperature difference of the first-stage reactor;

Fig. 3 is a flow chart of the modeling method of the invention.

Detailed description of the embodiments

Preferred embodiments of the invention are described in detail below with reference to the drawings.

Embodiment one

Referring to Fig. 3, the invention proposes a hybrid modeling method combining process prior knowledge with a data-driven model, with the following specific steps:

[Step S1] Select a suitable data-driven model and a suitable model structure from known data-driven models, establish the mathematical expression of the corresponding model, and arrange all model parameters in a fixed order.

Model expression and parameter arrangement par of step S1: a data-driven model is selected, typically a BP neural network model, a response surface model, a support vector machine, or the like.

For the BP neural network model, the input-output relation of the model is given by formula (1), where:

X is the input vector, which can also be written (x₁, x₂, …, x_L),

w_ij is the weight coefficient from the input layer to the hidden layer,

θ_j is the threshold coefficient of each hidden-layer neuron,

T_li is the weight coefficient from the hidden layer to the output layer,

θ_l is the threshold coefficient of the output-layer neuron,

L is the number of input-layer neurons, determined by the input variables of the actual system,

f_h-o is the output-layer activation function, generally purelin;

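For the response surface model, the input-output relation is given by formula (2), where:

par is the model parameter combination, comprising b₀, b_i, b_jk, and b_l,

b₀ is the constant term,

b_i is the coefficient of the first-order term of input x_i,

b_jk is the cross-term coefficient of inputs x_j and x_k,

b_l is the coefficient of the second-order term of input x_l,

x_i is the i-th input variable, with L dimensions in total;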

[Step S2] Use a fixed-step scanning method to check the model against the process prior knowledge (mainly output response information, first-order information of the output response (monotonicity), and second-order information (concavity/convexity)), and obtain a constraint equation that measures the degree to which the model violates the process prior knowledge;

Step S2 obtains the constraint equation of the model through the following 3 sub-steps:

Step S2.2, prior check at the sampled points: the output range over the whole input space of the object is checked by comparing the model output at each sampled point with the output bounds. Given a known output upper limit y_H and lower limit y_L, the output-range prior is satisfied if and only if the model output lies within the range from y_L to y_H; otherwise the output-range prior is violated in that small region. If all sampled points pass the output-range check, the model is approximately considered not to violate the output-range prior. The monotonicity check within a subinterval only requires repeatedly fixing the values of the other input variables, evaluating the model output at the two endpoints of the subinterval, and comparing the two values; if the comparison agrees with the monotonicity information, the subinterval is approximately considered to satisfy it. For example, on a monotonically increasing interval the output must increase from the lower endpoint to the upper endpoint, and on a monotonically decreasing interval it must decrease. If all subintervals pass the monotonicity check, the model is approximately considered not to violate this monotonicity prior. Similarly, the concavity or convexity of a subinterval can be checked approximately by comparing the output values at its two endpoints and at its midpoint: on a strictly convex interval, and likewise on a strictly concave interval, the three values must satisfy the corresponding strict inequality. The result of each individual check of each prior can be expressed with a sign function, as in formulas (3), (4), and (5): formula (3) checks the output range, formula (4) checks the monotonicity prior, and formula (5) checks the concavity/convexity prior;

[Step S3] Compare the model outputs with the observed values of the samples, and use the mean squared error formula together with a regularization method to establish an optimization objective equation that measures how well the model fits the training samples;

Step S3 obtains the optimization objective function F(par) of the model through the following 2 sub-steps:

F(par) = E_D + ηE_w (8)

F(par) is the training objective function of the unknown model parameters;

[Step S4] Combine the constraint equation and the optimization objective equation into a constrained optimization problem, and solve for the optimal parameters with a constraint-handling intelligent algorithm;

Step S4 specifically comprises: combining the constructed constraint function and objective function into a constrained optimization problem, formula (9). This problem is passed to an intelligent optimization algorithm capable of handling constraints, which solves for the optimum of the objective function subject to the prior constraint; the optimal solution is the optimal combination of model parameters:

min F(par) = E_D + ηE_w

s.t. viol = 0 (9)

[Step S5] Substitute the optimal parameter solution back into the original model as the parameter solution of the model of step S1, for use in model prediction or model optimization.

Step S5 specifically comprises: substituting the solved parameter combination par into the data-driven model selected in S1 to form a model with determined parameters, using formula (1) if the BP network model was selected and formula (2) if the response surface model was selected. With the parameters determined, new samples can be fed to the model to verify its predictive ability;

Embodiment two

The following description of the embodiment is intended to aid understanding of the invention and does not limit its content. Referring to Fig. 2, this embodiment estimates the average sol particle size of a nanofiltration membrane from the hydrolysis temperature, the amount of glycerol added, and the amount of complexing agent added; the influence of the input variables on the sol particle size is shown in Table 1. The nanofiltration membrane sol preparation process studied in this embodiment is a steady-state process, and the process data consist of 46 groups of samples obtained by experiment (the acquisition cycle for sol particle size stability data is long, about one month, so a large number of samples cannot be obtained). The other variables are fixed as follows: precursor molar ratio Zr:Ti = 4, precursor-to-water molar ratio 1:555, sol concentration 0.1 mol/L, and complexation time 1 h. The ranges of the experimental input variables are: hydrolysis temperature 50 to 90 °C, glycerol-to-precursor molar ratio 0 to 1.2, and complexing agent molar ratio 3 to 8. The 46 groups of samples are divided into 2 parts: 35 groups are randomly selected as training samples and 11 groups are used as test samples. The BP neural network model is selected; trial and error shows that a 3-5-1 network structure is appropriate, and the relation is established according to formula (1).
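The random division of the 46 samples into 35 training and 11 test groups could be reproduced along the following lines. This is a generic sketch: the function name, the seed, and the use of a random permutation are assumptions, since the source does not say how the random selection was made.

```python
import numpy as np

def split_indices(n_samples=46, n_train=35, seed=0):
    """Randomly split sample indices into a training set and a test set."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])
```

Fixing the seed makes the split repeatable, which matters when mean squared errors from different modeling runs are compared.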

Table 1: Influence of hydrolysis temperature, glycerol addition, and complexing agent addition on the average sol radius of the nanofiltration membrane

(1) According to formula (1), construct the BP network model with the 3-5-1 structure, which contains 20 weights and 6 thresholds, so the final optimization objective function is 26-dimensional;
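The parameter counts quoted for the network structures follow directly from the layer sizes, as the short sketch below shows. The function name `bp_param_count` is hypothetical.

```python
def bp_param_count(n_in, n_hidden, n_out=1):
    """Return (weights, thresholds, total) for a single-hidden-layer BP network."""
    weights = n_in * n_hidden + n_hidden * n_out
    thresholds = n_hidden + n_out
    return weights, thresholds, weights + thresholds

print(bp_param_count(3, 5))  # 3-5-1 structure
```

For the 3-5-1 structure this gives 20 weights, 6 thresholds, and 26 parameters in total; the same rule gives 16, 5, and 21 for a 3-4-1 structure.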

(2) According to the prior information, select a suitable step size and use the fixed-step method to check the prior knowledge of Table 1; the percentage of checks that detect a violation is taken as the constraint function viol;

(3) From the BP network model and the observed values of the training samples, construct the mean squared error term E_D;

(4) Add the regularization term E_w to the constructed mean squared error term; the regularization coefficient η is chosen by trial as 0.001. This gives the optimization objective function F(par), which contains 26 unknown parameters;

(5) Combine the constructed constraint function and objective function into a constrained optimization problem and pass it to a constraint-handling intelligent algorithm (this example uses the μAEA algorithm) to solve for the optimum of the objective function subject to the prior constraint;

(6) Substitute the solved parameter combination par into the model to form the determined BP network model;

(7) Feeding the test samples to the model verifies its predictive ability; the true sample values and the model predictions are shown in Fig. 1, where the first 35 groups are training samples and the last 11 groups are test samples;

(8) The mean squared error of the training sample set is 0.1085, and that of the test sample set is 0.1899.

The results show that the model for estimating the average sol particle size of the nanofiltration membrane, built with the hybrid modeling method of the invention, predicts well, is valid, and can be used to estimate the particle size of other samples. The proposed method effectively avoids the over-fitting problem of small-sample modeling. Notably, because process prior knowledge is introduced into the BP network, the method not only expresses the prior knowledge without error but also shows good system robustness.

Embodiment three

The following description of the embodiment is intended to aid understanding of the invention and does not limit its content. The object analyzed in this example is the first-stage reactor of the acetylene hydrogenation reactor in an ethylene production unit (the colored circle in Fig. 3). Three easily measured variables, the C2 fraction feed rate, the hydrogen feed rate, and the inlet-outlet temperature difference of the first-stage reactor, are selected as the soft-sensor inputs (independent variables), and a soft-sensor model is built to estimate the acetylene outlet concentration of the first-stage reactor.

The process data for the first-stage reactor acetylene soft sensor studied in this embodiment consist of 40 groups of samples obtained by experiment. The 40 groups are divided into 2 parts: 30 groups are randomly selected as training samples and 10 groups are used as test samples. The BP neural network model is selected; trial and error shows that a 3-4-1 network structure is appropriate, and the relation is established according to formula (1).

According to the reaction mechanism, the prior knowledge of this embodiment is shown in the following table:

Table 2: Influence of the C2 fraction feed rate, the hydrogen feed rate, and the first-stage reactor inlet-outlet temperature difference on the acetylene outlet concentration

(1) According to formula (1), the BP network model with the "3-4-1" structure is constructed; it contains 16 weights and 5 thresholds, so the final optimization objective function is 21-dimensional;

(2) According to the prior information, a suitable step size is selected and, using a fixed step, the prior knowledge in Table 2 is checked; the percentage of checks that are violated is taken as the constraint function viol;

(4) A regularization term E_w is added to the constructed mean square error term; the regularization coefficient η, chosen by trial, is set to 0.001. This constitutes the optimization objective function F(par), which contains 21 unknown parameters;

(7) Substituting the test samples as new samples verifies the predictive ability of the model: the mean square error on the training sample set is 7.173×10^-4, and the mean square error on the test sample set is 5.253×10^-4. In addition, response plots of the acetylene outlet concentration against the hydrogen feed rate and the first-stage reactor inlet-outlet temperature difference are drawn for low, medium, and high values of the C2 fraction feed rate, as shown in Fig. 2.
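The parameter count of step (1) can be checked with a short sketch of a 3-4-1 network evaluated from a flat 21-element vector par. Formula (1) is not reproduced in this section, so the ordering of par, the subtraction of thresholds, and the use of `np.tanh` for tansig are assumptions of this sketch, and the names `unpack` and `predict` are hypothetical.

```python
import numpy as np

N_IN, N_HID, N_OUT = 3, 4, 1
N_PAR = N_IN * N_HID + N_HID * N_OUT + N_HID + N_OUT  # 12 + 4 + 4 + 1 = 21

def unpack(par):
    """Split the flat vector par into weights and thresholds (assumed order)."""
    par = np.asarray(par, dtype=float)
    assert par.size == N_PAR
    w = par[:12].reshape(N_HID, N_IN)        # input -> hidden weights
    T = par[12:16].reshape(N_OUT, N_HID)     # hidden -> output weights
    theta_h = par[16:20]                     # hidden-layer thresholds
    theta_o = par[20:21]                     # output-layer threshold
    return w, T, theta_h, theta_o

def predict(par, X):
    """Forward pass: tanh (tansig) hidden layer, linear (purelin) output."""
    w, T, theta_h, theta_o = unpack(par)
    X = np.atleast_2d(X)
    hidden = np.tanh(X @ w.T - theta_h)      # thresholds subtracted (assumption)
    return (hidden @ T.T - theta_o).ravel()

par = np.zeros(N_PAR)
print(N_PAR, predict(par, [[0.1, 0.2, 0.3]]))
```

Keeping all weights and thresholds in one flat vector matches the idea of a fixed parameter arrangement, which lets a general-purpose optimizer treat the network as a 21-dimensional search problem.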

The results show that the hybrid modeling method combining process prior knowledge and a data-driven model of the present invention gives good predictions when applied to soft sensing of the acetylene concentration in the first-stage reactor; the method is effective and can be used to predict the acetylene outlet concentration under other inputs. The proposed method effectively avoids the overfitting problem of small-sample modeling. Notably, because process prior knowledge is introduced into the BP network, the resulting model not only shows no errors in expressing the prior knowledge but also exhibits good robustness (Fig. 2).

Example IV

The present invention also discloses a hybrid modeling system combining process prior knowledge and a data-driven model. The system comprises: a training sample set and prior knowledge acquisition module, a data-driven model selection module, a constraint function construction module, an optimization objective function construction module, a constrained optimization problem solving module, and a parameter back-substitution module.

For the specific implementation of each module, see the implementation of the corresponding step in Embodiment 1.

In summary, the hybrid modeling method and system combining process prior knowledge and a data-driven model proposed by the present invention overcome the overfitting problem of traditional data-driven modeling, so that the final model is both consistent with the process prior knowledge and robust, enabling more accurate modeling of objects for which some mechanistic information is known.

The present invention takes a new perspective: it builds a hybrid model from process prior information together with sample data, achieving accurate modeling of the object and overcoming the overfitting problem. While preserving model validity, the method conforms to the process prior knowledge better than traditional data-driven modeling methods (such as BP neural network modeling, the response surface method, and support vector machines) and has good predictive ability and robustness.

The present invention introduces process prior knowledge to achieve more accurate data-driven modeling. Hybrid modeling is a modeling approach that has emerged in recent years and is widely used in the prediction and optimization of chemical processes, but how to fuse the two sources of information has remained a major difficulty. A data-driven model can fit samples and has a fixed input-output relation, so it can be used to construct a parameter optimization function; at the same time, process mechanism information provides knowledge for judging the model response, so it can be used to construct model constraints. With the constructed optimization objective function and constraint function, a well-performing constraint-handling evolutionary algorithm solves for the model parameters. This approach not only avoids the overfitting problem of small-sample modeling but also ensures that the model is correct with respect to the mechanistic information and, to a certain extent, that it is robust.

The description and application of the invention herein are illustrative and are not intended to limit the scope of the invention to the above embodiments. Variations and modifications of the embodiments disclosed herein are possible, and alternatives to and equivalents of the various components of the embodiments are known to those of ordinary skill in the art. Those skilled in the art should understand that the invention may be realized in other forms, structures, arrangements, and proportions, and with other components, materials, and parts, without departing from its spirit or essential characteristics. Other variations and modifications of the embodiments disclosed herein may be made without departing from the scope and spirit of the invention.

Claims

1. A soft sensing method for key indicators of a chemical process combining process prior knowledge and a data-driven model, characterized in that process prior knowledge and mechanistic information are used to improve the accuracy and precision of the data-driven model, improve the predictive ability and robustness of the model, and avoid overfitting; the mechanistic information mainly includes output response information and output response order information; the method comprises the following steps:

Step S1: a suitable data-driven model and a suitable model structure are selected from known data-driven models, the mathematical relation of the corresponding model is established, and all model parameters are arranged in a certain order;

Wherein, for the mathematical relation of the model and the model parameter arrangement par: a data-driven model is selected from one or more of the BP neural network model, the response surface model, and the support vector machine;

For the BP neural network model, the input-output relation of the model is:

Wherein:

par is the model parameter arrangement, a fixed arrangement of all weights and thresholds in the BP network model, including all w_ij, T_li, θ_j, and θ_l;

x is the input vector, which may also be written as (x₁, x₂, …, x_L),

w_ij is the weight coefficient from the input layer to the hidden layer,

θ_j is the threshold coefficient of each hidden-layer neuron,

T_li is the weight coefficient from the hidden layer to the output layer,

θ_l is the threshold coefficient of the output-layer neuron,

L is the number of input-layer nodes, determined by the input variables of the real system,

f_i-h is the hidden-layer activation function; many kinds of activation function exist, including the simple linear function purelin, sigmoid functions, the sine function, the hyperbolic tangent and arctangent, and adaptive polynomial functions; the sigmoid functions include tansig and logsig; the hidden-layer activation function f_i-h is chosen as tansig or logsig;

f_h-o is the output-layer activation function, for which purelin is chosen;

Step S2: a fixed-step scanning method is used to check the model against the process prior knowledge, yielding a constraint equation that measures the degree to which the model violates the process prior knowledge; the process prior knowledge mainly includes output response information, first-order information of the output response, expressed as monotonicity, and second-order information, expressed as concavity/convexity;

Step S3: the model outputs for the samples are compared with the observed values, and an optimization objective equation measuring how well the model fits the training samples is established from the mean square error formula and a regularization method;

Step S4: the constraint equation and the optimization objective equation are combined to construct a constrained optimization problem, and the optimal parameter solution is obtained with a constraint-handling intelligent algorithm;

Step S5: the optimal parameter solution obtained is substituted into the original model as the model parameter solution of step S1 and is used for model prediction or model optimization;

The object analyzed by the method is the first-stage reactor of the acetylene hydrogenation reactor of an ethylene plant; three easily measured variables, the C2 fraction feed rate, the hydrogen feed rate, and the first-stage reactor inlet-outlet temperature difference, are selected as the independent variables of the soft sensor, and a soft-sensing model is established to estimate the acetylene outlet concentration of the first-stage reactor;

For the first-stage reactor acetylene soft sensor studied, the process data come from 40 samples obtained by experiment; the 40 samples are divided into 2 parts, with 30 randomly selected samples used as training samples and 10 as test samples; a BP neural network model is selected, trial and error shows that a 3-4-1 network structure is appropriate, and the relation is established with reference to formula (1); specifically:

(1) According to formula (1), the BP network model with the "3-4-1" structure is constructed; it contains 16 weights and 5 thresholds, so the final optimization objective function is 21-dimensional;

(2) According to the prior information, a suitable step size is selected and, using a fixed step, the prior knowledge is checked; the percentage of checks that are violated is taken as the constraint function viol;

(4) A regularization term E_w is added to the constructed mean square error term; the regularization coefficient η, chosen by trial, is set to 0.001, constituting the optimization objective function F(par), which contains 21 unknown parameters;

(5) The constructed constraint function and optimization objective function are combined into a constrained optimization problem, which is passed to the constraint-handling intelligent algorithm to solve for the optimal solution of the objective function that satisfies the prior constraints;

(7) Substituting the test samples as new samples verifies the predictive ability of the model: the mean square error on the training sample set is 7.173×10^-4, and the mean square error on the test sample set is 5.253×10^-4; in addition, response plots of the acetylene outlet concentration against the hydrogen feed rate and the first-stage reactor inlet-outlet temperature difference are drawn for low, medium, and high values of the C2 fraction feed rate.

2. The soft sensing method for key indicators of a chemical process combining process prior knowledge and a data-driven model according to claim 1, characterized in that:

For the response surface model, the input-output relation of the model is:

Wherein:

par denotes the model parameter arrangement, which includes b₀, b_i, b_jk, and b_l,

b₀ is the constant term,

b_i is the coefficient of the first-order term of input x_i,

b_jk is the coefficient of the interaction term between inputs x_j and x_k,

b_l is the coefficient of the second-order term of input x_l,

x_i is the i-th input variable, with L dimensions in total.
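The coefficient definitions above suggest a standard second-order response surface. Formula (2) is not reproduced in this section, so the sketch below assumes the usual quadratic form (constant, first-order, pairwise interaction, and squared terms); the function name `rsm_predict`, the dictionary layout of the interaction coefficients, and the numeric values are hypothetical.

```python
import numpy as np
from itertools import combinations

def rsm_predict(x, b0, b_lin, b_int, b_sq):
    """Second-order response surface (assumed form).

    b_lin[i] multiplies x_i, b_sq[l] multiplies x_l**2, and b_int maps
    an index pair (j, k) with j < k to the coefficient of x_j * x_k.
    """
    x = np.asarray(x, dtype=float)
    y = b0 + np.dot(b_lin, x) + np.dot(b_sq, x ** 2)
    for j, k in combinations(range(x.size), 2):
        y += b_int.get((j, k), 0.0) * x[j] * x[k]
    return float(y)

# Illustrative coefficients only, not values from the patent.
y = rsm_predict([1.0, 2.0, 3.0], b0=0.5,
                b_lin=[1.0, 0.0, -1.0],
                b_int={(0, 1): 2.0},
                b_sq=[0.0, 0.5, 0.0])
print(y)
```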

3. The soft sensing method for key indicators of a chemical process combining process prior knowledge and a data-driven model according to claim 1, characterized in that:

The step S2 obtains the constraint equation of the model through the following 3 sub-steps:

Step S2.1, test point sampling: the entire input space is divided with a fixed step Δ_ij into multiple subspaces, and a sample is taken in each subspace for repeated checks of the output range prior; likewise, each known monotonicity or concavity/convexity prior interval is divided with the fixed step Δ_ij into smaller labeled subintervals;

Step S2.2, prior checking at the sampled points: the output range over the entire input space of the object is checked by comparing the output value at each sampled point with the output bounds; given the output upper limit y_H and lower limit y_L, the model output value is compared with them, and the output range prior is satisfied if and only if the model output lies between y_L and y_H; otherwise the output range prior is violated in that small region; if all sampled points pass the output range check, the model is approximately considered not to violate this output range prior; the correctness of the monotonicity information within a subinterval can only be checked by repeatedly fixing the values of the other input variables, computing the output values at the two endpoints of the subinterval, and comparing their magnitudes; if the comparison agrees with the monotonicity information, the subinterval is approximately considered to satisfy it; for a monotonically increasing interval the output at the upper endpoint must be the larger of the two, and for a monotonically decreasing interval the smaller; if all subintervals pass the monotonicity check, the model is approximately considered not to violate this monotonicity prior; similarly, the concavity/convexity of a subinterval is checked approximately by comparing the model output at the midpoint of the two endpoints with the outputs at the endpoints; for a strictly convex interval or a strictly concave interval, the corresponding inequality must hold; the result of each individual check of each prior can be expressed with a sign function: formula (3) checks the output range, formula (4) checks the monotonicity prior, and formula (5) checks the concavity/convexity prior;

If every corresponding value of h(i,j,m) is "1", the model is approximately considered to comply fully with the corresponding prior; if h(i,j,m) is "-1", the model necessarily violates the corresponding prior knowledge in the corresponding subinterval;

Step S2.3, violation degree statistics: the ratio viol of the number of "-1" results to the total number of scan checks characterizes the degree to which the model violates the prior mechanism; it is calculated as follows:

If the system model fully complies with all process priors, viol is 0; a smaller viol means the model violates the known process priors only slightly; the closer viol is to 1, the greater the degree to which the model violates the process priors.
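The fixed-step scan and the viol ratio can be illustrated for a single monotonicity prior. This is a minimal sketch under stated assumptions: it handles one input only (any other inputs are taken to be held fixed inside `model`), it uses a non-strict endpoint comparison because formulas (4) and (6) are not reproduced here, and the function name `monotonic_violation` is hypothetical.

```python
import numpy as np

def monotonic_violation(model, lo, hi, step, increasing=True):
    """Fraction of fixed-step subintervals where a 1-D model breaks monotonicity."""
    n = int(round((hi - lo) / step))         # number of subintervals
    edges = lo + step * np.arange(n + 1)     # subinterval endpoints
    out = np.array([model(v) for v in edges])
    diff = np.diff(out)                      # right output minus left output
    ok = diff >= 0 if increasing else diff <= 0
    h = np.where(ok, 1, -1)                  # +1 satisfied, -1 violated
    return float(np.mean(h == -1))           # share of "-1" results

print(monotonic_violation(lambda v: v ** 2, -1.0, 1.0, 0.25))
```

A model that respects the prior on every subinterval returns 0, and one that breaks it everywhere returns 1, which matches the interpretation of viol given in the claim.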

4. The soft sensing method for key indicators of a chemical process combining process prior knowledge and a data-driven model according to claim 2, characterized in that:

The step S3 obtains the optimization objective function F(W) of the model through the following 2 sub-steps:

The model output for the k-th sample is calculated from the model relation: formula (1) if the BP network is selected, formula (2) if the response surface model is selected, and the correspondingly established model relation if another data-driven modeling method is selected,

Step S3.2: a regularization term is added to the original mean square error term, with a suitable regularization coefficient, to construct the optimization objective function; adding the regularization term to the objective function gives the model a relatively smooth output response curve, so that the model fits the training samples well while having stronger robustness and predictive ability:

F(par) = E_D + ηE_w (8)

E_w is the regularization term, calculated as the sum of squares of the unknown parameters of the network, with separate expressions for the BP network model and for the response surface model; b_i is the coefficient of the first-order term of input x_i;

η is the regularization coefficient, which weights the smoothness of the model; its value is 0.001;

F(par) is the training objective function with unknown model parameters.
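Formula (8) can be sketched as a function of the flat parameter vector. The exact definitions of E_D and E_w are not reproduced in this section, so this sketch assumes E_D is the plain mean of squared residuals and E_w is the sum of squares of all parameters; `objective` and `toy_predict` are hypothetical names, and the toy linear model exists only to exercise the function.

```python
import numpy as np

def objective(par, predict, X, y, eta=0.001):
    """F(par) = E_D + eta * E_w, with assumed definitions of E_D and E_w."""
    par = np.asarray(par, dtype=float)
    residual = predict(par, X) - np.asarray(y, dtype=float)
    e_d = np.mean(residual ** 2)             # mean square error term E_D
    e_w = np.sum(par ** 2)                   # sum of squared parameters E_w
    return float(e_d + eta * e_w)

# Toy linear model used only to exercise the function.
toy_predict = lambda par, X: np.asarray(X) @ par
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 2.0])
print(objective([1.0, 2.0], toy_predict, X, y))
```

Passing the prediction function as an argument keeps the objective independent of the model type, so the same code would serve a BP network or a response surface model.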

5. The soft sensing method for key indicators of a chemical process combining process prior knowledge and a data-driven model according to claim 1, characterized in that:

The step S4 specifically comprises: combining the constructed constraint function and optimization objective function into a constrained optimization problem, as in formula (9); passing it to an intelligent optimization algorithm capable of handling constraints, and solving for the optimal solution of the objective function that satisfies the prior constraints, the optimal solution being the optimal combination of model parameters:

Deterministic methods for constrained optimization problems easily fall into local optima, so the constraint equation is handled with a constraint-capable intelligent algorithm; the μ AEA algorithm with an adaptive relaxed constraint handling method is introduced.
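How a population-based optimizer can rank candidates under a relaxed constraint may be illustrated with a generic ε-relaxed feasibility rule. This is a common constraint-handling technique used here as a stand-in; it is not the μ AEA algorithm or its adaptive relaxation scheme, whose details are not given in this section. The function name `better` and the tolerance `eps` are hypothetical.

```python
def better(f_a, viol_a, f_b, viol_b, eps=0.0):
    """Return True if candidate a is preferred over candidate b.

    Candidates whose violation is at most eps are treated as feasible.
    """
    feas_a, feas_b = viol_a <= eps, viol_b <= eps
    if feas_a and feas_b:
        return f_a < f_b            # both feasible: lower objective wins
    if feas_a != feas_b:
        return feas_a               # feasible beats infeasible
    return viol_a < viol_b          # both infeasible: lower violation wins

print(better(0.8, 0.0, 0.1, 0.3))            # feasible a vs infeasible b
print(better(0.8, 0.05, 0.1, 0.3, eps=0.1))  # a counts as feasible under eps
```

In such a scheme, shrinking `eps` toward 0 over the generations would gradually tighten the search onto parameter sets that fully satisfy the prior constraints.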

6. The soft sensing method for key indicators of a chemical process combining process prior knowledge and a data-driven model according to claim 2, characterized in that:

The step S5 specifically comprises: substituting the solved model parameter arrangement par into the data-driven model selected in S1 to form a model with determined parameters, using formula (1) if the BP network model is selected and formula (2) if the response surface model is selected; substituting new samples into the model with determined parameters verifies the predictive ability of the model.