Background
Complex systems exist everywhere, for example industrial systems (process operations, factory production, supply chain management, etc.) and ecosystems (human social activities, environmental changes, etc.). Such a system contains a large number of units and subsystems that are strongly coupled to one another, so accurately describing all of its mechanisms with mathematical models is extremely difficult. An alternative is data-driven approximate modeling: data from the research object are analyzed to obtain expressions describing the process variables of the system (input, output and intermediate variables). For such research objects, modeling with an approximate model is therefore a well-suited solution. An approximate model requires little computation while giving results close to those of a high-precision model. Approximate modeling methods mainly include the Polynomial Response Surface (PRS) model, the Kriging model, the Radial Basis Function (RBF) model, the High-Dimensional Model Representation (HDMR) method and the like.
(1) The Polynomial Response Surface (PRS) is a weighted sum of a series of polynomial terms. Once the order of the polynomial is selected, the PRS determines the polynomial coefficients from a group of sample points and their responses by the least squares method. The PRS has a simple structure and high construction efficiency, and is suitable for approximating low-dimensional, low-order problems. Its disadvantages are also evident: (1) the PRS model is not suitable for high-dimensional nonlinear problems; (2) the choice of polynomial order depends on human experience, and for unknown engineering problems a proper order is difficult to determine in advance; (3) a large number of sample points is required to construct the PRS model, especially in high-dimensional cases.
(2) Kriging is an interpolation model built on the theoretical analysis of the variogram. The variance is used to measure variation over the design space, and the error of the predicted value obtained from the spatial distribution is kept minimal; that is, Kriging is an unbiased, minimum-variance estimation model. The Kriging model has the following advantages: (1) its prediction accuracy for highly nonlinear problems is very high, and the predicted values are guaranteed to pass through the sample points; (2) different correlation functions and regression functions can be selected to approximate different problems. The Kriging model has the following disadvantages: (1) its structure is complex, its construction efficiency is low, and construction easily fails when there are many sample points or many dimensions; (2) as an interpolation model, it is sensitive to noisy data.
(3) The Radial Basis Function (RBF) model is expressed as a linear sum of a series of basis functions centered on the sample points. All RBF models share the same basic structure and differ only in the choice of basis function. The RBF model has the following advantages: (1) its structure is simple and its accuracy is high, and for high-order, high-dimensional problems in particular its performance is superior to that of other approximate models; (2) existing comparison tests show that the prediction accuracy of the RBF model lies between that of PRS and that of Kriging, and that the modeling time of the RBF model is far shorter than that of Kriging. The disadvantages of the RBF model are: (1) it is not suitable for linear or low-dimensional problems, for which a polynomial expansion term needs to be added; (2) estimating an unknown point with the RBF model requires a large amount of computation, so a matching fast estimation algorithm is needed.
(4) The High-Dimensional Model Representation (HDMR) method provides a hierarchical decomposition structure. Its main idea is to decompose a high-dimensional problem into a series of low-dimensional problems and sum them, thereby overcoming the difficulties caused by high dimensionality. Among the HDMR modeling methods, cut-HDMR is simple, easy to implement and widely used. cut-HDMR expands the objective function as a superposition of functions defined on the cut lines, cut planes and cut hyperplanes passing through the cut point. However, cut-HDMR only provides a function structure in the form of a lookup table and has no complete function expression.
Existing approximate model techniques, such as the Polynomial Response Surface (PRS) model, the Kriging model and the Radial Basis Function (RBF) model, are suitable for low-dimensional cases and perform poorly in high-dimensional cases (n > 10). As the dimensionality increases, the number of sample points required to construct an approximate model of reasonable accuracy grows exponentially because of the curse of dimensionality; the computational cost becomes extremely large and the methods easily fail. Although the RBF model is considered suitable for high-dimensional, high-order problems and has been applied in many high-dimensional optimization algorithms, it is not an approximate model that directly addresses the high-dimensional problem. Meanwhile, today's engineering problems are becoming more and more complex, with tens or even hundreds of design variables.
Although the High-Dimensional Model Representation (HDMR) method was proposed for high-dimensional problems, it requires computing the component functions of each order, and the common practice is to construct an approximation function of the corresponding order for each of them. The methods currently used to solve the component functions of each order in the HDMR expansion are still traditional approximation methods, including moving least squares, Kriging interpolation, radial basis functions, support vector regression, etc. These methods are based on nonlinear optimization algorithms and cannot balance modeling efficiency (computational cost) against model fitting accuracy well. Therefore, the application of the HDMR method is still limited by the computational overhead of high-dimensional problems.
Disclosure of Invention
For this reason, it is necessary to provide a new method that improves the solving algorithm of the High-Dimensional Model Representation (HDMR) and yields a high-dimensional approximate modeling method with high efficiency and high accuracy for solving the modeling and optimization problems of complex systems.
In order to achieve the aim, the invention provides a complex system modeling optimization method, which comprises the steps of analyzing the behavior characteristics of an object system, selecting input and output variables, determining the value range of each input variable, and acquiring input and output data sets;
for each output variable, determining a high-dimensional approximate model thereof;
solving undetermined parameters in the high-dimensional approximate model to ensure that the error between the model predicted value and the input and output data sets is minimum within the error range of the set high-dimensional approximate model;
and optimizing the behavior characteristics of the object system based on the solved high-dimensional approximate model.
Preferably, the method further comprises a step of acquiring data from the object system.
Specifically, the high-dimensional approximation model for each output variable y is:
in formula (1), N is the number of input variables x, K is the maximum order of the input variables x, i and i' denote each specific variable x, k and k' denote the order of each variable x, and the model parameters include C, A_{i,k} and B_{i,i',k,k'}, where C represents the zeroth-order response of the output variable y; A_{i,k} is the effect of the input variable x_i on the output variable y when acting alone; and B_{i,i',k,k'} is the effect of the coupling of the input variables x_i and x_{i'} on the output variable y.
Further, the undetermined parameters in the high-dimensional approximation model are solved, so that the error between the model prediction value and the input and output data sets is minimum within the error range of the set high-dimensional approximation model; the method specifically comprises the following steps:
establishing a linear optimization model between the predicted value y of the output variable and the input variables x based on the high-dimensional approximation model, wherein the subscript n indexes each group of data and N is the number of groups of data,
constraining the error range (σ) of the high-dimensional approximation model; introducing two non-negative variables (ya_n and yb_n), wherein y*_n is the sample value of the output variable;
0 ≤ ya_n ≤ σ, n ∈ N (5)
0 ≤ yb_n ≤ σ, n ∈ N (6)
establishing a target value r of the linear optimization such that the error (ya_n + yb_n) between the predicted value and the data sample value is minimized, as in formula (7);
setting the error range (σ) and the initial order of the input variables (x), K = 1.
Solving the established linear optimization model: judging whether the linear optimization model has a solution; if a solution exists, outputting the result and stopping the algorithm; if no solution exists, performing the following step:
increasing the order of the input variables x by setting K = K + 1, and returning to solve the linear optimization problem with the updated order.
A system modeling optimization storage medium, wherein the storage medium, when run, executes the following steps: analyzing the behavior characteristics of an object system, selecting input and output variables, determining the value range of each input variable, and acquiring input and output data sets;
for each output variable, determining a high-dimensional approximate model thereof;
solving undetermined parameters in the high-dimensional approximate model to ensure that the error between the model predicted value and the input and output data sets is minimum within the error range of the set high-dimensional approximate model;
and optimizing the behavior characteristics of the object system based on the solved high-dimensional approximate model.
Preferably, the storage medium, when run, further executes a step of acquiring data from the object system.
Specifically, the high-dimensional approximation model for each output variable y is:
in formula (1), N is the number of input variables x, K is the maximum order of the input variables x, i and i' denote each specific variable x, k and k' denote the order of each variable x, and the model parameters include C, A_{i,k} and B_{i,i',k,k'}, where C represents the zeroth-order response of the output variable y; A_{i,k} is the effect of the input variable x_i on the output variable y when acting alone; and B_{i,i',k,k'} is the effect of the coupling of the input variables x_i and x_{i'} on the output variable y.
Further, when the storage medium is operated, the undetermined parameters in the high-dimensional approximation model are solved, so that the error between the model prediction value and the input and output data sets is minimum within the set error range of the high-dimensional approximation model; the method comprises the following specific steps:
establishing a linear optimization model between the predicted value y of the output variable and the input variables x based on the high-dimensional approximation model, wherein the subscript n indexes each group of data and N is the number of groups of data,
constraining the error range (σ) of the high-dimensional approximation model; introducing two non-negative variables (ya_n and yb_n), wherein y*_n is the sample value of the output variable;
0 ≤ ya_n ≤ σ, n ∈ N (5)
0 ≤ yb_n ≤ σ, n ∈ N (6)
establishing a target value r of the linear optimization such that the error (ya_n + yb_n) between the predicted value and the data sample value is minimized, as in formula (7);
setting the error range (σ) and the initial order of the input variables (x), K = 1.
Solving the established linear optimization model: judging whether the linear optimization model has a solution; if a solution exists, outputting the result and stopping the algorithm; if no solution exists, performing the following step:
increasing the order of the input variables x by setting K = K + 1, and returning to solve the linear optimization problem with the updated order.
Different from the prior art, the above technical scheme provides a new modeling optimization method that addresses the low efficiency of parameter fitting calculations for some production systems in the prior art, and it is particularly suitable for fitting the input-output parameters of complex systems. Here, a complex system is a system with a moderate number of intelligent, adaptive agents that act on local information. The term is loosely defined, and such systems exist everywhere. In mathematical science a complex system is distinguished from both simple systems and random systems, and in physical and chemical science it can be approximated as a chaotic system. For example, when applied to the field of chemical reactions, the method allows better control of aspects such as raw material ratio and conversion rate.
Detailed Description
To explain technical contents, structural features, and objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
Referring to fig. 1, a system modeling and optimizing method based on a high-dimensional approximation model according to the present invention includes the following steps:
Step S100: Collect system data, or construct a system mechanism simulation scheme. Data are the key to constructing a high-dimensional approximation model, and they come mainly from two sources: actual data acquired on site; or simulation data that closely match the actual behavior of the system, obtained by constructing a system mechanism simulation scheme, generally with general-purpose simulation software.
Step S200: analyzing the behavior characteristics of the modeling object, selecting input and output variables, and determining the value range of each input variable. An "input-output" dataset is obtained from the actual data acquisition in step S100 or by using a mechanistic simulation software.
Step S300: for each output variable in step S200, the structure of its high-dimensional approximation model is determined. Each output variable (y) can be described as a simplified high-dimensional approximation model, as in equation (1):
in formula (1), N is the number of input variables x, K is the maximum order of the input variables x, the subscripts i and i' denote each specific variable x, and the superscripts k and k' denote the order of each variable x. The model parameters include C, A_{i,k} and B_{i,i',k,k'}, where C represents the zeroth-order response of the output variable (y); A_{i,k} is the effect of the input variable x_i on the output variable (y) when acting alone; and B_{i,i',k,k'} is the effect of the coupling of the input variables x_i and x_{i'} on the output variable (y).
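As an illustration of the structure described above, the following minimal sketch builds the matrix of model terms for a set of samples. It assumes that formula (1) has the form y = C + Σ_i Σ_k A_{i,k} x_i^k + Σ_{i<i'} Σ_k Σ_{k'} B_{i,i',k,k'} x_i^k x_{i'}^{k'}; the restriction i < i' for the coupling terms and the function name `hdmr_features` are assumptions made for this example, not taken from the original text.

```python
from itertools import combinations

import numpy as np


def hdmr_features(X, K):
    """Return (Phi, names) for the assumed model structure.

    X is an array of shape (n_samples, n_vars); K is the maximum order.
    Each column of Phi is one model term, so a prediction is Phi @ params.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    cols = [np.ones(n_samples)]  # constant term C
    names = ["C"]
    # Single-variable terms A[i,k] * x_i**k
    for i in range(n_vars):
        for k in range(1, K + 1):
            cols.append(X[:, i] ** k)
            names.append(f"A[{i + 1},{k}]")
    # Coupling terms B[i,i',k,k'] * x_i**k * x_i'**k' (assumed i < i')
    for i, j in combinations(range(n_vars), 2):
        for k in range(1, K + 1):
            for kp in range(1, K + 1):
                cols.append(X[:, i] ** k * X[:, j] ** kp)
                names.append(f"B[{i + 1},{j + 1},{k},{kp}]")
    return np.column_stack(cols), names
```

Under this assumption the model is linear in its parameters (C, A and B) for fixed data, which is what allows the parameters to be found by linear optimization.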
Step S400: A linear optimization method is proposed to solve the parameters of the high-dimensional approximation model in step S300. The main idea of the proposed linear optimization algorithm is to minimize the error between the model predicted values and the data sample values within the set error range of the high-dimensional approximation model. The specific steps are as follows:
Step S410: Establish the calculation relation between the predicted value (y) of the output variable and the input variables x based on the high-dimensional approximation model (formula (1)), where the subscript n indexes each group of data, N is the number of groups of data, and the other symbols are as in formula (1).
Step S420: Constrain the error range (σ) of the high-dimensional approximation model. Two non-negative variables (ya_n and yb_n) are introduced; through formulas (3) to (6), the inequality containing an absolute value is converted into a set of linear inequalities, which reduces the difficulty of solving the optimization problem. y*_n is the sample value of the output variable.
0 ≤ ya_n ≤ σ, n ∈ N (5)
0 ≤ yb_n ≤ σ, n ∈ N (6)
Step S430: Establish the target value (r) of the linear optimization so that the error (ya_n + yb_n) between the predicted values and the data sample values is minimized, as shown in formula (7).
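The display formulas referenced in steps S410 to S430 do not appear in this text apart from (5) and (6). One formulation that is consistent with the description and with formulas (5) and (6) is sketched below as an assumption, not as the original formulas. In particular, the index range i < i' of the coupling terms and the equality form of the error split in the second line are assumed.

```latex
\begin{aligned}
y_n &= C + \sum_{i=1}^{N}\sum_{k=1}^{K} A_{i,k}\, x_{i,n}^{k}
      + \sum_{i<i'}\sum_{k=1}^{K}\sum_{k'=1}^{K} B_{i,i',k,k'}\, x_{i,n}^{k}\, x_{i',n}^{k'} \\
y_n - y_n^{*} &= ya_n - yb_n \\
0 &\le ya_n \le \sigma, \qquad 0 \le yb_n \le \sigma \\
r &= \min \sum_{n} \left( ya_n + yb_n \right)
\end{aligned}
```

With this form, minimizing the sum drives at least one of ya_n and yb_n to zero for every n, so that ya_n + yb_n equals |y_n - y*_n| at the optimum and the bounds enforce |y_n - y*_n| ≤ σ. All relations are linear in the unknowns C, A_{i,k}, B_{i,i',k,k'}, ya_n and yb_n.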
Step S440: Set the error range (σ) and the initial order of the input variables (x), K = 1.
Step S450: Solve the linear optimization problem. The linear optimization problem established in steps S410-S440 can be solved efficiently with mathematical programming techniques, using the classical dual simplex algorithm.
Step S460: and judging whether the linear optimization problem has a solution. If the solution exists, outputting the result, and stopping the algorithm; if there is no solution, the process proceeds to step S470.
Step S470: Increase the order of the input variables (x): K = K + 1. Return to step S450 and solve the linear optimization problem with the updated order. By repeating steps S450-S470 while increasing the order K, all parameters (A_{i,k} and B_{i,i',k,k'}) of the high-dimensional approximation model within the error range (σ) can be obtained.
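Steps S410 to S470 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the model is assumed to have coupling terms for i < i' only, the absolute error is assumed to be split as prediction - y* = ya - yb, and SciPy's `linprog` with the HiGHS solver stands in for the dual simplex algorithm named in step S450. The names `build_terms`, `fit_within_tolerance` and `K_max` are hypothetical, and the number of data groups is written M in the code to avoid a clash with the number of input variables.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog


def build_terms(X, K):
    """Columns: 1, x_i**k, and x_i**k * x_j**kp for i < j (assumed form)."""
    X = np.asarray(X, dtype=float)
    n_vars = X.shape[1]
    cols = [np.ones(len(X))]
    cols += [X[:, i] ** k for i in range(n_vars) for k in range(1, K + 1)]
    cols += [X[:, i] ** k * X[:, j] ** kp
             for i, j in combinations(range(n_vars), 2)
             for k in range(1, K + 1) for kp in range(1, K + 1)]
    return np.column_stack(cols)


def fit_within_tolerance(X, y_star, sigma, K_max=5):
    """Raise the order K from 1 until the linear program has a solution."""
    y_star = np.asarray(y_star, dtype=float)
    M = len(y_star)  # number of data groups
    for K in range(1, K_max + 1):            # S440 / S470: K = 1, 2, ...
        Phi = build_terms(X, K)              # S410: predictions are Phi @ theta
        P = Phi.shape[1]
        # Unknowns are [theta (P), ya (M), yb (M)]; minimize sum(ya + yb) (S430)
        c = np.concatenate([np.zeros(P), np.ones(2 * M)])
        # Phi @ theta - ya + yb = y_star, i.e. prediction - y* = ya - yb
        A_eq = np.hstack([Phi, -np.eye(M), np.eye(M)])
        # theta is free; 0 <= ya, yb <= sigma (S420, formulas (5) and (6))
        bounds = [(None, None)] * P + [(0.0, sigma)] * (2 * M)
        res = linprog(c, A_eq=A_eq, b_eq=y_star, bounds=bounds, method="highs")
        if res.status == 0:                  # S460: a solution exists, stop
            return K, res.x[:P], res.fun
        # Otherwise (typically infeasible) the order is increased (S470)
    raise RuntimeError("no solution within sigma up to order K_max")
```

A small usage example on synthetic data whose true relation is quadratic in the second variable, so that order 1 cannot meet the tolerance:

```python
g = np.linspace(0.0, 1.0, 3)
X = np.array([[a, b] for a in g for b in g])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1] ** 2
K, theta, total_error = fit_within_tolerance(X, y, sigma=0.01)
```

For large data sets the dense identity blocks in `A_eq` become expensive, and a sparse matrix would be the more practical choice.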
Step S500: Based on the high-dimensional approximation model solved in step S400, predict and optimize the behavior (i.e., output) of the system with high accuracy. First, since the high-dimensional approximation model solved in step S400 fits the data with high accuracy, it can be used to predict the behavior (i.e., output) of the system for other values of the input variables. Second, the high-dimensional approximation model (formula (1)) is a relatively simple nonlinear expression, with which a nonlinear programming problem for optimizing the output variable (y) can be established. Using mathematical programming techniques and a sequential quadratic programming algorithm, this nonlinear optimization problem can be solved to obtain the input variables of the system at the optimal target value.
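The optimization part of step S500 might look like the sketch below, which maximizes a fitted model over box-constrained inputs with SciPy's SLSQP method (a sequential least-squares quadratic programming implementation). The model form with coupling terms stored per index tuple, the function names `predict` and `maximize_output`, and the toy coefficients in the usage example are all assumptions made for illustration; they are not parameters from the tables of this document.

```python
import numpy as np
from scipy.optimize import minimize


def predict(x, C, A, B):
    """Evaluate the assumed model at one input vector x.

    A has shape (n_vars, K) with A[i, k-1] the coefficient of x_i**k.
    B maps (i, j, k, kp) with zero-based i < j to the coefficient of
    x_i**k * x_j**kp.
    """
    x = np.asarray(x, dtype=float)
    K = A.shape[1]
    powers = x[:, None] ** np.arange(1, K + 1)  # powers[i, k-1] = x_i**k
    y = C + np.sum(A * powers)
    for (i, j, k, kp), b in B.items():
        y += b * x[i] ** k * x[j] ** kp
    return y


def maximize_output(C, A, B, bounds):
    """Maximize the model output within the operating ranges of the inputs."""
    x0 = np.array([(lo + hi) / 2.0 for lo, hi in bounds])  # start at midpoint
    res = minimize(lambda x: -predict(x, C, A, B), x0,
                   method="SLSQP", bounds=bounds)
    return res.x, -res.fun


# Toy model: y = 4*x1 - x1**2 + 2*x2 - x2**2 + 0.5*x1*x2
A = np.array([[4.0, -1.0], [2.0, -1.0]])
B = {(0, 1, 1, 1): 0.5}
x_opt, y_opt = maximize_output(0.0, A, B, bounds=[(0.0, 3.0), (0.0, 3.0)])
```

SLSQP is a local method, so for a model with several local optima the result depends on the starting point; restarting from several points is a common precaution.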
The feasibility and the advantages of the novel process are illustrated by the following specific examples.
A high-precision approximate model is built for a complex industrial process, and the process is optimized, such as the maximization of production profit or the maximization of product conversion rate. Fig. 3 is a process flow diagram of a certain chemical product production, which includes various production devices (pumps, heat exchangers, reactors, rectifying towers, compressors, etc.), involving a large number of physical and chemical reactions, and strong coupling effects exist among the production devices and among materials. It is an extremely complex and difficult task to accurately describe all mechanisms using mathematical models. The modeling method based on the high-dimensional approximate model can efficiently and accurately establish the approximate model of the process, optimize the process operation according to the obtained approximate model and realize the maximization of production profit or the highest product conversion rate.
Step S100: Collect system data, or construct a system mechanism simulation scheme.
Because the actual production of the process stays in a steady state for long periods, the data collected on site vary little. However, building a high-dimensional approximation model requires sample points that cover a range of values of the input and output variables. Therefore, a mechanism simulation of the process is carried out with the chemical process simulation software Aspen Plus. Comparing the Aspen Plus simulation results with the field data shows that the simulated process fits the actual process closely. The Aspen Plus simulation can thus be used in place of the actual process to generate data sample points within a certain production operating range.
Step S200: analyzing the behavior characteristics of the modeling object, selecting input and output variables, and determining the value range of each input variable. An "input-output" dataset is obtained from the actual data acquisition in step S100 or by using a mechanistic simulation software.
There are 5 main input variables that affect the production profit (y_1) and the product conversion rate (y_2) of the process: the raw material flow (x_1), the raw material to hydrogen ratio (x_2), the main rectifying tower pressure (x_3), and the temperature (x_4) and pressure (x_5) of the main reactor. The operating range of each variable is shown in Table 1.
TABLE 1 operating ranges of the main variables affecting the process
Using a full factorial experimental design, 5 evenly spaced data points are taken for each variable within its operating range, generating a total of 3125 (5^5) groups of input-output data.
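A full factorial design of this kind can be generated in a few lines. The sketch below is illustrative only: the function name `full_factorial` is hypothetical, and the ranges in the usage example are placeholders, since the operating ranges of Table 1 are not available here.

```python
from itertools import product

import numpy as np


def full_factorial(ranges, levels=5):
    """Return every combination of `levels` evenly spaced points per variable.

    ranges is a list of (low, high) pairs, one per input variable.
    """
    axes = [np.linspace(lo, hi, levels) for lo, hi in ranges]
    return np.array(list(product(*axes)))


# Placeholder ranges for 5 input variables (not the values of Table 1)
design = full_factorial([(0.0, 1.0)] * 5, levels=5)
```

Each row of `design` is one operating point to be passed to the simulation, which returns the corresponding output values.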
Step S300: For the 2 output variables in step S200, determine the structure of their high-dimensional approximation models.
High-dimensional approximation model of the process production profit (y_1):
High-dimensional approximation model of the process product conversion rate (y_2):
Step S400: Solve the parameters of the high-dimensional approximation models in step S300 using the linear optimization method established in steps S410-S470.
The computer configuration is: Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz with 8 GB RAM. Within 10 seconds of computation time, the parameters (A_{1,i,k} and B_{1,i,i',k,k'}) of the high-dimensional approximation model of the process production profit (y_1) are obtained, as shown in Tables 2 and 3. The R^2 value of the fit of the approximation model is 0.9993.
TABLE 2 Parameters (A_{1,i,k}) of the high-dimensional approximation model of the production profit (y_1)
TABLE 3 Parameters (B_{1,i,i',k,k'}) of the high-dimensional approximation model of the production profit (y_1)
Likewise, within 10 seconds of computation time, the parameters (A_{2,i,k} and B_{2,i,i',k,k'}) of the high-dimensional approximation model of the process product conversion rate (y_2) are obtained, as shown in Tables 4 and 5. The R^2 value of the fit of the approximation model is 0.9967.
TABLE 4 Parameters (A_{2,i,k}) of the high-dimensional approximation model of the product conversion rate (y_2)
TABLE 5 Parameters (B_{2,i,i',k,k'}) of the high-dimensional approximation model of the product conversion rate (y_2)
Step S500: based on the high-dimensional approximation model obtained by the solution in step S400, the operation of the process is predicted and optimized with high precision.
The high-dimensional approximation models of the process production profit (y_1) and the product conversion rate (y_2) solved in step S400 fit the sample data with high accuracy (R^2 values of 0.9993 and 0.9967, respectively). They can be used to predict the production profit and product conversion rate of the process under other operating conditions. Fig. 4 compares the high-dimensional approximation model with an additionally generated 3125 groups of data, demonstrating the high prediction accuracy of the model.
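The text does not state how the R^2 values are computed. A minimal sketch, assuming the usual coefficient of determination R^2 = 1 - SS_res / SS_tot, is given below; the function name `r_squared` is hypothetical. The same function can be applied to the fitting data or to additionally generated validation data.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions reproduce the samples exactly, while a value of 0 means the model does no better than predicting the sample mean.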
The invention also provides a system modeling optimization storage medium which, when run, executes the following steps: analyzing the behavior characteristics of the object system, selecting input and output variables, determining the value range of each input variable, and acquiring input and output data sets;
for each output variable, determining a high-dimensional approximate model thereof;
solving undetermined parameters in the high-dimensional approximate model to ensure that the error between the model predicted value and the input and output data sets is minimum within the error range of the set high-dimensional approximate model;
and optimizing the behavior characteristics of the object system based on the solved high-dimensional approximate model.
Preferably, the storage medium, when run, further executes a step of acquiring data from the object system.
Specifically, the high-dimensional approximation model for each output variable y is:
in formula (1), N is the number of input variables x, K is the maximum order of the input variables x, i and i' denote each specific variable x, k and k' denote the order of each variable x, and the model parameters include C, A_{i,k} and B_{i,i',k,k'}, where C represents the zeroth-order response of the output variable y; A_{i,k} is the effect of the input variable x_i on the output variable y when acting alone; and B_{i,i',k,k'} is the effect of the coupling of the input variables x_i and x_{i'} on the output variable y.
Further, when the storage medium is operated, the undetermined parameters in the high-dimensional approximation model are solved, so that the error between the model prediction value and the input and output data sets is minimum within the set error range of the high-dimensional approximation model; the method comprises the following specific steps:
establishing a calculation relation between the predicted value y of the output variable and the input variables x based on the high-dimensional approximation model, as shown in formula (2), wherein the subscript n indexes each group of data and N is the number of groups of data,
constraining the error range (σ) of the high-dimensional approximation model; introducing two non-negative variables (ya_n and yb_n), wherein y*_n is the sample value of the output variable;
0 ≤ ya_n ≤ σ, n ∈ N (5)
0 ≤ yb_n ≤ σ, n ∈ N (6)
establishing a target value r of the linear optimization such that the error (ya_n + yb_n) between the predicted value and the data sample value is minimized, as in formula (7);
setting the error range (σ) and the initial order of the input variables (x), K = 1.
Solving the established linear optimization problem of formula (7): judging whether the linear optimization problem of formula (7) has a solution; if a solution exists, outputting the result and stopping the algorithm; if no solution exists, performing the following step:
increasing the order of the input variables x by setting K = K + 1, and returning to solve the linear optimization problem with the updated order; by increasing the order K of the variables x in this way, all parameters of the high-dimensional approximation model within the error range (σ) can be obtained.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrases "comprising … …" or "comprising … …" does not exclude the presence of additional elements in a process, method, article, or terminal that comprises the element. Further, herein, "greater than," "less than," "more than," and the like are understood to exclude the present numbers; the terms "above", "below", "within" and the like are to be understood as including the number.
As will be appreciated by one skilled in the art, the above-described embodiments may be provided as a method, apparatus, or computer program product. These embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. All or part of the steps in the methods according to the embodiments may be implemented by a program instructing associated hardware, where the program may be stored in a storage medium readable by a computer device and used to execute all or part of the steps in the methods according to the embodiments. The computer devices, including but not limited to: personal computers, servers, general-purpose computers, special-purpose computers, network devices, embedded devices, programmable devices, intelligent mobile terminals, intelligent home devices, wearable intelligent devices, vehicle-mounted intelligent devices, and the like; the storage medium includes but is not limited to: RAM, ROM, magnetic disk, magnetic tape, optical disk, flash memory, U disk, removable hard disk, memory card, memory stick, network server storage, network cloud storage, etc.
The various embodiments described above are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a computer apparatus to produce a machine, such that the instructions, which execute via the processor of the computer apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer apparatus to cause a series of operational steps to be performed on the computer apparatus to produce a computer implemented process such that the instructions which execute on the computer apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although the embodiments have been described, once the basic inventive concept is obtained, other variations and modifications of these embodiments can be made by those skilled in the art, so that the above embodiments are only examples of the present invention, and not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes using the contents of the present specification and drawings, or any other related technical fields, which are directly or indirectly applied thereto, are included in the scope of the present invention.