WO2021015709A1 - A system and method for constructing mathematical models of technological processes and training these models - Google Patents

Info

Publication number
WO2021015709A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
values
models
comprehensive
setup
Prior art date
Application number
PCT/US2019/042555
Other languages
French (fr)
Inventor
Dmitry Nikolaevich SHALUPKIN
Sergey Yurievich DEVYATKOV
Mikhail Andreevich SUPRUNOV
Evgenii Dmitrievich BUNIN
Original Assignee
Chemical Technologies, Inc.
Priority date
Filing date
Publication date
Application filed by Chemical Technologies, Inc.
Priority to PCT/US2019/042555
Publication of WO2021015709A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • The invention relates to the modelling of technological processes, in particular to systems for developing and training mathematical models of such processes.
  • Mathematical modelling is one of the key elements in the study of industrial technologies. It makes it possible to solve important tasks such as comparing technological schemes, predicting system behaviour under various combinations of operating conditions, control and optimization, and designing technological processes and equipment.
  • Mathematical models of chemical processes are a synthesis of theoretical and experimental data; that is why tasks such as the scaling of chemical processes are almost never accomplished without laboratory experiments.
  • The experimental research that precedes the development of a mathematical model is laborious and is nearly always the most time-consuming procedure.
  • Deterministic (analytical) mathematical models are, as a rule, nonlinear, which makes it very difficult to determine model parameters from experimental data. Applying machine learning algorithms to describe processes inside deterministic models can significantly simplify the study of technological objects; in addition, because the functional dependencies of such models have a physical nature, the model can be extrapolated and applied beyond the data range on which it was trained.
  • The present invention is a system for developing and training mathematical models of technological processes.
  • Data obtained online from a laboratory setup are used for these purposes.
  • The laboratory setup and the software and hardware complex, which implements algorithms for efficiently training a mathematical model and finding the values of unknown model parameters, are integral parts of the system.
  • A comprehensive mathematical model of the process is based on deterministic models involving machine learning algorithms.
  • The trained model is applicable in calculations related to scaling, optimization, forecasting and process control.
  • Figure 1 shows a block diagram of an embodiment of the present invention.
  • Figure 2 shows the algorithm for constructing and training an integrated process model.
  • Figure 1 illustrates a system 100 for training mathematical models of technological processes.
  • The system consists of laboratory chemical setup 102 and software and hardware complex 110.
  • Laboratory setup 102 consists of process equipment 104, control and measuring devices 106, and a unit of analytical equipment 108.
  • Software and hardware complex 110 is connected to laboratory setup 102 and includes data processing module 112, module 114 for estimating unknown parameters and variables in the process model, training validation module 116, and user interface 118 for interacting with the system. In addition, the software and hardware complex holds a comprehensive mathematical model of a chemical process 120, which is a combination of a deterministic mathematical model 122 and a model based on machine learning algorithms 124.
  • Figure 2 illustrates the algorithm for constructing and training the comprehensive mathematical model of technological process 120.
  • Based on the information received, a structural diagram of the object and the structure of the comprehensive model (204) are developed.
  • The number and type of deterministic equations (122) of the process under study are determined.
  • The technological process variables and parameters to be handled by machine learning algorithms (124) are also defined.
  • The specialist builds mathematical model 120 and enters the corresponding data into software and hardware complex 110 through user interface 118.
  • Software and hardware complex 110 contains a pre-installed library of deterministic models, machine learning models and their combinations into comprehensive models of technological processes, and allows the operator to configure the model structure from the data provided through interface 118.
  • Laboratory setup 102 is the data provider: an experiment is conducted in it, and the values of all necessary process parameters and variables are recorded.
  • The use of data obtained online from an industrial setup, or of historical data, is not excluded.
  • The collected data (208) are processed to convert them into a form suitable for calculations with the comprehensive mathematical model.
  • An additional data set can be generated from the data already collected; it is complementary and plays a supporting role in the following steps (210) and (212).
  • The unknown parameters of the model are determined (210).
  • The determination is performed using any suitable optimization algorithm that minimizes the deviations between the experimental data and the values calculated from the model.
  • The unknown parameters to be determined can belong to the deterministic part of the model (122) as well as to the machine learning part (124).
  • Step (210) can be called training of the mathematical model, since the parameter estimation algorithm is based on experimental data.
  • Model validation (212) is carried out, which consists in obtaining a quality metric for the trained model of the chemical process.
  • the algorithm is stopped, the result of which is a trained mathematical model (120). Otherwise, it returns to the previous step (206).
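The loop of Figure 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the callables `collect`, `process`, `estimate` and `validate`, the `threshold` and the `max_rounds` cap are all hypothetical, the mapping of step numbers in the comments is only a reading of the text above, and the sketch assumes that the algorithm stops once the validation metric is acceptable.

```python
from typing import Callable, List, Tuple

def train_until_valid(
    collect: Callable[[], List[float]],             # data collection (roughly step 206)
    process: Callable[[List[float]], List[float]],  # data processing (roughly step 208)
    estimate: Callable[[List[float]], float],       # parameter estimation (step 210)
    validate: Callable[[float], float],             # quality metric, lower is better (step 212)
    threshold: float,
    max_rounds: int = 10,
) -> Tuple[float, int]:
    """Repeat collect -> process -> estimate -> validate until the metric passes."""
    data: List[float] = []
    for round_no in range(1, max_rounds + 1):
        data.extend(process(collect()))   # accumulate processed measurements
        params = estimate(data)           # re-estimate on everything collected so far
        if validate(params) <= threshold:
            return params, round_no       # trained model accepted
    raise RuntimeError("model did not pass validation within max_rounds")
```

In this toy form the "model" is a single number; in practice `estimate` would wrap the optimization of the comprehensive model and `validate` would compare predictions against fresh measurements.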
  • Technological processes are traditionally developed using a bottom-up approach, in which a prototype of the industrial chemical setup is first implemented on a laboratory scale. Experiments conducted in a laboratory setup make it possible to build mathematical model 120 of the technological process and to determine its parameters for subsequent scale-up to the industrial level.
  • The parameters of the kinetics, mass transfer and heat transfer models, as well as the suitable form of these models, are subject to determination. For instance, the type of kinetics model depends on the nature of the chemicals, reaction conditions, catalyst type, technological parameters, etc.
  • Laboratory setups 102 are not unified; their configuration depends on the process being studied.
  • Catalytic processes are studied in reactors, e.g., continuous reactors with a fixed catalyst bed, or batch or semi-batch reactors equipped with additional mixing devices.
  • Mass transfer processes are carried out in column apparatuses, in particular absorption or adsorption columns and columns for distillation, extractive distillation, reactive distillation, etc.
  • Heat exchange processes are carried out in heat exchangers of various types: pipe-in-pipe, shell-and-tube, etc.
  • The laboratory setup is based on at least one of the devices listed above; such devices can also be combined into a single system and operate in series, in parallel, or in closed or partial recycle mode.
  • Substance transport lines are an integral part of laboratory setups: the laboratory setup has at least one line supplying raw material to the apparatus and one line taking the product out of it.
  • Such lines become input and output points and can be represented by nozzles, fittings, hatches, etc.
  • Transport lines connect the devices of the chemical setup and combine them into a single functioning system.
  • Lines can be equipped with optional hardware such as taps, dampers, valves, flow regulators and flowmeters, dosing pumps, and pressure regulators. Such equipment is necessary to control and organize the flow of substances.
  • The laboratory setup is equipped with control and measuring devices 106 and analytical equipment 108, whose function is to measure process variables. These variables include process indicators such as temperature, pressure, flow rate and fluid level in the apparatus, as well as physicochemical properties of substances such as viscosity, refractive index, medium composition, thermal conductivity, electrical conductivity, various spectrometric characteristics, etc.
  • Control and measuring devices include thermal converters, thermistors, pressure sensors, pressure gauges, chromatographs, spectrometers, level gauges, etc., as well as the equipment installed on technological lines.
  • The wide range of available control and measuring devices 106 and analytical equipment 108, and their placement in the laboratory setup, make it possible to collect data carefully during the experiment. The data are digitized and then form the basis for building a model of the chemical setup and for the calculations that determine the parameters of the developed model. Conducting experiments in a laboratory setup allows data to be obtained over a wider range of values than in an industrial setup.
  • Software and hardware complex 110 is a computer or similar device capable of performing data operations and executing program code.
  • The core of the software and hardware complex is the comprehensive mathematical model of chemical process 120.
  • A deterministic model is a description of the process built on the physical essence of the phenomenon under consideration.
  • The Arrhenius equation, which describes the temperature dependence of the reaction rate constant, can serve as an example. Such models are considered to be based on the laws of nature.
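As an illustration of such a deterministic relation, the Arrhenius equation k = A·exp(−Ea/(R·T)) can be written as a small function. The parameter names are chosen here for readability and do not come from the application.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

def arrhenius(pre_exponential: float, activation_energy: float, temperature_k: float) -> float:
    """Rate constant k = A * exp(-Ea / (R * T)).

    pre_exponential   -- A, in the units of the rate constant
    activation_energy -- Ea, in J/mol
    temperature_k     -- absolute temperature T, in kelvin
    """
    return pre_exponential * math.exp(-activation_energy / (R * temperature_k))
```

Because the functional form comes from physics, the two parameters A and Ea fitted in one temperature range remain meaningful outside it, which is the extrapolation property the text attributes to deterministic models.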
  • The stochastic model is statistical in nature: it is built from accumulated data on the phenomenon under consideration and is restricted in extrapolation beyond the limits of the data sample from which it was constructed. Since such a model is not based on the laws of nature, it is considered to have a limited ability to generalize and predict beyond the experimentally obtained data.
  • Examples are neural networks, logistic regression, etc.
  • The system includes a comprehensive mathematical model, which is a combination of deterministic equations 122 and machine learning algorithms 124 arranged so that the results calculated by the machine learning algorithms are input data for a deterministic process model.
  • the comprehensive model structure is represented by the following equation 1:
  • f_complex value is a process variable, the value of which is determined based on the comprehensive model (1);
  • model (1) may include, among other things, differential equations, linear or nonlinear transformations over process variables, etc. In some extreme cases, z1 and z2 may not perform any mathematical transformation;
  • m is a model based on a machine learning algorithm whose input data are process variables and/or parameters; m also encapsulates the learning parameters of the algorithm, which are determined at the training stage of the mathematical model and are not explicitly shown in (1);
  • Equation g is built on the basis of the physical essence of the process.
  • m describes only some aspect of the process, e.g., the kinetics of a chemical reaction, catalyst activity at a given time, the viscosity or thermal conductivity of a medium, or any other process variable.
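The nesting of a learned sub-model m inside a deterministic equation g can be sketched as below. Equation (1) itself is not reproduced in this text, so everything concrete here is an assumption made for illustration: g is taken to be first-order decay of a concentration, and m is a hypothetical two-weight regression that predicts ln k from 1/T.

```python
import math
from typing import Sequence

def ml_rate_constant(temperature_k: float, weights: Sequence[float]) -> float:
    """Stand-in for the learned sub-model m: linear regression on 1/T predicting ln k."""
    w0, w1 = weights  # learning parameters, found at the training stage
    return math.exp(w0 + w1 / temperature_k)

def concentration(c0: float, time_s: float, temperature_k: float,
                  weights: Sequence[float]) -> float:
    """Deterministic part g: C(t) = C0 * exp(-k * t), with k supplied by m."""
    k = ml_rate_constant(temperature_k, weights)  # output of m is an argument of g
    return c0 * math.exp(-k * time_s)
```

The deterministic wrapper fixes the shape of the response in time, while the learned part only has to capture how one aspect (here the rate constant) depends on operating conditions.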
  • Machine learning algorithms can include, but are not limited to, the following: linear regression; logistic regression; decision trees; SVM; naive Bayes; kNN; k-means; random forest; dimensionality reduction; the family of gradient boosting algorithms; neural networks for regression and/or classification.
  • Equations (1) can be combined into systems, which is necessary to describe several aspects of chemical processes simultaneously, e.g., kinetics, hydrodynamics, heat transfer, etc.
  • equation 1 may relate to any of the following classes of:
  • equation (2) already includes the time dependences for variables and parameters of technological process.
  • At least one equation is used to calculate at least one process variable.
  • a comprehensive mathematical model of the process can combine several equations of (1) and/or (2) types, thus describing a part of the processes in a stationary state, and the other part in a dynamic one.
  • Equations (1), (2), as well as their combinations, are compiled in such a way that their number is sufficient to solve such a system.
  • By solving this system we mean determining the values of the unknown process variables from the known variables and parameters.
  • The number of equations (1) and (2) in the mathematical model can exceed the minimum required for the system to be solved: additional equations serve to reduce the correlation between model parameters, reduce the correlation between model variables, establish boundaries or restrictions on parameter values, etc., which is important when estimating the values of unknown parameters or process variables.
  • Control and measuring devices 106 and analytical equipment 108 of the laboratory setup are generally the main tools for collecting the data on which the model is trained. Other ways of obtaining process data are also possible, e.g., the operator can record observations through user interface 118, or historical data can be used.
  • Data processing module 112 consists of software tools that perform the necessary conversions of information from the laboratory setup. Data processing can include, but is not limited to, the following mathematical transformations:
  • Such preliminary transformations are needed because control and measuring devices generally report process variables discretely, i.e., at certain time intervals, or the information may be raw, i.e., requiring conversion according to calibration curves, unit conversion, etc.
  • A linear transformation can be used to convert units of measure; in addition, linear calibration curves are constructed for the quantitative analysis of chemical components, e.g., in chromatography.
  • Nonlinear transformations include the quantitative determination of substance concentrations, e.g., when the calibration curve is described by a polynomial dependence, or when the thermophysical properties of substances are described by polynomial dependencies.
  • Differentiating data with respect to a variable is necessary to calculate the rate of change of a chemical process variable. For instance, differentiating the liquid level in the apparatus with respect to time gives the rate at which fluid enters or leaves the apparatus at any given time. Differentiating the temperature in the apparatus with respect to time gives the heating rate.
  • Integration is used, in particular but not exclusively, when converting spectral data from analytical instruments, e.g., when a substance concentration is calculated from peak areas in an infrared absorption spectrum. Integration is also used when the measured variable is the rate of change of a chemical process variable. For instance, flow meters show the flow rate of a substance at a measurement point, so integration gives the total volume of substance that has passed through the measurement point over a given period of time.
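Differentiation and integration of sampled signals, as described above, might look like the following minimal sketch. The choice of a forward difference and of the trapezoidal rule is an assumption; the application does not name specific numerical schemes.

```python
from typing import List, Sequence

def differentiate(t: Sequence[float], y: Sequence[float]) -> List[float]:
    """Forward-difference rate of change between consecutive samples,
    e.g. level vs. time -> fill rate, temperature vs. time -> heating rate."""
    return [(y[i + 1] - y[i]) / (t[i + 1] - t[i]) for i in range(len(t) - 1)]

def integrate(t: Sequence[float], rate: Sequence[float]) -> float:
    """Trapezoidal rule: total amount from a sampled rate signal,
    e.g. flow rate vs. time -> volume passed through the measurement point."""
    return sum(
        0.5 * (rate[i] + rate[i + 1]) * (t[i + 1] - t[i])
        for i in range(len(t) - 1)
    )
```

Both functions accept unevenly spaced time stamps, which matters when devices report at irregular intervals.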
  • By data filtering we mean screening the total sample and eliminating part of the measurements, for instance erroneous measurements, which are usually associated with incorrect sensor operation.
  • Another example of filtering is selecting part of the measurements from a sample obtained during stationary operation of the chemical setup over a long period of time. Using the entire sample from the stationary operating mode may be impractical, since it increases the processing time required by the software and hardware.
  • Interpolation yields intermediate values of process variables, which is important when measurements are received discretely.
  • Smoothing can be applied when the measurement of a variable is accompanied by noise.
  • Various algorithms can be used for this procedure, e.g., a moving average, high-pass or low-pass filters, the fast Fourier transform, etc. The algorithm is chosen according to the nature of the noise accompanying the measurement.
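Filtering, interpolation and smoothing can each be sketched in a few lines. The function names, the range-based outlier rule and the window size are illustrative assumptions; the moving average is one of the smoothing options the text mentions.

```python
from typing import List, Sequence

def drop_out_of_range(values: Sequence[float], low: float, high: float) -> List[float]:
    """Filtering: discard readings outside a physically plausible range."""
    return [v for v in values if low <= v <= high]

def interpolate(t: Sequence[float], y: Sequence[float], t_new: float) -> float:
    """Linear interpolation; t must be increasing and must bracket t_new."""
    for i in range(len(t) - 1):
        if t[i] <= t_new <= t[i + 1]:
            frac = (t_new - t[i]) / (t[i + 1] - t[i])
            return y[i] + frac * (y[i + 1] - y[i])
    raise ValueError("t_new is outside the measured range")

def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Smoothing: mean over a sliding window of fixed length."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

For example, a temperature trace with a sensor dropout value such as -999.0 would first pass through `drop_out_of_range` and only then be smoothed, so that the erroneous reading does not contaminate neighbouring averages.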
  • Training a mathematical model is a procedure in which all unknown values of the parameters and/or variables of the chemical process contained in the model are found in such a way that subsequent model calculations adequately reproduce the training dataset.
  • Well-established techniques for model training are known, such as using a validation dataset or following a cross-validation approach, and the present invention is not limited to any single known training technique.
  • Chemical process variables should be grouped so that they relate to a specific state of the system. A state of the system is a snapshot that reflects the values of variables and parameters at a particular point in time. When the system behaves in a stationary manner, the chemical process variables keep constant values at every moment of time (reasonable fluctuations of a probabilistic nature, related to measurement error, etc., are allowed). When the system behaves dynamically, its variables or parameters at each moment of time differ from those at the previous moment and in general change over time.
  • The grouped variables and parameters of the chemical process corresponding to one state of the system are further divided into dependent and independent ones. Independent variables or parameters are used in the comprehensive mathematical model to calculate the values of dependent variables or parameters.
  • The estimation of unknown quantities in the comprehensive mathematical model is based on minimizing an objective function, which can be built on standard deviations (4) or on absolute deviations (5) between the dependent variables determined experimentally (y in equations 4 and 5) and those calculated from the model (ŷ in equations 4 and 5). 7 is the number of states in equations 4 and 5.
  • the state of the system may include several dependent variables, the total number of which within each state is /. It can also vary between the states.
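Equations (4) and (5) are not reproduced in this text, so the following sketch assumes their most common forms: a sum of squared deviations and a sum of absolute deviations, accumulated over all states and over the dependent variables within each state. States are represented as nested lists so that the number of dependent variables may differ from state to state.

```python
from typing import Sequence

States = Sequence[Sequence[float]]  # one inner sequence of dependent variables per state

def squared_objective(measured: States, predicted: States) -> float:
    """Assumed form of (4): sum of squared deviations over all states and variables."""
    return sum(
        (y - y_hat) ** 2
        for ys, y_hats in zip(measured, predicted)
        for y, y_hat in zip(ys, y_hats)
    )

def absolute_objective(measured: States, predicted: States) -> float:
    """Assumed form of (5): sum of absolute deviations over all states and variables."""
    return sum(
        abs(y - y_hat)
        for ys, y_hats in zip(measured, predicted)
        for y, y_hat in zip(ys, y_hats)
    )
```

The squared form penalizes large deviations more strongly, while the absolute form is less sensitive to occasional outliers in the measurements.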
  • the calculation of the objective function is accompanied by the calculation of values of the dependent variables of chemical process. Such calculation can be carried out by various known methods and techniques that are relevant to a particular comprehensive mathematical model.
  • The objective function is minimized using any known optimization algorithm, e.g., simulated annealing, differential evolution, gradient descent, stochastic gradient descent, and others.
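Of the optimizers listed, plain gradient descent is the simplest to show. The sketch below fits a single hypothetical parameter k in the toy model y = k·x by minimizing the sum of squared deviations; the model, learning rate and step count are assumptions chosen only to keep the example short.

```python
from typing import Sequence

def fit_slope(xs: Sequence[float], ys: Sequence[float],
              learning_rate: float = 0.01, steps: int = 500) -> float:
    """Gradient descent on J(k) = sum((y - k*x)^2) for the toy model y = k*x."""
    k = 0.0  # initial guess for the unknown parameter
    for _ in range(steps):
        # dJ/dk = sum(-2 * x * (y - k*x))
        grad = sum(-2.0 * x * (y - k * x) for x, y in zip(xs, ys))
        k -= learning_rate * grad  # step against the gradient
    return k
```

For nonlinear comprehensive models the objective may have several local minima, which is one reason global methods such as simulated annealing or differential evolution appear in the same list.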
  • such parameters are evaluated simultaneously with each other.
  • The values that the machine learning algorithm must produce for the objective function to reach its minimum are estimated. The resulting dataset of these values is then used to train the machine learning algorithm alone.
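This two-stage scheme can be illustrated with a deliberately simple case. All specifics are assumptions: the deterministic model is taken to be C = C0·exp(−k·t), so the value of k that the learned sub-model must output for each state has a closed form, and the "machine learning algorithm" is an ordinary least-squares line fitted to ln k against 1/T.

```python
import math
from typing import List, Sequence, Tuple

def required_rate_constants(c0: float, times: Sequence[float],
                            concentrations: Sequence[float]) -> List[float]:
    """Stage 1: per-state k for which C0 * exp(-k * t) matches the measurement."""
    return [-math.log(c / c0) / t for t, c in zip(times, concentrations)]

def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Stage 2: ordinary least squares for y = a + b * x, returned as (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b
```

A possible use: compute `ks = required_rate_constants(c0, times, concentrations)`, then call `fit_line([1.0 / T for T in temps], [math.log(k) for k in ks])` to train the sub-model on the generated dataset. When no closed form exists, stage 1 would itself be an optimization over the sub-model outputs.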
  • The machine learning algorithm describes a dependent variable of the chemical process that was measured and included in the experimental dataset.
  • The machine learning algorithm is trained to determine the values of its parameters at which the comprehensive mathematical model adequately describes all the experimentally determined states of the chemical system.
  • The objective function may contain weight factors for the most important process variables.
  • The independent variables on which the machine learning algorithm bases its calculations are selected according to a general understanding of the process under consideration. For instance, if the rate of catalyst deactivation is assumed to depend on the composition of raw materials, temperature and pressure, then it is logical to choose these as the independent variables of the machine learning algorithm, since they lie at the heart of the physicochemical process the algorithm describes.
  • The system of the present invention is a combination of technical elements and software and hardware elements.
  • The technical means are the chemical setup, including instrumentation and analytical equipment, i.e., the means necessary for the functioning of the chemical setup within the system and for measuring its variables and parameters.
  • The software and hardware means are computer devices that calculate the comprehensive mathematical model and implement the related algorithms for data processing, model training and model validation, and perform the related operations of receiving, storing, displaying and transmitting data.
  • The user interacting with the system configures it, both the hardware related to the chemical setup and the software and hardware computing facilities, and receives the results of its operation.
  • The system interaction interface allows the user to specify the type of comprehensive mathematical model and to determine: • the type, kind and quantity of deterministic equations of the chemical process used in the comprehensive model;
  • The user also determines the algorithms by which the system performs data processing, training of the comprehensive mathematical model, and validation of the mathematical model. Such algorithms can be taken ready-made from the system's algorithm database.
  • The user can also implement their own algorithms or extend existing ones by adjusting the program code responsible for the main functionality of the system.
  • The user configures the technical component of the system, namely the instrumentation devices, to the extent that is available by default and determined by the manufacturer of each device.
  • Validation consists in the following sequence of actions: • Calculating process variable values using the comprehensive mathematical model from data on independent process variables that were not previously part of the training set. Such data come online from a working laboratory setup;
  • The system of the present invention can receive data online from a laboratory setup operating in a non-stationary transient mode. From the collected data, a comprehensive mathematical model is trained that describes the dynamic behaviour of the laboratory setup or a part of it, depending on the magnitude and intensity of the disturbing signals.
  • Validation of the comprehensive model takes into account data from non-stationary transient modes, which increases the reliability of the estimated model parameters.
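Validation on data that were not part of the training set can be reduced to computing a quality metric and comparing it with a tolerance. The application does not name a specific metric; root-mean-square error and the `tolerance` argument below are assumptions made for the sake of a concrete sketch.

```python
import math
from typing import Sequence

def rmse(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between held-out measurements and model output."""
    return math.sqrt(
        sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted)) / len(measured)
    )

def is_valid(measured: Sequence[float], predicted: Sequence[float],
             tolerance: float) -> bool:
    """Accept the trained model when the held-out error is within tolerance."""
    return rmse(measured, predicted) <= tolerance
```

Here `measured` would hold fresh values arriving online from the laboratory setup and `predicted` the values calculated by the comprehensive model for the same independent variables.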
  • A system for constructing and training mathematical models, consisting of a laboratory setup and a software and hardware complex that computes a comprehensive mathematical model, the latter being a combination of deterministic models and machine learning algorithms.
  • The results of calculations using mathematical models based on machine learning are used as arguments in calculations using deterministic mathematical models.
  • The laboratory setup is the data provider for the software and hardware complex, and the model is validated during the experiment.
  • The claimed complex allows trained models to be created online, minimizing the time and the volume of data needed to train a model.
  • The model provides real management, forecasting and optimization functions, and can be part of packages that provide model-based online optimization and process control.
  • The system of embodiment 1 is used in conjunction with an industrial setup. To extend an existing mathematical model of an industrial setup or of its aspects, experiments are conducted using the system of embodiment 1 over an extended range of values of the independent variables of the chemical process. This approach makes it possible to estimate the impact of new raw material and atypical values of the technological parameters of the chemical setup, and to obtain a validated and refined model of the industrial setup or of its aspect.
  • A method of developing comprehensive mathematical models, consisting of building a comprehensive mathematical model expressed by at least one deterministic mathematical model and at least one model based on a machine learning algorithm and describing at least a part of a technological setup, which makes it possible to calculate process variables from independent process variables, dependent process variables, process parameters, or a combination thereof; in addition, the results of calculations using mathematical models based on machine learning are used as arguments in calculations using deterministic mathematical models.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention is a system for the development and training of mathematical models of technological processes. Data obtained from a laboratory setup are used for these purposes. The laboratory setup and the software and hardware complex, which implements algorithms for efficiently training a mathematical model and finding the values of unknown model parameters, are integral parts of the system. A comprehensive mathematical model of the process is based on deterministic models wrapped around machine learning algorithms. The trained model is applicable in calculations related to scaling, optimization, forecasting and process control. The system for developing comprehensive mathematical models of processes minimizes the amount of experimental data and the time needed to train the mathematical model of a technological process.

Description

A SYSTEM AND METHOD FOR CONSTRUCTING MATHEMATICAL MODELS OF TECHNOLOGICAL PROCESSES AND TRAINING THESE MODELS
Technical Field
The invention relates to the modelling of technological processes, in particular to systems for developing and training mathematical models of such processes.
State of Art
Mathematical modelling is one of the key elements in the study of industrial technologies. It makes it possible to solve important tasks such as comparing technological schemes, predicting system behaviour under various combinations of operating conditions, control and optimization, and designing technological processes and equipment.
There are various approaches to developing mathematical models. If the internal parameters of the system are determined only from experimental data obtained on the operating object, the resulting formal (empirical) models are applicable only to the object on which the experiments were conducted. The structure of the functional dependencies of such models rests on considerations unrelated to the type of technological object, its design features, or the mechanisms of the processes occurring in it. Examples are the various RTO systems based on machine learning (ML) algorithms, which are now widely used (WO 2018/035718 A1, US 6246972 B1). The advantages of such models are that they take less time to develop and analyse, and that they effectively solve problems such as process control, state prediction, and optimization. However, such models can hardly be extrapolated: their use is limited to the range of values on which the model was trained. Training them usually requires a large amount of accumulated data; furthermore, when the configuration of the system changes, or when the operating parameters go beyond the limits used for training, the model must be retrained.
An analytical approach to model construction is also possible, in which the individual physicochemical processes that take place in a technological setup are investigated in laboratory setups or directly at technological facilities. The functional dependences of such models (referred to as deterministic models throughout the text) have, in contrast to empirical models, a distinct physical interpretation and can also be used in other systems (US 10046295 B2). To develop models of chemical processes, it is preferable to use deterministic models that contain extensive information on the properties of the processed substances and on the physical and chemical processes occurring in the system (rate constants, activation energy, heat and mass transfer coefficients, diffusion, etc.).
Mathematical models of chemical processes are a synthesis of theoretical and experimental data; that is why tasks such as the scaling of chemical processes are almost never accomplished without laboratory experiments. The experimental research that precedes the development of a mathematical model is laborious and is nearly always the most time-consuming procedure. Furthermore, deterministic (analytical) mathematical models are, as a rule, nonlinear, which makes it very difficult to determine model parameters from experimental data. Applying machine learning algorithms to describe processes inside deterministic models can significantly simplify the study of technological objects; in addition, because the functional dependencies of such models have a physical nature, the model can be extrapolated and applied beyond the data range on which it was trained.
As can be seen from the above, there is a need for methods of developing models of technological processes and for systems that make it possible to create and train such models. These models need to be suitable both for the management, forecasting and optimization of technological processes and for the development and scaling of processes in the chemical, bio- and food industries. Combining aspects of deterministic models and of models based on machine learning algorithms, organized in a proper hierarchical manner, in a single comprehensive mathematical model will solve the problem of obtaining reliable mathematical dependencies for these purposes.
Brief Description of the Invention
According to the first aspect, the present invention is a system for developing and training mathematical models of technological processes. Data obtained online from a laboratory setup is used for these purposes. The laboratory setup and the software and hardware complex, which implements algorithms for efficiently training a mathematical model and finding the values of unknown model parameters, are integral parts of such a system. The comprehensive mathematical model of the process is based on deterministic models involving machine learning algorithms. The trained model is applicable in calculations related to scaling, optimization, forecasting and process control.
According to another aspect of the invention, a method of using the system to develop comprehensive mathematical models of processes is disclosed, which minimizes the amount of experimental data needed to train the mathematical model of a technological process. These embodiments of the invention, as well as other aspects and embodiments thereof, are discussed in more detail in the following description, made with reference to the accompanying drawings.
Description of the Drawings
Figure 1 shows a block diagram of an embodiment of the present invention.
Figure 2 shows an algorithm for constructing and training a comprehensive process model.
Detailed Description of the Invention
Figure 1 illustrates a system 100 for training mathematical models of technological processes. The system consists of a laboratory chemical setup 102 and a software and hardware complex 110.
The laboratory setup 102 consists of process equipment 104, control and measuring devices 106, and a unit of analytical equipment 108.
The software and hardware complex 110 is linked to the laboratory setup 102 and includes a data processing module 112, a module 114 for estimating unknown parameters and variables in the process model, a training validation module 116, and a user interface 118 for interacting with the system. Moreover, the software and hardware complex contains a comprehensive mathematical model 120 of a chemical process, which is a combination of a deterministic mathematical model 122 and a model 124 based on machine learning algorithms.
Figure 2 illustrates an algorithm for constructing and training the comprehensive mathematical model 120 of a technological process. At the first stage, the technological object for which a model is to be constructed is studied (202): its design and the physicochemical processes occurring in it are examined. Based on the information received, a structural diagram of the object and the structure of the comprehensive model are developed (204). The number and type of deterministic equations (122) of the process under study are determined. The technological process variables and parameters to be handled by machine learning algorithms (124) are also defined.
In one embodiment of the present invention, a specialist builds the mathematical model 120 and inputs the corresponding data into the software and hardware complex 110 through the user interface 118. In the preferred embodiment of the present invention, the software and hardware complex 110 contains a pre-installed base of deterministic models, machine learning models, and their combinations into comprehensive models of technological processes, and allows the operator to configure the model structure, based on the data provided, using the interface 118.
In the following step, the data needed to train the model is collected (206). In the present embodiment of the invention, the data provider is the laboratory setup 102, on which an experiment is conducted and the values of all the necessary process parameters and variables are recorded. However, the use of data obtained online from an industrial setup, or of historical data, is not excluded.
Then the collected data is processed (208) to convert it into a form suitable for calculations with the comprehensive mathematical model. In addition, in step (208) a complementary data set can be generated from the already collected data; it plays a supporting role in the following steps (210) and (212).
Subsequently, the unknown model parameters are determined (210). The determination is performed using any suitable optimization algorithm that minimizes the deviations between the experimental data and the values calculated from the model. The unknown parameters to be determined can belong to the deterministic model (122), to the machine learning model (124), or to both. Step (210) can be called training of the mathematical model, since the parameter estimation algorithm is based on experimental data.
After the values of the unknown parameters have been estimated, model validation (212) is carried out, which consists in obtaining a quality metric of the trained model of the chemical process. If the metric value is satisfactory, the algorithm stops, and its result is the trained mathematical model (120). Otherwise, the algorithm returns to the data collection step (206).
Laboratory Setup
Technological processes are traditionally developed using a bottom-up approach, which begins with implementing a prototype of the industrial chemical setup on a laboratory scale. Experiments conducted in a laboratory setup make it possible to build the mathematical model 120 of the technological process and to determine its parameters for further scaling of the process up to the industrial level. In the chemical industry, the parameters of the kinetic, mass transfer and heat transfer models, as well as their suitable form, are subject to determination. For instance, the type of kinetic model depends on the nature of the chemicals, reaction conditions, catalyst type, technological parameters, etc.
Laboratory setups 102 are not unified; their configuration depends on the process being studied. Catalytic processes are studied in reactors, e.g., continuous reactors with a fixed catalyst bed, or batch or semi-batch reactors equipped with additional mixing devices. Mass transfer processes are carried out in column apparatuses, in particular absorption or adsorption columns and columns for distillation, extractive distillation, reactive distillation, etc. Heat exchange processes are carried out in heat exchangers of various types: pipe-in-pipe, shell-and-tube, etc. In the present embodiment of the invention, the laboratory setup is based on at least one of the devices listed above; in addition, such devices can be combined into a single system and operate in series, in parallel, or in closed or partial recycle mode.
Substance transport lines are an integral part of laboratory setups: a laboratory setup has at least one line supplying raw material to the apparatus and one line taking the product out of the apparatus. In the limiting case, such lines reduce to input and output points and can be represented by nozzles, fittings, hatches, etc.
Transport lines provide communication between the devices of the chemical setup and combine them into a single functioning system. Lines can be equipped with optional hardware such as taps, dampers, valves, flow regulators, flowmeters, dosing pumps, and pressure regulators. Such equipment is necessary to control and organize the flow of substances.
The laboratory setup is equipped with control and measuring devices 106 and analytical equipment 108, whose function is to measure process variables. These variables include process indicators such as temperature, pressure, flow rate, and fluid level in the apparatus, as well as physicochemical properties of substances such as viscosity, refractive index, medium composition, thermal conductivity, electrical conductivity, and various spectrometric characteristics. Control and measuring devices include thermal converters, thermistors, pressure sensors, pressure gauges, chromatographs, spectrometers, level gauges, etc., as well as the equipment installed on technological lines.
The wide range of available control and measuring devices 106 and analytical equipment 108, and their placement in the laboratory setup, allow data to be collected carefully during the experiment. Such data are digitized and subsequently form the basis for building a model of the chemical setup and for the calculations that determine the parameters of the developed model. Conducting experiments in a laboratory setup allows data to be obtained over a wider range of values than in an industrial setup.
Software and Hardware Complex
Software and hardware complex 110 is a computer or similar device capable of performing data operations and executing program code. The core of software and hardware complex is represented by a comprehensive mathematical model of chemical process 120.
Comprehensive Mathematical Model
Two types of models, deterministic and stochastic, are typically used to model real systems. A deterministic model is a description of the process built on the physical essence of the phenomenon under consideration. The Arrhenius equation, which describes the temperature dependence of the reaction rate constant, can serve as an example. Such models are considered to be based on the laws of nature. A stochastic model has a statistical nature: it is built from accumulated data on the phenomenon under consideration and is restricted with regard to extrapolation beyond the limits of the data sample on which it was constructed. Since the construction of such a model is not based on the laws of nature, it is considered to have a limited ability to generalize and predict beyond the experimentally obtained data. Examples are neural networks, logistic regression, etc.
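The Arrhenius example of a deterministic model can be written as a short function. This is a minimal sketch; the pre-exponential factor and activation energy used below are illustrative assumptions and are not values taken from this description.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)

def arrhenius(pre_exponential, activation_energy, temperature):
    """Rate constant k = A * exp(-Ea / (R * T)); Ea in J/mol, T in K."""
    return pre_exponential * math.exp(-activation_energy / (R * temperature))

# Assumed illustrative parameters: A = 1e7 1/s, Ea = 50 kJ/mol.
k_300 = arrhenius(1.0e7, 50_000.0, 300.0)
k_350 = arrhenius(1.0e7, 50_000.0, 350.0)
```

Because the functional form comes from physics, the same function can be evaluated at temperatures outside the range in which A and Ea were fitted, which is the extrapolation property attributed to deterministic models above.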
In the present invention, the system includes a comprehensive mathematical model, which is a combination of deterministic equations 122 and machine learning algorithms 124 such that the results calculated by the machine learning algorithms serve as input data for a deterministic process model. Conventionally, the structure of the comprehensive model is represented by the following equation 1:
(Equation 1 appears as image imgf000008_0001 in the original publication.)
In equation 1:
• f_complex is a process variable whose value is determined from the comprehensive model (1);
• Par and var are the parameters and variables of the technological process, which can have either scalar or vector representation;
• g (...) is the deterministic mathematical model of the process being studied;
• z1, z2 each represent a function or operator, so model (1) may include, among other things, differential equations and linear or nonlinear transformations of process variables. In the limiting case, z1 and z2 may perform no mathematical transformation at all;
• m (...) is a model based on a machine learning algorithm whose input data are process variables and/or parameters; m (...) also encapsulates the learning parameters of the algorithm, which are determined at the training stage of the mathematical model and are not shown explicitly in (1);
• the "& |" notation means "AND/OR".
In this case, the limitation imposed on the equation g (...) lies in the fact that it is built on the basis of physical essence of the process. In the context of hierarchy of the presented equations (1) and (2), it can be seen that m (...) only describes some aspect of the process, e.g., it can be kinetics of chemical reaction, catalyst activity at a given time, viscosity or thermal conductivity of a medium, or any other process variable.
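Equation 1 is available here only as an image reference, so the following is a sketch of one form that is consistent with the symbol definitions above. The specific nesting of z1, z2 and m (...) inside g (...) is an assumption inferred from those definitions and is not a transcription of the published formula.

```latex
f_{complex} = g\Bigl( z_1\bigl(Par \;\&|\; var\bigr),\; z_2\bigl(m(Par \;\&|\; var)\bigr) \Bigr)
```

In this reading, m (...) supplies one quantity (for example, catalyst activity), and g (...) combines it with directly measured variables and parameters.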
Machine learning algorithms can be represented by, but are not limited to, the following: linear regression; logistic regression; decision trees; SVM; naive Bayes; kNN; k-means; random forest; dimensionality reduction; the family of gradient boosting algorithms; neural networks for regression and/or classification.
The main advantage of machine learning algorithms is that describing a process or phenomenon with them does not require knowledge of its physical essence.
Furthermore, equations (1) can be combined into systems, which is necessary to describe several aspects of chemical processes simultaneously, e.g., kinetics, hydrodynamics, heat transfer, etc.
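The hierarchy described above, in which the result of m (...) is an argument of g (...), can be illustrated with a small sketch. All names (`m_activity`, `g_rate`, `f_complex`), the linear form of the machine learning part, and the numerical values are hypothetical choices made for illustration only.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)

def m_activity(time_on_stream, weights):
    """Hypothetical ML part m(...): a linear model of catalyst activity."""
    w0, w1 = weights
    return w0 + w1 * time_on_stream

def g_rate(concentration, temperature, activity, pre_exp, e_act):
    """Deterministic part g(...): first-order Arrhenius kinetics scaled by activity."""
    k = pre_exp * math.exp(-e_act / (R * temperature))
    return activity * k * concentration

def f_complex(concentration, temperature, time_on_stream, weights, pre_exp, e_act):
    """Comprehensive model: the output of m(...) is passed into g(...)."""
    activity = m_activity(time_on_stream, weights)
    return g_rate(concentration, temperature, activity, pre_exp, e_act)

# Assumed values: activity falls from 1.0 to 0.5 after 50 h on stream.
fresh = f_complex(2.0, 500.0, 0.0, (1.0, -0.01), 1.0e6, 60_000.0)
aged = f_complex(2.0, 500.0, 50.0, (1.0, -0.01), 1.0e6, 60_000.0)
```

Only the activity term is data-driven here; the temperature and concentration dependence stay in the physically motivated equation.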
Furthermore, equation 1 may belong to any of the following classes:
• linear equations;
• nonlinear equations;
• differential equations;
• delay differential equations;
• partial differential equations;
• systems of equations of the above classes.
Furthermore, if the chemical process is described in a non-stationary state, then the time variable t is added to equation (1):
(Equation 2 appears as image imgf000010_0001 in the original publication.)
Thus, equation (2) already includes the time dependences for variables and parameters of technological process.
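A non-stationary model of this kind can be sketched as a differential equation whose right-hand side contains a time-dependent machine learning term. The example below assumes a first-order reaction dC/dt = -a(t)·k·C, integrates it with the explicit Euler method, and uses a hypothetical `activity_model` callable in place of a trained algorithm; none of these choices come from the original text.

```python
def simulate(c0, rate_constant, activity_model, dt, n_steps):
    """Explicit Euler integration of dC/dt = -a(t) * k * C."""
    concentration = c0
    history = [c0]
    for step in range(n_steps):
        t = step * dt
        concentration += dt * (-activity_model(t) * rate_constant * concentration)
        history.append(concentration)
    return history

# Constant activity versus a linearly decaying (assumed) activity model.
constant = simulate(1.0, 0.1, lambda t: 1.0, 0.01, 1000)
decaying = simulate(1.0, 0.1, lambda t: max(0.0, 1.0 - 0.05 * t), 0.01, 1000)
```

With decaying activity less reactant is converted over the same period, so the final concentration is higher than in the constant-activity run.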
To describe the process, at least one equation is used to calculate at least one process variable.
In addition, a comprehensive mathematical model of the process can combine several equations of types (1) and/or (2), thus describing some of the processes in a stationary state and others in a dynamic one.
Equations (1) and (2), as well as their combinations, are compiled so that their number is sufficient for the resulting system to be solved. Solving the system means determining the values of the unknown process variables from the known variables and parameters.
In another embodiment, the number of equations (1) and (2) in the mathematical model can exceed the minimum required for the system to be solved. The additional equations serve to reduce the correlation between model parameters or between model variables, to establish bounds or restrictions on parameter values, etc., which is important when estimating the values of unknown parameters or process variables.
An important aspect of the comprehensive mathematical model is that it can be partially or fully used on various setups of a similar type for process control, optimization and forecasting. Partial use of the model means using individual equations of the form (1) or (2), or parts of these equations, such as the machine learning algorithms and the model parameters. Before use on another setup, these parameters and machine learning algorithms must be determined from experiments, if they have not yet been determined.
Data Processing
Before performing calculations in the software and hardware complex 110, it is often necessary to preprocess the data in order to improve the quality of the calculations. The control and measuring devices 106 and analytical equipment 108 of the laboratory setup are generally the main sources of the data on which the model is trained. Other ways of obtaining process data are also possible, e.g., the operator can record observations through the user interface 118, or historical data can be used. The data processing module 112 consists of software tools that perform the necessary conversions of information from the laboratory setup. Data processing can include, but is not limited to, the following mathematical transformations:
• Linear transformations;
• Other nonlinear algebraic transformations;
• Differentiation of data by any variable;
• Integrating data by any variable;
• Filtering;
• Interpolation;
• Smoothing;
• Normalization/Denormalization.
Such preliminary transformations are needed because control and measuring devices generally report process variables discretely, i.e. at certain time intervals, or because the information is raw, i.e. it requires conversion according to calibration curves, conversion of units, etc.
For instance, a linear transformation can be used to convert units of measure; in addition, calibration curves are also constructed linearly for the quantitative analysis of chemical components, e.g., in chromatography.
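A linear calibration of the kind mentioned for chromatography can be sketched as an ordinary least-squares line fitted to calibration standards. The data points and the helper names `fit_line` and `to_concentration` are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Assumed calibration standards: peak area measured at known concentrations.
peak_areas = [10.0, 20.0, 30.0, 40.0]
concentrations = [0.5, 1.0, 1.5, 2.0]  # mol/L
slope, intercept = fit_line(peak_areas, concentrations)

def to_concentration(peak_area):
    """Convert a raw detector reading into a concentration."""
    return slope * peak_area + intercept
```

The fitted line then converts each incoming raw reading before it is passed to the model.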
Nonlinear transformations are used, for example, for the quantitative determination of substance concentrations when the calibration curve is described by a polynomial dependence, or when the thermophysical properties of substances are described by polynomial dependencies.
Differentiation of data with respect to a variable is necessary to calculate the rate of change of a chemical process variable. For instance, differentiating the liquid level in the apparatus with respect to time gives the rate at which the fluid enters or leaves the apparatus at any given time. Differentiating the temperature in the apparatus with respect to time gives the heating rate.
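Differentiation of discretely sampled data can be sketched with finite differences. The function below uses central differences at interior points and one-sided differences at the ends; the level readings are assumed sample values.

```python
def differentiate(times, values):
    """Finite-difference derivative: central inside, one-sided at the ends."""
    n = len(values)
    rates = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        rates.append((values[hi] - values[lo]) / (times[hi] - times[lo]))
    return rates

times = [0.0, 10.0, 20.0, 30.0]    # s
levels = [1.00, 1.05, 1.10, 1.15]  # m, assumed level readings
level_rates = differentiate(times, levels)  # m/s
```

Finite differences amplify measurement noise, so in practice this step is often combined with the smoothing described further below.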
Integration is used, in particular but not exclusively, when converting spectral data from analytical instruments, e.g., when the substance concentration is calculated from peak areas in the infrared absorption spectrum. Integration is also used when the measured variable is the rate of change of a chemical process variable. For instance, flow meters show the flowrate of a substance per unit of time at a measurement point; integration then gives the entire volume of substance that has passed through the measurement point over a given period of time.
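The flow meter example can be sketched with the trapezoidal rule, which is one common choice for integrating sampled data; the readings below are assumed values.

```python
def integrate_trapezoid(times, rates):
    """Trapezoidal-rule integral of sampled rates over time."""
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (rates[i] + rates[i - 1]) * (times[i] - times[i - 1])
    return total

times = [0.0, 60.0, 120.0, 180.0]  # s
flow = [2.0, 2.0, 4.0, 4.0]        # L/s, assumed flow meter readings
volume = integrate_trapezoid(times, flow)  # L
```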
Data filtering means screening the total sample and eliminating part of the measurements. One example is the elimination of erroneous measurements, which are usually associated with incorrect sensor operation. Another example is the selection of a subset of measurements from a sample obtained during stationary operation of the chemical setup over a long period of time. Using the entire sample from a stationary operating mode may be impractical, since it increases the time spent on its processing by the software and hardware.
Interpolation allows obtaining intermediate values of process variables, which is important in cases where measurements are received discretely.
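A minimal sketch of obtaining an intermediate value by linear interpolation follows; other schemes (splines, for instance) are equally possible, and the sample readings are assumptions.

```python
def interpolate(times, values, t):
    """Linear interpolation of discretely sampled values at time t."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled interval")
    for i in range(1, len(times)):
        if t <= times[i]:
            frac = (t - times[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + frac * (values[i] - values[i - 1])

# Assumed temperature readings (K) taken every 10 s.
temperature_at_15 = interpolate([0.0, 10.0, 20.0], [300.0, 310.0, 330.0], 15.0)
```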
Smoothing can be applied when the measurement of a variable is accompanied by noise. Various algorithms can be used for this procedure, e.g., a moving average, high-pass or low-pass filters, the fast Fourier transform, etc. The algorithm is chosen according to the nature of the noise accompanying the measurement.
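The moving average mentioned above can be sketched as follows. The centered window that shrinks at the edges is one of several possible edge treatments and is an assumption of this example.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

noisy = [10.0, 12.0, 8.0, 11.0, 9.0]  # assumed noisy readings
smooth = moving_average(noisy, 3)
```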
The need to normalize data samples arises from the nature of the variables used in the models. Being different in physical meaning, they can often vary greatly in absolute value. Data normalization brings all the numerical values of the variables into the same range, making it possible to combine them in one model. This approach increases the stability of the solutions of mathematical models. Normalization can be performed either linearly or nonlinearly (using a sigmoidal function or similar).
To perform data normalization, one needs to know precisely the limits of change in the values of corresponding variables (the minimum and maximum theoretically possible values). Then they will correspond to the limits of the normalization interval. When it is impossible to precisely set the limits of variables change, they are set taking into account the minimum and maximum values in the available dataset.
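Linear normalization to the interval [0, 1] and its inverse can be sketched as below. The function names are hypothetical; `limits_from_data` corresponds to the fallback case in which the limits are taken from the available dataset.

```python
def normalize(value, low, high):
    """Map a value from [low, high] to [0, 1]."""
    return (value - low) / (high - low)

def denormalize(scaled, low, high):
    """Map a value from [0, 1] back to [low, high]."""
    return low + scaled * (high - low)

def limits_from_data(values):
    """Fallback when theoretical limits are unknown: use the dataset extremes."""
    return min(values), max(values)

# Assumed temperature limits of 300 K and 400 K.
scaled = normalize(350.0, 300.0, 400.0)
restored = denormalize(scaled, 300.0, 400.0)
```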
By combining the approaches to data processing, it is possible to supplement the incoming data from control and measuring devices of a setup with those that will be useful for calculations, as well as to transform them.
Training a Comprehensive Mathematical Model
Training a mathematical model is a procedure that finds all the unknown values of the parameters and/or variables of the chemical process contained in the model, such that subsequent model calculations adequately reproduce the training dataset. Well-established techniques for model training are known, such as using a validation dataset or following a cross-validation approach, and the present invention is not limited to any single known training technique.
Chemical process variables should be grouped so that they relate to a specific state of the system. A state of the system is its snapshot, reflecting the values of variables and parameters at a particular point in time. In stationary behaviour, the chemical process variables keep constant values at each moment of time (reasonable fluctuations of a probabilistic nature, related to measurement error, etc., are allowed). In dynamic behaviour, the system at each moment of time differs from the previous one in the values of its variables or parameters and, in general, changes over time.
The grouped variables and parameters of the chemical process corresponding to one state of the system are further divided into dependent and independent. Independent variables or parameters are used in a comprehensive mathematical model in order to calculate the values of dependent variables or parameters.
Conversely, in order to find the unknown parameters or variables in the comprehensive model, it is necessary that the dataset for each state of the system contains at least one experimental measurement of the dependent process variable.
The estimation of unknown quantities in a comprehensive mathematical model is based on minimizing an objective function, which can be built on standard deviations (4) or on absolute deviations (5) between the dependent variables determined experimentally (y in equations 4 and 5) and those calculated from the model (ŷ in equations 4 and 5). 7 is the number of states in equations 4 and 5.
Furthermore, the state of the system may include several dependent variables, the total number of which within each state is /. It can also vary between the states.
(Equations 4 and 5 appear as image imgf000014_0001 in the original publication.)
The calculation of the objective function is accompanied by the calculation of values of the dependent variables of chemical process. Such calculation can be carried out by various known methods and techniques that are relevant to a particular comprehensive mathematical model.
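Equations 4 and 5 are available here only as an image reference. A common form of such objective functions, consistent with the description of y and ŷ above, is sketched below. The symbols N (number of states) and M_i (number of dependent variables in state i) are placeholder notation chosen for this sketch, because the original index symbols cannot be read from the extracted text; the published equations may also include normalization or a square root.

```latex
F_{sq} = \sum_{i=1}^{N} \sum_{j=1}^{M_i} \left( y_{ij} - \hat{y}_{ij} \right)^{2}
\qquad
F_{abs} = \sum_{i=1}^{N} \sum_{j=1}^{M_i} \left| y_{ij} - \hat{y}_{ij} \right|
```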
The objective function is minimized using any known optimization algorithm, e.g., simulated annealing, differential evolution, gradient descent, stochastic gradient descent, and others. When determining the unknown parameters of a comprehensive mathematical model, it should be taken into account that some parameters belong to the deterministic equations and others to the machine learning algorithm.
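As an illustration of this step, the sketch below minimizes a sum of squared deviations by plain gradient descent with a forward-difference gradient. The two-parameter linear `model`, the learning rate, and the data are assumptions; any of the optimizers listed above could be substituted.

```python
def sse(params, model, states):
    """Sum of squared deviations between measured and calculated values."""
    return sum((y - model(x, params)) ** 2 for x, y in states)

def gradient_descent(objective, params, rate=0.01, steps=2000, eps=1e-6):
    """Gradient descent using a forward-difference estimate of the gradient."""
    params = list(params)
    for _ in range(steps):
        base = objective(params)
        grad = []
        for i in range(len(params)):
            shifted = params[:]
            shifted[i] += eps
            grad.append((objective(shifted) - base) / eps)
        params = [p - rate * g for p, g in zip(params, grad)]
    return params

def model(x, params):
    """Stand-in for the comprehensive model with two unknown parameters."""
    a, b = params
    return a * x + b

# Assumed states: (independent variable, measured dependent variable).
states = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
fitted = gradient_descent(lambda p: sse(p, model, states), [0.0, 0.0])
```

In this toy case both parameters are estimated simultaneously, which corresponds to the first embodiment described next.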
In one embodiment, such parameters are evaluated simultaneously with each other.
In another embodiment, instead of machine learning algorithm parameters, the values that machine learning algorithm must produce in order for the objective function to reach the minimum value are estimated. After that, the generated dataset on machine learning values is used to train directly only the machine learning algorithm.
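This two-stage embodiment can be sketched with a toy example. It assumes a deterministic equation rate = activity · K · concentration with a known constant K, so that the activity value minimizing the squared deviation in each state has a closed form; the linear model used as the machine learning part and all data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

K = 0.2  # known rate constant (assumed value)

# Each state: (time on stream, concentration, measured reaction rate).
states = [(0.0, 2.0, 0.40), (10.0, 2.0, 0.36), (20.0, 1.0, 0.16), (30.0, 1.0, 0.14)]

# Stage 1: for each state, estimate the value the ML part must return so that
# the deterministic equation reproduces the measured rate.
required_activity = [rate / (K * conc) for _, conc, rate in states]

# Stage 2: train only the ML part (here a linear model) on those values.
times = [t for t, _, _ in states]
slope, intercept = fit_line(times, required_activity)
```

Separating the stages keeps the second fit an ordinary supervised learning problem, independent of how the deterministic equations are solved.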
In another embodiment, the machine learning algorithm describes a dependent variable of chemical process that was measured and included in dataset based on experiments. Thus, only machine learning algorithm is trained to determine the values of its parameters, at which the comprehensive mathematical model adequately describes all the sets of experimentally determined states of chemical system.
In another embodiment of the invention, the objective function may contain weight factors for the most important process variables.
Independent variables, on the basis of which machine learning algorithm performs calculations, are selected according to general ideas on the process under consideration. For instance, if it is assumed that the rate of catalyst deactivation depends on the composition of raw materials, temperature and pressure, then it is logical to choose these variables as independent variables of machine learning algorithm, as lying in the essence of the physicochemical process described by the algorithm.
User Interface
The system of the present invention is a combination of technical and software and hardware elements. In this case, we understand chemical setup, including instrumentation and analytical equipment, as technical means. That is, these are the means that are necessary for the functioning of chemical setup within the system and the measurement of variables and parameters of chemical setup.
Software and hardware means are computer devices aimed at implementing the functions for calculating a comprehensive mathematical model and implementing related algorithms for data processing, training a mathematical model, validating a model, and performing related operations for receiving, storing, displaying and transmitting the data.
Technical and software and hardware parts have communication means sufficient for the exchange of information between each other. Thus, the values of chemical process variables can come online from the technical part of the system to software and hardware part in order to perform their immediate processing.
The user interacting with the system performs its configuration both in terms of hardware related to chemical setup, and in terms of software and hardware computing facilities, as well as receiving the results of its functionality from the system.
The system interaction interface allows the user to specify the type of comprehensive mathematical model and determine:
• type, kind and quantity of deterministic equations of the chemical process used in the comprehensive model;
• type, kind and quantity of machine learning algorithms implemented as a part of the mathematical model;
• variables and parameters in the mathematical model that are unknown and subject to evaluation at the training stage of the mathematical model;
• variables and parameters in the mathematical model whose values are taken from experimental data.
The user also selects the algorithms by which the system performs data processing, training of the comprehensive mathematical model, and validation of the mathematical model. Such algorithms can be ready-made and stored in the system's algorithm database.
The user can also implement their own algorithms or extend the existing ones implemented within the system by adjusting the program code responsible for the main functionality of the system.
The user configures the technical component of the system, namely the instrumentation devices, to the extent that is available by default and determined by the manufacturer of each relevant device.
Model Validation
Experimental study of technological processes on a laboratory scale sometimes requires fairly long continuous operation of a setup, since the ongoing processes must pass through an induction period and reach a steady state. The study is usually conducted by varying the operating conditions of the setup: composition of raw materials, pressure, temperature, media flowrates, etc. A large amount of data is required to build an accurate and adequate process model. On the other hand, reducing the experimental time and the amount of accumulated data is important, since experimentation is associated with large financial and time costs.
Validation consists in the following sequence of actions:
• Calculating process variable values using the comprehensive mathematical model from data on independent process variables that were not previously part of the training set. Such data comes online from a working laboratory setup;
• Comparing the process variables measured at the setup with the results calculated by the comprehensive mathematical model, and calculating the deviation by the RMSE method or similar to obtain a metric of the deviation of the experimental data from the calculated values at the validation point;
• Deciding whether to stop training the model, or to include the validation point in the training dataset, retrain the model on the newly formed dataset, and move to the next experimental point on the laboratory setup.
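The sequence above can be sketched as a loop over experimental points arriving from the setup. The one-parameter model y = a·x, the closed-form `train` function, the tolerance, and the data are stand-ins assumed for illustration; in the described system, training would be the parameter estimation of the comprehensive model.

```python
import math

def train(dataset):
    """Stand-in for training: least-squares slope of y = a * x."""
    return sum(x * y for x, y in dataset) / sum(x * x for x, _ in dataset)

def rmse(measured, calculated):
    """Root-mean-square error between measured and calculated values."""
    return math.sqrt(sum((m - c) ** 2 for m, c in zip(measured, calculated)) / len(measured))

def run_experiments(incoming_points, initial_dataset, tolerance):
    """Validate on each new point; retrain until the deviation is acceptable."""
    dataset = list(initial_dataset)
    a = train(dataset)
    for x, y in incoming_points:      # points arrive online from the setup
        error = rmse([y], [a * x])    # deviation at the validation point
        if error <= tolerance:
            return a, len(dataset)    # model accepted, experiments can stop
        dataset.append((x, y))        # otherwise extend the training set
        a = train(dataset)            # and retrain on the new dataset
    return a, len(dataset)

a, n_used = run_experiments([(2.0, 4.0), (3.0, 6.0), (4.0, 8.0)], [(1.0, 2.5)], 0.2)
```

In this run the third incoming point already validates the model, so it is never added to the training set and only three points are used for training.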
Furthermore, the system of the present invention can receive data online from laboratory setup that operates in a non-stationary transient mode. Based on the collected data, comprehensive mathematical model is trained describing the dynamic behaviour of laboratory setup or its part, depending on the magnitude and intensity of the disturbing signals.
In one embodiment of the invention, validation of the comprehensive model is carried out taking into account data from non-stationary transient modes, which increases the reliability of the estimates of the model parameters.
Preferred Embodiments of the Invention
1. A system for constructing and learning mathematical models, consisting of laboratory setup and software and hardware complex that implements the execution of computations of a comprehensive mathematical model, with the latter being a combination of deterministic models and machine learning algorithms. The results of calculations using mathematical models based on machine learning are used as arguments in calculations using deterministic mathematical models. Laboratory setup is the data provider for software and hardware complex, and the model is validated during the experiment. Thus, the claimed complex allows online creation of the trained models, minimizing time costs and involved data volume for training a model.
2. The system according to embodiment 1, with the help of which mathematical model was developed and trained, that is suitable for scaling and designing industrial setup, carrying out optimization calculations. 3. The system according to embodiment 1, with the help of which mathematical model was developed and trained, suitable for being used on at least one setup of a similar type, but of a different scale. The model provides real management functions, forecasting and optimization, and can be a part of such packages that provide model-based online optimization and process control.
4. The system according to embodiment 1, with the help of which mathematical model was developed and trained, which is fully or partially applied on at least one setup of a different type, provided that the processes occurring at setups are described by the same equations. The model provides the implementation of controlling functions, forecasting and optimization.
5. The system of embodiment 1 is used in conjunction with industrial setup. To extend the existing mathematical model of an industrial setup or its aspects, experiments are conducted using the system of embodiment 1 in an extended range of independent values of the variables of chemical process. This approach allows to estimate the impact of new raw material, atypical values of technological parameters of chemical setup, and obtain a validated and refined model of industrial setup or its aspect.
6. A software and hardware complex that performs the calculations of a comprehensive mathematical model based on historical data, wherein said model is a combination of deterministic models and machine learning algorithms. The results of calculations using mathematical models based on machine learning are used as arguments in calculations using deterministic mathematical models.
7. A method of developing comprehensive mathematical models, consisting of building a comprehensive mathematical model expressed by at least one deterministic mathematical model and at least one model based on a machine learning algorithm, describing at least a part of a technological setup, which allows the process variables to be calculated on the basis of independent process variables, dependent process variables, process parameters, or a combination thereof; in addition, the results of calculations using mathematical models based on machine learning are used as arguments in calculations using deterministic mathematical models.
8. The system according to embodiment 1, wherein the training of the mathematical model is carried out using data from transient, non-stationary modes, which makes it possible to obtain comprehensive mathematical models based on deterministic models and machine learning algorithms that are mainly aimed at describing the dynamic behaviour of the system and at implementing model-based control.
9. The system according to embodiment 1, using which a mathematical model of the process or its aspect was developed and trained, which can then be used to carry out calculations for the optimization and control of a similar industrial setup, wherein the location of instrumentation and control devices in the industrial setup may not coincide with that in the laboratory setup of the system.
10. The system according to embodiment 1, using which a comprehensive mathematical model was developed and trained that can further be used as a part of a virtual sensor to determine process variables that cannot be directly measured.
11. Combinations of embodiments 2-10.

Claims

WHAT IS CLAIMED:
1. A system for building comprehensive mathematical models of technological processes and training these models, including:
a. at least one laboratory setup equipped with sensors, wherein said sensors are for measuring dependent and independent process variables or a combination thereof;
b. software and technical computer means that interact with each other and with the laboratory setup, configured to execute program code that implements the functions:
i. performing computations of comprehensive mathematical models, which are a combination of at least one deterministic mathematical model and at least one model based on a machine learning algorithm, said comprehensive models being configured to simulate at least a part of the setup and calculate at least a portion of the process variables based on values of independent process variables, dependent process variables, process parameters or a combination thereof; in addition, the results of calculating the mathematical models based on machine learning algorithms are used as arguments of the deterministic mathematical models;
ii. training a comprehensive mathematical model using software and hardware complex, the latter configured to estimate the values of at least a portion of parameters, dependent variables, independent variables, or a combination thereof based on data from laboratory setup;
iii. data processing, implemented as a data processing module, configured to convert, store and exchange information inside and outside the system, for interaction between laboratory setup and software and hardware complex;
iv. user interaction with the system for configuring individual system modules for input/output in digital and/or graphical form of information from the system;
v. validating and adjusting the model.
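The coupling described in function (i), where the output of a machine-learning model is passed as an argument to a deterministic model, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the claims: the machine-learning part is a plain least-squares fit of ln(k) against 1/T, the deterministic part is the material balance of a first-order reaction in an ideal plug-flow reactor, and the names `make_rate_model`, `outlet_concentration` and `comprehensive_model` are hypothetical.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the ML algorithm)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def make_rate_model(temps_kelvin, rate_constants):
    """Learn ln(k) as a linear function of 1/T from measured (T, k) pairs."""
    a, b = fit_linear([1.0 / t for t in temps_kelvin],
                      [math.log(k) for k in rate_constants])
    return lambda temperature: math.exp(a + b / temperature)

def outlet_concentration(c_in, residence_time, k):
    """Deterministic part: first-order reaction in an ideal plug-flow reactor."""
    return c_in * math.exp(-k * residence_time)

def comprehensive_model(rate_model, c_in, residence_time, temperature):
    k = rate_model(temperature)                            # ML result ...
    return outlet_concentration(c_in, residence_time, k)   # ... used as an argument

# Synthetic "measurements" generated from an assumed Arrhenius law.
temps = [320.0, 340.0, 360.0, 380.0]
true_k = lambda T: 1.0e6 * math.exp(-6000.0 / T)
rate_model = make_rate_model(temps, [true_k(T) for T in temps])
c_out = comprehensive_model(rate_model, c_in=1.0, residence_time=10.0, temperature=350.0)
```

In a real system the regression would be replaced by whichever algorithm is chosen and the balance equation by the relevant deterministic model; only the direction of the data flow is the point of the sketch.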
2. The system according to claim 1, wherein laboratory setup is at least one apparatus or their combination, including:
a. fixed bed reactor;
b. reactor with mixing device;
c. reactor with suspended catalyst bed;
d. distillation column;
e. extraction column;
f. absorption column;
g. adsorption column;
h. reactive distillation column;
i. heat exchanger.
3. The system according to claim 1, wherein the dependent and independent variables of the process are:
a. technological parameters of the process, at least including temperature of the medium, pressure in the apparatus, flow rate of the medium, and volume of fluid in the apparatus;
b. physical and chemical properties of the medium, at least including viscosity, acidity, and density;
c. spectral characteristics of the medium;
d. any other characteristics to be determined using instrumentation and/or sensors of physical and chemical properties;
e. values obtained as a result of calculations based on measurements from instrumentation and/or sensors of physical and chemical properties.
4. The system according to claim 1, wherein deterministic mathematical models are based on the physical essence of the phenomenon to be described by said models; and said models include the description at least of:
a. Kinetics of physical and chemical processes;
b. Thermodynamics of physical and chemical processes;
c. Hydrodynamics;
d. Heat transfer;
e. Mass transfer;
f. Material balance;
g. Energy balance; or combination thereof.
5. The system according to claim 1, wherein mathematical models based on machine learning algorithms are used to calculate the process variables, parameters or a combination thereof, which are not measured directly, and the said machine learning algorithms are represented by the following methods:
a. Linear regression;
b. Logistic regression;
c. Decision Tree;
d. SVM;
e. Naive Bayes;
f. kNN;
g. K-Means;
h. Random Forest;
i. Dimensionality reduction;
j. Family of Gradient boosting algorithms;
k. Neural networks for regression and/or classification purposes.
6. The system according to claim 1, wherein software and hardware complex is configured to train the comprehensive model based on the available historical data from the setup.
7. The system according to claim 1, wherein software and hardware complex is configured to train a comprehensive model based on online data from the operating setup.
8. The system according to claim 1, wherein the data processing module is configured to convert information into a form suitable for subsequent mathematical processing within software and hardware complex, including at least the following approaches or a combination thereof:
a. Filtration, which consists in identifying atypical or erroneous values of variables coming from the sensors of the setup;
b. Interpolation, which consists in identifying the intermediate value of the process variable, based on previous and subsequent measurements of said variable;
c. Data smoothing based on algorithms that eliminate noise from a series of consecutive measurements with a given smoothing intensity;
d. Normalization and denormalization of the values of the process variable, which consist, respectively, in bringing said values into the range from 0 to 1 inclusive and in the backward transformation;
e. Transformation of spectral data into numerical values using baseline correction, numerical integration for calculating peak areas, finding peak heights, mutual subtraction of spectra, and a combination thereof;
f. Linear transformation of the measured data;
g. Numerical differentiation of dataset by any of the variables.
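Several of the approaches listed in claim 8 can be sketched in a few lines. The functions below are illustrative assumptions rather than the claimed implementation: filtration is shown as a simple physical-range check, smoothing as a centred moving average, and peak-area integration as the trapezoidal rule; all function names are hypothetical.

```python
def filter_out_of_range(values, low, high):
    """Filtration: mark readings outside [low, high] as missing (None)."""
    return [v if v is not None and low <= v <= high else None for v in values]

def interpolate_gaps(values):
    """Interpolation: fill None entries linearly between known neighbours.
    Leading or trailing gaps have no two neighbours and are left as None."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            out[i] = out[left] + frac * (out[right] - out[left])
    return out

def moving_average(values, window=3):
    """Smoothing: centred moving average; the window shrinks at the edges."""
    half = window // 2
    return [sum(values[max(0, i - half): i + half + 1]) /
            len(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def normalize(values):
    """Min-max scaling into [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)

def denormalize(scaled, bounds):
    """Backward transformation of normalize()."""
    lo, hi = bounds
    return [lo + s * (hi - lo) for s in scaled]

def differentiate(xs, ys):
    """Numerical differentiation by forward differences."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

def peak_area(xs, ys):
    """Peak area of a baseline-corrected spectrum by the trapezoidal rule."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

raw = [20.0, 21.0, 999.0, 23.0, 24.0]   # 999.0 mimics a sensor glitch
clean = interpolate_gaps(filter_out_of_range(raw, 0.0, 100.0))
scaled, bounds = normalize(clean)
```

The order used in the example (filter, then interpolate, then normalize) is one plausible chaining; the claim itself leaves the combination open.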
9. The system according to claim 1, wherein the trained comprehensive mathematical model of the technological setup or its aspects is adapted from a similar system, wherein in these systems:
a. the comprehensive mathematical models describe a similar physicochemical process;
b. the type and structure of the machine learning algorithm used are the same.
10. The system according to claim 9, wherein the additional training of the trained comprehensive model or its part is performed.
11. The system according to claim 1, wherein the creation of comprehensive mathematical models and training of said models are carried out in accordance with the following sequence of actions:
a. building the comprehensive mathematical model consisting of at least one deterministic mathematical model and at least one model based on machine learning algorithms, describing at least a part of the technological setup, which allows at least a portion of the process variables to be calculated based on values of independent process variables, dependent process variables, process parameters or a combination thereof; in addition, the results of calculating the mathematical models based on machine learning algorithms are used as arguments of the deterministic mathematical models;
b. conducting an experiment with laboratory setup, equipped with sensors to measure the values of process variables;
c. transforming at least a part of the data obtained as a result of the experiment, said transformation being performed by the processing module, into a form suitable for subsequent use in the software and hardware complex;
d. using the software and hardware complex for training a comprehensive model, which consists in estimating the values of at least a portion of the parameters, dependent variables, independent process variables, or a combination thereof that make up the comprehensive mathematical model, based on the values of the process variables from the setup;
e. validating and adjusting the model online based on the incoming data to the software and hardware complex.
12. The system according to claim 1, wherein the model training, consisting in estimating the values of at least one parameter of the comprehensive mathematical model, is performed by minimizing an objective function, representing the metric of the deviation of experimental data from the data calculated by the model, using any local or global optimization algorithm.
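Claim 12 can be made concrete with a small example of parameter estimation by objective-function minimization. The sketch below rests on assumptions not stated in the claim: the model is a single-parameter first-order decay, the deviation metric is the sum of squared residuals, and the optimizer is a golden-section search (a local method that requires the objective to be unimodal on the search interval). The claim allows any local or global algorithm, so this is only one possible choice.

```python
import math

def model(c0, k, t):
    """Model prediction: concentration after time t for rate constant k."""
    return c0 * math.exp(-k * t)

def objective(k, c0, times, measured):
    """Deviation metric: sum of squared residuals between data and model."""
    return sum((model(c0, k, t) - c) ** 2 for t, c in zip(times, measured))

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

times = [0.5, 1.0, 2.0, 4.0]
measured = [model(1.0, 0.7, t) for t in times]   # synthetic, noise-free data
k_fit = golden_section_minimize(
    lambda k: objective(k, 1.0, times, measured), 0.0, 5.0)
```

With several parameters or a multimodal objective, a multivariate or global optimizer would be needed in place of the one-dimensional search.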
13. The system according to claim 1, wherein the validation and adjustment of the model is implemented as follows:
a. obtaining the values of process variables that were not previously a part of the training set;
b. calculating the process variable values using the comprehensive mathematical model, based on the values obtained in step (a);
c. comparing the process variables from step (a) with the results of their determination based on the comprehensive mathematical model performed in step (b), and calculating the deviation metric between these values;
d. including the variable values from step (a) in the training dataset and performing additional model training on the newly formed training dataset, with the subsequent decision to stop or continue the model training.
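Steps (a) to (d) of claim 13 can be sketched as one pass of a loop. Several choices below are assumptions for illustration only: the model is a proportional relation y = a*x trained by least squares, the deviation metric is the root-mean-square error, and the stop-or-continue decision compares that metric with a user-chosen tolerance (the claim does not specify the criterion). All names are hypothetical.

```python
import math

def fit_proportional(dataset):
    """Least-squares slope of y = a*x; a stand-in for the real training step."""
    return sum(x * y for x, y in dataset) / sum(x * x for x, _ in dataset)

def predict(a, x):
    return a * x

def rmse(a, dataset):
    """Deviation metric between measured and model-calculated values."""
    return math.sqrt(sum((predict(a, x) - y) ** 2 for x, y in dataset)
                     / len(dataset))

def validate_and_adjust(a, training_set, new_data, tolerance):
    """One validation/adjustment pass.

    new_data holds (x, y) pairs not previously in the training set (step a).
    """
    deviation = rmse(a, new_data)            # steps (b) and (c)
    extended = training_set + new_data       # step (d): enlarge the training set
    a_new = fit_proportional(extended)       # step (d): additional training
    continue_training = deviation > tolerance
    return a_new, extended, deviation, continue_training

training = [(1.0, 2.0), (2.0, 4.0)]
a0 = fit_proportional(training)
a1, training1, dev, go_on = validate_and_adjust(
    a0, training, [(3.0, 9.0)], tolerance=0.5)
```

In an online setting this function would be called each time a new batch of measurements arrives, until `continue_training` becomes false.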
14. The system according to claim 1, comprising the comprehensive mathematical model trained on a similar technological setup or on a part thereof, said system being different in scale, in the values of independent variables, or in a combination thereof.
PCT/US2019/042555 2019-07-19 2019-07-19 A system and method for constructing mathematical models of technological processes and training these models WO2021015709A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2019/042555 WO2021015709A1 (en) 2019-07-19 2019-07-19 A system and method for constructing mathematical models of technological processes and training these models

Publications (1)

Publication Number Publication Date
WO2021015709A1 (en) 2021-01-28

Family

ID=74194293

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/042555 WO2021015709A1 (en) 2019-07-19 2019-07-19 A system and method for constructing mathematical models of technological processes and training these models

Country Status (1)

Country Link
WO (1) WO2021015709A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090138415A1 (en) * 2007-11-02 2009-05-28 James Justin Lancaster Automated research systems and methods for researching systems
US20110011595A1 (en) * 2008-05-13 2011-01-20 Hao Huang Modeling of Hydrocarbon Reservoirs Using Design of Experiments Methods
US20120323343A1 (en) * 2011-06-15 2012-12-20 Caterpillar Inc. Virtual sensor system and method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116907764A (en) * 2023-09-14 2023-10-20 国能龙源环保有限公司 Method, device, equipment and storage medium for detecting air tightness of desulfurization equipment
CN116907764B (en) * 2023-09-14 2023-12-26 国能龙源环保有限公司 Method, device, equipment and storage medium for detecting air tightness of desulfurization equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19938973

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 030522)

122 Ep: pct application non-entry in european phase

Ref document number: 19938973

Country of ref document: EP

Kind code of ref document: A1