CN117797489A

CN117797489A - Reboiler switching control method and device, electronic equipment and storage medium

Info

Publication number: CN117797489A
Application number: CN202311853059.0A
Authority: CN
Inventors: 田睿杰; 陈雄辉; 俞扬
Original assignee: Nanqi Xiance Nanjing High Tech Co ltd
Current assignee: Nanqi Xiance Nanjing High Tech Co ltd
Priority date: 2023-12-28
Filing date: 2023-12-28
Publication date: 2024-04-02

Abstract

The invention discloses a reboiler switching control method, a device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring operation process data corresponding to the refining tower system to be controlled in the operation process of the refining tower system to be controlled; processing operation process data based on a pre-trained reboiler opening prediction model to obtain a first opening variation corresponding to the hot water reboiler at the current moment and a second opening variation corresponding to the steam reboiler at the current moment; and processing the first opening degree variation and the second opening degree variation based on a preset opening degree value processing mode. According to the technical scheme, the effect of accurately controlling the opening changes of the hot water reboiler and the steam reboiler in the operation process of the refining tower system is achieved, and the effect of reducing the labor cost, guaranteeing the stability of the refining tower system and improving the switching control efficiency between the hot water reboiler and the steam reboiler is achieved.

Description

Reboiler switching control method and device, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of chemical equipment, in particular to a reboiler switching control method and device, electronic equipment and a storage medium.

Background

The refining tower is mainly used for refining high-quality products such as fuel oil and lubricating oil from crude oil, and the operating principle of the refining tower is mainly to separate the products by utilizing the boiling point difference of different components in the crude oil. In order to ensure that the quality of the separated product meets the requirement, the hot water reboiler can be used on the basis of stable operation of the refining tower system, and the steam reboiler is gradually closed in the process of using the hot water reboiler so as to reduce the steam consumption.

In the related art, a feed forward technique is generally used to control the hot water reboiler and the steam reboiler. The feed forward technique is to control the reboiler opening by a PID controller.

However, feed forward techniques typically require the creation of complex mathematical models, which not only incurs high costs but also limits their flexibility and portability between different scenarios. If the dynamic model of the system changes, previously tuned PID parameters can no longer be used. In addition, most PID parameters are tuned manually or by an expert system, both of which depend on a large amount of manual parameter selection and domain knowledge, so that the usability, flexibility and universality of the PID controller are greatly reduced.

Disclosure of Invention

The invention provides a reboiler switching control method, a device, electronic equipment and a storage medium, which are used for realizing the effect of accurately controlling the opening changes of a hot water reboiler and a steam reboiler in the operation process of a refining tower system, and achieving the effects of reducing the labor cost, ensuring the stability of the refining tower system and improving the switching control efficiency between the hot water reboiler and the steam reboiler.

According to an aspect of the present invention, there is provided a reboiler switching control method comprising:

acquiring operation process data corresponding to a refining tower system to be controlled in the operation process of the refining tower system to be controlled, wherein the operation process data comprises a tower kettle temperature value sequence corresponding to the current moment and at least one historical moment before the current moment, a purity value sequence of each separation object, a first opening value sequence of a hot water reboiler, a second opening value sequence of a steam reboiler and a liquid flow value sequence of each target pipeline;

processing the operation process data based on a pre-trained reboiler opening prediction model to obtain a first opening variation corresponding to the hot water reboiler at the current moment and a second opening variation corresponding to the steam reboiler at the current moment; wherein the reboiler opening degree prediction model is trained based on a reinforcement learning algorithm;

and processing the first opening variation and the second opening variation based on a preset opening value processing mode.

According to another aspect of the present invention, there is provided a reboiler switching control apparatus comprising:

a process data acquisition module, configured to acquire operation process data corresponding to a refining tower system to be controlled during operation of the refining tower system to be controlled, wherein the operation process data includes a tower kettle temperature value sequence corresponding to the current moment and at least one historical moment before the current moment, a purity value sequence of each separation object, a first opening value sequence of a hot water reboiler, a second opening value sequence of a steam reboiler, and a liquid flow value sequence of each target pipeline;

an opening variation determining module, configured to process the operation process data based on a pre-trained reboiler opening prediction model to obtain a first opening variation corresponding to the hot water reboiler at the current moment and a second opening variation corresponding to the steam reboiler at the current moment; wherein the reboiler opening prediction model is trained based on a reinforcement learning algorithm;

an opening variation processing module, configured to process the first opening variation and the second opening variation based on a preset opening value processing mode.

According to another aspect of the present invention, there is provided an electronic apparatus including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the reboiler switching control method of any one of the embodiments of the present invention.

According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the reboiler switching control method according to any one of the embodiments of the present invention when executed.

According to the technical scheme, the operation process data corresponding to the refining tower system to be controlled is acquired during operation of the refining tower system to be controlled. The operation process data is then processed based on the pre-trained reboiler opening prediction model to obtain the first opening variation corresponding to the hot water reboiler at the current moment and the second opening variation corresponding to the steam reboiler at the current moment. Finally, the first opening variation and the second opening variation are processed based on the preset opening value processing mode. This solves the problems in the related art that the reboiler opening cannot be predicted accurately and that the control process is complex. It achieves accurate control of the opening changes of the hot water reboiler and the steam reboiler during operation of the refining tower system, reduces labor cost, guarantees the stability of the refining tower system, and improves the switching control efficiency between the hot water reboiler and the steam reboiler.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a reboiler switching control method according to a first embodiment of the present invention;

FIG. 2 is a flow chart of a reboiler switching control method according to a second embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a reboiler switching control device according to a third embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an electronic device for implementing the reboiler switching control method according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

Fig. 1 is a flowchart of a reboiler switching control method according to an embodiment of the present invention, where the method may be performed by a reboiler switching control device, and the reboiler switching control device may be implemented in hardware and/or software, and the reboiler switching control device may be configured in a terminal and/or a server. As shown in fig. 1, the method includes:

S110, acquiring operation process data corresponding to the refining tower system to be controlled in the operation process of the refining tower system to be controlled.

It should be noted that the technical scheme of the embodiment of the invention can be applied to the scene of switching control of the reboiler of the refining tower system. When the refining tower system stably operates, the hot water reboiler is in a completely closed state, and the steam reboiler is in a completely opened state. In order to improve the product separation efficiency of the refining tower system and reduce the production cost of a factory on the premise of ensuring the continuous and stable operation of the refining tower system, the hot water reboiler can be put into use when the refining tower system is in stable operation, namely, the hot water reboiler is gradually opened, and meanwhile, the steam reboiler is gradually closed until the hot water reboiler is completely opened and the steam reboiler is completely closed. Therefore, the stable operation state of the refining tower system can be ensured, and the separated product meets the quality requirement.

In this embodiment, the refining tower system to be controlled may be understood as the refining tower system whose reboilers are to be switched. The refining tower system may be a system for separating and refining gasoline and hydrocarbon compounds. The refining tower system may include a refining tower, at least one reboiler, at least one pipeline, and other equipment. The refining tower is usually a vertical columnar vessel filled with packing, with gaps between the packing elements. Through absorption, adsorption and similar processes on the packing, the tower separates the different components of the mixture. The reboiler is a device for vaporizing liquid again. Material heated in the reboiler expands, and may even vaporize, so that its density decreases; it then leaves the vaporization space and returns smoothly to the tower, where the gas phase rises through the trays and the liquid phase falls to the bottom of the tower. The pipelines in the refining tower system may be pipelines connected to the refining tower, which may include pipelines that can both withdraw liquid from the tower and return liquid to the tower, pipelines that can only withdraw liquid from the tower, and so on. The operation process data is understood to be data generated during operation of the refining tower system to be controlled. The operation process data may include operation data corresponding to each device in the refining tower system to be controlled. The operation process data may be sequence data with each data acquisition moment as a node. The operation process data comprises a tower kettle temperature value sequence corresponding to the current moment and at least one historical moment before the current moment, a purity value sequence of each separation object, a first opening value sequence of a hot water reboiler, a second opening value sequence of a steam reboiler and a liquid flow value sequence of each target pipeline.

In this embodiment, the whole operation process of the refining tower system to be controlled may be divided according to a preset data acquisition frequency to obtain a plurality of data acquisition moments, each of which serves as one moment. When a given moment becomes the current moment, the operation process data can be acquired to obtain the operation process data values corresponding to the current moment. The tower kettle temperature value sequence may include the tower kettle temperature value corresponding to each moment. The tower kettle temperature is understood to be the temperature at the bottom (kettle) of the refining tower. A separation object is understood to be an object that is separated from the mixed liquid. For example, assuming that the mixed liquid to be separated is crude oil, the separation object may be a product such as fuel oil, solvent oil, lubricating oil, grease, paraffin, asphalt or liquefied gas. The purity value sequence may include the purity value corresponding to each moment. The purity value may be the purity of the corresponding separation object. The hot water reboiler may be a reboiler provided in the refining tower system to be controlled. The first opening value sequence may include the first opening value corresponding to each moment. The first opening value is the opening angle value of the hot water reboiler valve. The steam reboiler may be a reboiler provided in the refining tower system to be controlled. The second opening value sequence may include the second opening value corresponding to each moment. The second opening value is the opening angle value of the steam reboiler valve. The target pipeline may be a pipeline associated with the reboiler switching control process in the refining tower system to be controlled, and may be at least some of the pipelines provided in that system. The liquid flow value sequence may include the liquid flow value corresponding to each moment.

In practical application, the data analysis can be performed on the operation process of the refining tower system to be controlled, so that the opening degrees of the hot water reboiler and the steam reboiler arranged in the refining tower system to be controlled are controlled based on the data analysis result. Therefore, in the operation process of the refining tower system to be controlled, the operation data at each moment can be collected, and further, the operation process data corresponding to the refining tower system to be controlled can be obtained.
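As an illustration of how the operation process data of S110 might be held in memory, the following is a minimal sketch, not the implementation of the invention. The names `ProcessSample` and `ProcessWindow`, the field names, and the fixed history length are assumptions introduced only for this example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass(frozen=True)
class ProcessSample:
    """Values acquired at one data acquisition moment (field names are hypothetical)."""
    kettle_temp: float            # tower kettle temperature value
    purities: Dict[str, float]    # purity value per separation object
    hot_water_opening: float      # first opening value (hot water reboiler)
    steam_opening: float          # second opening value (steam reboiler)
    flows: Dict[str, float]       # liquid flow value per target pipeline


class ProcessWindow:
    """Holds the current moment plus a bounded number of historical moments."""

    def __init__(self, history_len: int) -> None:
        if history_len < 1:
            raise ValueError("at least one historical moment is required")
        # +1 because the window also contains the current moment.
        self._buf: Deque[ProcessSample] = deque(maxlen=history_len + 1)

    def push(self, sample: ProcessSample) -> None:
        """Append the newest sample; the oldest one is dropped when full."""
        self._buf.append(sample)

    def ready(self) -> bool:
        """True once the current moment and all required history are present."""
        return len(self._buf) == self._buf.maxlen

    def sequences(self) -> Dict[str, List[float]]:
        """Return one value sequence per quantity, oldest moment first."""
        samples = list(self._buf)
        if not samples:
            return {}
        seqs: Dict[str, List[float]] = {
            "kettle_temp": [s.kettle_temp for s in samples],
            "hot_water_opening": [s.hot_water_opening for s in samples],
            "steam_opening": [s.steam_opening for s in samples],
        }
        for name in samples[0].purities:
            seqs[f"purity/{name}"] = [s.purities[name] for s in samples]
        for name in samples[0].flows:
            seqs[f"flow/{name}"] = [s.flows[name] for s in samples]
        return seqs
```

In such a sketch, a new `ProcessSample` would be pushed at each data acquisition moment, and the sequences would be passed to the prediction model once `ready()` returns true.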

And S120, processing the operation process data based on a pre-trained reboiler opening prediction model to obtain a first opening variation corresponding to the hot water reboiler at the current moment and a second opening variation corresponding to the steam reboiler at the current moment.

The reboiler opening prediction model is trained based on a reinforcement learning algorithm.

In this embodiment, the reboiler opening prediction model may be understood as a policy network determined based on a reinforcement learning algorithm. Those skilled in the art will appreciate that the policy network may be a deep neural network model deploying a policy function; its input may be the state of the environment in which the agent is located, and its output may be a decision action determined according to the input state. The reboiler opening prediction model may be a neural network model of any structure. Optionally, the model structure of the reboiler opening prediction model may include, but is not limited to, a gated recurrent unit (Gated Recurrent Unit, GRU) network and a long short-term memory (Long Short-Term Memory, LSTM) network. The first opening variation may be understood as an increase or a decrease of the first opening value, i.e., the amount by which the opening is increased or decreased relative to the first opening value corresponding to the current moment. Similarly, the second opening variation may be understood as an increase or a decrease of the second opening value, i.e., the amount by which the opening is increased or decreased relative to the second opening value corresponding to the current moment.

In practical application, in order to predict valve opening values of the hot water reboiler and the steam reboiler, after the operation process data corresponding to the refining tower system to be controlled is obtained, the obtained operation process data can be input into the reboiler opening prediction model. Furthermore, the operation process data can be processed based on the reboiler opening prediction model, and a first opening variation corresponding to the hot water reboiler at the current moment and a second opening variation corresponding to the steam reboiler at the current moment are obtained. The first opening variation and the second opening variation may be signed quantities having both a direction and a magnitude, and the direction may be expressed by "+" or "-".

During reboiler switching, the direction in which each reboiler parameter changes must be constrained to be monotonic: the valve opening of the hot water reboiler only increases and never decreases, and the valve opening of the steam reboiler only decreases and never increases. That is, the first opening variation corresponding to the hot water reboiler and the second opening variation corresponding to the steam reboiler change in opposite directions. For the hot water reboiler, the first opening value increases, and the first opening variation is an increase; for the steam reboiler, the second opening value decreases, and the second opening variation is a decrease.
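The text does not state how the monotonic constraint is enforced. One simple possibility, shown here purely as an assumption, is to clip the raw model outputs so that the hot water variation is never negative and the steam variation is never positive. The function name `enforce_monotonic` is hypothetical.

```python
from typing import Tuple


def enforce_monotonic(delta_hot_water: float, delta_steam: float) -> Tuple[float, float]:
    """Clip raw opening variations onto the allowed directions.

    Assumption: clipping is used; other mechanisms (e.g. constraining the
    output layer of the model) would satisfy the same requirement.
    """
    # Hot water reboiler opening may only increase.
    clipped_hot_water = max(delta_hot_water, 0.0)
    # Steam reboiler opening may only decrease.
    clipped_steam = min(delta_steam, 0.0)
    return clipped_hot_water, clipped_steam
```

Variations that already point in the allowed directions pass through unchanged; variations in the wrong direction are replaced by zero, which leaves the corresponding valve where it is for that moment.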

S130, processing the first opening degree variation and the second opening degree variation based on a preset opening degree value processing mode.

In this embodiment, the preset opening value processing manner may be a manner that is preset and processes the opening variation determined based on the model. The preset opening value processing mode can be any processing mode, and can be indirectly controlled, visually displayed, directly controlled or the like.

In practical application, after the first opening degree variation corresponding to the hot water reboiler at the current moment and the second opening degree variation corresponding to the steam reboiler at the current moment are obtained, the first opening degree variation and the second opening degree variation can be processed according to a preset opening degree value processing mode.

In this embodiment, the preset opening value processing modes may include a plurality of modes, and the corresponding opening variation processing procedures are different for different processing modes, and these opening value processing modes will be described below.

Optionally, the preset opening value processing mode includes indirect control and visual display, and the processing of the first opening variation and the second opening variation based on the preset opening value processing mode includes: visually displaying the first opening variation and the second opening variation on the display interface of the target terminal.

In this embodiment, the target terminal may be understood as a terminal device for processing operation process data, may be understood as a device for deploying a reboiler opening prediction model, and may be understood as a terminal device to which a target user belongs, which is not specifically limited in this embodiment. Wherein the target user may be a technician for detecting the refining column system to be controlled. Alternatively, the target terminal may be a mobile terminal or a PC terminal, etc.

In practical application, after the first opening degree variation corresponding to the hot water reboiler at the current moment and the second opening degree variation corresponding to the steam reboiler at the current moment are obtained, the first opening degree variation and the second opening degree variation can be visually displayed based on the display interface of the target terminal.

Optionally, the preset opening value processing mode includes direct control, and the processing of the first opening variation and the second opening variation based on the preset opening value processing mode includes: controlling the hot water reboiler to adjust the opening based on the first opening variation, and obtaining a first opening value corresponding to the hot water reboiler at the next moment; and controlling the steam reboiler to adjust the opening based on the second opening variation, and obtaining a second opening value corresponding to the steam reboiler at the next moment.

In this embodiment, direct control is understood to mean direct regulation of the reboiler valve opening, i.e. regulation of the reboiler valve opening without human intervention.

In practical application, after the first opening variation corresponding to the hot water reboiler at the current moment is obtained, the hot water reboiler can be controlled to perform opening adjustment based on the first opening variation. Further, the valve opening of the hot water reboiler is adjusted according to the change direction and the change amount in the first opening change amount. Further, a first opening value corresponding to the hot water reboiler at the next moment can be obtained. And after obtaining the second opening degree variation corresponding to the current moment of the steam reboiler, controlling the steam reboiler to adjust the opening degree based on the second opening degree variation. Further, the valve opening of the steam reboiler is adjusted according to the changing direction and the changing amount in the second opening changing amount. Further, a second opening value corresponding to the steam reboiler at the next time can be obtained.
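The direct control step can be sketched as adding the signed variation to the current opening value. The sketch below assumes, for illustration only, that openings are expressed as a percentage between 0 and 100 and are clamped to that range; the source does not specify the unit or the limits, and `next_opening` is a hypothetical name.

```python
def next_opening(current: float, delta: float,
                 lo: float = 0.0, hi: float = 100.0) -> float:
    """Return the opening value for the next moment.

    Assumption: openings are percentages in [lo, hi]; the result is clamped
    so that a valve is never commanded beyond fully closed or fully open.
    """
    return min(max(current + delta, lo), hi)


# Example: hot water reboiler opens by 2.5, steam reboiler closes by 2.5.
hot_water_next = next_opening(10.0, +2.5)
steam_next = next_opening(90.0, -2.5)
```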

In the case where the preset opening value processing mode is direct control, before the opening of the corresponding reboiler is adjusted based on the opening variation, it may further be determined whether the opening value that would result from the adjustment reaches the preset control threshold. If the threshold is not reached, the opening of the corresponding reboiler can be adjusted directly based on the opening variation; if the threshold is reached, the preset opening value processing mode can be switched to indirect control, so that the stability of the refining tower system to be controlled is ensured.

Based on the above, the above technical means further includes: under the condition that a preset opening value processing mode is direct control, determining a first opening value corresponding to the hot water reboiler at the current moment based on a first opening value sequence, and determining a first opening value corresponding to the hot water reboiler at the next moment based on a first opening variation and the first opening value; and determining a second opening value corresponding to the steam reboiler at the current moment based on the second opening value sequence, and determining a second opening value corresponding to the steam reboiler at the next moment based on the second opening variation and the second opening value; and when the condition that the first opening value corresponding to the next moment reaches the first preset control threshold value and/or the second opening value corresponding to the next moment reaches the second preset control threshold value is detected, switching the preset opening value processing mode into indirect control and visual display.
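The threshold check described above can be sketched as follows. This is an illustrative sketch under stated assumptions: "reaching" the first threshold is taken to mean the hot water opening rising to or above it, and "reaching" the second threshold is taken to mean the steam opening falling to or below it. The names `Mode` and `select_mode` and the threshold values used in the example are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    DIRECT = "direct control"
    INDIRECT = "indirect control and visual display"


def select_mode(mode: Mode,
                hot_water_now: float, delta_hot_water: float,
                steam_now: float, delta_steam: float,
                hot_water_threshold: float, steam_threshold: float) -> Mode:
    """Decide the processing mode before any valve is actually moved."""
    if mode is not Mode.DIRECT:
        return mode
    # Opening values that the adjustment would produce at the next moment.
    hot_water_next = hot_water_now + delta_hot_water
    steam_next = steam_now + delta_steam
    # Assumed comparison directions: hot water rises, steam falls.
    if hot_water_next >= hot_water_threshold or steam_next <= steam_threshold:
        return Mode.INDIRECT
    return Mode.DIRECT


# Example with made-up thresholds of 90 (hot water) and 10 (steam).
mode = select_mode(Mode.DIRECT, 88.0, 5.0, 50.0, -5.0, 90.0, 10.0)
```

In the example, the hot water opening would reach 93, which is above the assumed threshold of 90, so the sketch hands control back to the operator by switching to indirect control and visual display.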

Example two

Fig. 2 is a flowchart of a reboiler switching control method according to a second embodiment of the present invention, where, on the basis of the foregoing embodiment, before operation process data is processed based on a reboiler opening degree prediction model, a simulation training sample may be constructed, and the reboiler opening degree prediction model may be trained based on the simulation training sample. The specific implementation manner can be seen in the technical scheme of the embodiment. Wherein, the technical terms identical or similar to those of the above embodiments are not repeated herein.

As shown in fig. 2, the method includes:

S210, for each training round, acquiring a simulation online training sample corresponding to the refining tower system to be controlled under the current training round.

Wherein the simulated online training samples are determined based on a simulated environmental model corresponding to the refining tower system to be controlled.

In this embodiment, the simulation environment model may be understood as a model that characterizes the operating conditions of the refining tower system to be controlled, or as a model obtained by modeling the real operating environment of the refining tower system to be controlled. Environment modeling is a simulation modeling approach from the reinforcement learning field in which the real environment is abstracted into a computable model that an agent can understand and use for prediction. The environment model may include a state transition function (state transition model) and a reward function. For the present embodiment, the simulation environment model may be a model established for better planning of the operating conditions of the refining tower system to be controlled. The training rounds may be understood as the training process that starts from an initial state of the simulation environment model and runs until a preset maximum number of training rounds is reached. One training round may be understood as one pass of that process; after each training round is completed, the simulation environment model may be reset to the initial state for the next training round. Further, after training for a plurality of training rounds, once the loss function of the model is detected to have converged, a trained model can be obtained. For this embodiment, the initial state may be that the refining tower system to be controlled is operating stably, the hot water reboiler is closed, and the steam reboiler is fully open.
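To make the roles of the initial state, the state transition function and the reward function concrete, the following is a deliberately simplified toy environment. It is not the simulation environment model of the invention: the class name `ToyTowerEnv`, the linear heat formula, the reward weights and the step limit are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

State = Tuple[float, float, float]  # (kettle temperature, hot water opening, steam opening)


@dataclass
class ToyTowerEnv:
    target_temp: float = 120.0   # made-up temperature at stable operation
    max_steps: int = 50          # made-up length of one training round

    def __post_init__(self) -> None:
        self.reset()

    def reset(self) -> State:
        """Initial state: stable operation, hot water closed, steam fully open."""
        self.hot_water = 0.0
        self.steam = 100.0
        self.temp = self.target_temp
        self.t = 0
        return self._state()

    def _state(self) -> State:
        return (self.temp, self.hot_water, self.steam)

    def step(self, delta_hot_water: float, delta_steam: float) -> Tuple[State, float, bool]:
        """Toy state transition plus toy reward for one moment."""
        # Monotonic switching: hot water only opens, steam only closes.
        self.hot_water = min(100.0, self.hot_water + max(delta_hot_water, 0.0))
        self.steam = max(0.0, self.steam + min(delta_steam, 0.0))
        # Invented dynamics: temperature follows a weighted sum of the openings.
        heat = 0.6 * self.hot_water + 1.0 * self.steam
        self.temp = self.target_temp + 0.2 * (heat - 100.0)
        # Invented reward: stay near the target temperature, save steam.
        reward = -abs(self.temp - self.target_temp) + 0.01 * (100.0 - self.steam)
        self.t += 1
        done = self.t >= self.max_steps or (self.hot_water >= 100.0 and self.steam <= 0.0)
        return self._state(), reward, done
```

Calling `reset()` at the start of each training round corresponds to initializing the environment back to its initial state, and each call to `step` corresponds to one moment of simulated operation.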

In this embodiment, the simulation online training sample may be understood as sample data collected in real time during operation of the simulation environment model corresponding to the refining tower system to be controlled, that is, the data included in the simulation online training sample is online data. The simulation online training sample comprises a sample tower kettle temperature value sequence corresponding to the current moment and at least one historical moment before the current moment, a sample purity value sequence of each separation object, a first sample opening value sequence of a hot water reboiler, a second sample opening value sequence of a steam reboiler, a sample liquid flow value sequence of each target pipeline and a historical reward feedback information sequence corresponding to the at least one historical moment.

In this embodiment, the sample tower kettle temperature value sequence may include the sample tower kettle temperature value corresponding to each moment. The sample tower kettle temperature value is the temperature of the tower kettle of the refining tower in the simulation environment model. The sample purity value sequence may include the sample purity value corresponding to each moment. The sample purity value is the purity value of the corresponding separation object in the simulation environment model. The first sample opening value sequence may include the first sample opening value corresponding to each moment. The first sample opening value is the opening value of the hot water reboiler in the simulation environment model. The second sample opening value sequence may include the second sample opening value corresponding to each moment. The second sample opening value is the opening value of the steam reboiler in the simulation environment model. The sample liquid flow value sequence may include the sample liquid flow value corresponding to each moment. The sample liquid flow value is the flow value of the liquid in the corresponding target pipeline in the simulation environment model. The historical reward feedback information sequence may include the reward feedback information corresponding to a plurality of moments. The reward feedback information is a numerical value that the agent obtains after executing an action and that represents the quality of the action. The reward feedback information may be determined based on a reward function deployed in the simulation environment model. Those skilled in the art will appreciate that, in the field of reinforcement learning, a reinforcement learning algorithm can be used to train a policy network, so that the trained policy network can assess the current state of the agent and output the target decision action at the current moment.
For example, the policy network may be trained using a soft actor-critic algorithm, which may include an evaluation network (e.g., a state value network and/or a state action value network) and a policy network. The evaluation network acts as the "critic": it does not take actions directly but evaluates the quality of actions. The policy network acts as the "actor": it determines decision actions based on the input state. The policy network is a neural network model that, by observing the environment state, directly predicts the policy to be executed at present; executing this policy yields the maximum expected return, and the expected return may be taken as the reward. Illustratively, during policy network training, the state s_t corresponding to time t is obtained and input into the policy network, which determines a decision action a_t based on the input state. Executing decision action a_t in state s_t yields a new state, namely the state s_{t+1} corresponding to time t+1, together with the reward r_t corresponding to time t. Further, based on the reward r_t, the parameters of the evaluation network and the policy network can be corrected to obtain the trained policy network.
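The interaction described above (observe s_t, choose a_t, receive r_t, move to s_{t+1}) can be sketched as a short rollout loop. This is only an illustrative sketch: the state layout, `toy_step`, and `toy_policy` are hypothetical stand-ins for the simulation environment model and the policy network, and none of the numeric values come from the source.

```python
from typing import Callable, List, Tuple

# Hypothetical state: (tower kettle temperature, hot water reboiler opening).
State = Tuple[float, float]
# Hypothetical action: a single opening change for the hot water reboiler.
Action = float
Transition = Tuple[State, Action, float, State]


def rollout(policy: Callable[[State], Action],
            step: Callable[[State, Action], Tuple[State, float]],
            s0: State, horizon: int) -> List[Transition]:
    """Collect (s_t, a_t, r_t, s_{t+1}) transitions by letting the policy act."""
    transitions: List[Transition] = []
    s_t = s0
    for _ in range(horizon):
        a_t = policy(s_t)             # actor: state -> decision action
        s_next, r_t = step(s_t, a_t)  # environment: new state and reward
        transitions.append((s_t, a_t, r_t, s_next))
        s_t = s_next
    return transitions


def toy_step(state: State, action: Action) -> Tuple[State, float]:
    """Assumed toy dynamics: opening is clipped to [0, 100], temperature drifts."""
    temp, opening = state
    new_opening = min(100.0, max(0.0, opening + action))
    new_temp = temp + 0.01 * (new_opening - opening)
    reward = -abs(new_temp - 80.0)  # assumed target temperature of 80
    return (new_temp, new_opening), reward


def toy_policy(state: State) -> Action:
    """Assumed fixed policy: always open the hot water reboiler by 1."""
    return 1.0
```

Calling `rollout(toy_policy, toy_step, (80.0, 0.0), 3)` returns three transitions whose consecutive states chain together; in an actual training setup the collected transitions would be used to correct the evaluation network and the policy network.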

In practical application, scene initialization may be performed first so that the simulation environment model corresponding to the refining tower system to be controlled is in an initial state, namely, the system runs stably, the hot water reboiler is closed, and the steam reboiler is fully open. Further, during operation of the simulation environment model, the generated operation data is collected. Specifically, the sample tower kettle temperature value of the refining tower system to be controlled at each moment can be obtained, and the sample tower kettle temperature value sequence is constructed from the collected values; the sample purity value of each separation object at each moment can be obtained to form the sample purity value sequence of each separation object; the first sample opening value of the hot water reboiler at each moment can be obtained to form the first sample opening value sequence; the second sample opening value of the steam reboiler at each moment can be obtained to form the second sample opening value sequence; and the sample liquid flow value of each target pipeline at each moment can be obtained to form the sample liquid flow value sequence of each target pipeline.

Further, for the interaction between the simulation environment model and the hot water reboiler and the steam reboiler, after the hot water reboiler and the steam reboiler execute the action of changing the opening value once, the reward feedback information corresponding to the action can be obtained based on the reward function which is deployed in advance in the simulation environment model. Further, a historical rewards feedback information sequence may be constructed based on the corresponding rewards feedback information at each time instant.
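As an illustration of how such a training sample might be held in memory, the following sketch keeps a fixed-length window of the most recent values for each sequence. The class name, field names, window length, and the example object and pipeline names are assumptions made for this sketch and are not specified in the source.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict

WINDOW = 4  # hypothetical: current moment plus three historical moments


def _window() -> Deque[float]:
    """Create a bounded sequence that drops the oldest value when full."""
    return deque(maxlen=WINDOW)


@dataclass
class SimulatedOnlineSample:
    kettle_temperature: Deque[float] = field(default_factory=_window)
    purity: Dict[str, Deque[float]] = field(default_factory=dict)       # per separation object
    hot_water_opening: Deque[float] = field(default_factory=_window)   # first sample opening values
    steam_opening: Deque[float] = field(default_factory=_window)       # second sample opening values
    liquid_flow: Dict[str, Deque[float]] = field(default_factory=dict)  # per target pipeline
    reward_history: Deque[float] = field(default_factory=_window)

    def record(self, temperature: float, purity: Dict[str, float],
               hot_water: float, steam: float,
               flow: Dict[str, float], reward: float) -> None:
        """Append the values observed at one moment to every sequence."""
        self.kettle_temperature.append(temperature)
        self.hot_water_opening.append(hot_water)
        self.steam_opening.append(steam)
        self.reward_history.append(reward)
        for name, value in purity.items():
            self.purity.setdefault(name, _window()).append(value)
        for name, value in flow.items():
            self.liquid_flow.setdefault(name, _window()).append(value)
```

A bounded `deque` is used here so that each call to `record` keeps only the current moment and the most recent historical moments, which matches the "current moment and at least one historical moment" wording.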

S220, training a reboiler opening degree prediction model based on each simulation on-line training sample and a reinforcement learning algorithm.

In practical application, the reboiler opening degree prediction model can be trained based on the simulated online training sample corresponding to each training round and the reinforcement learning algorithm, so that after a plurality of training rounds the finally trained reboiler opening degree prediction model is obtained.

Optionally, training the reboiler opening degree prediction model based on each simulated online training sample and the reinforcement learning algorithm includes: for each training round, inputting the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence and each sample liquid flow value sequence in the simulated online training sample corresponding to the current training round into the reboiler opening degree prediction model to obtain the actual variation probability distribution corresponding to the hot water reboiler and the steam reboiler at the current moment; for each first opening variation and each second opening variation in the actual variation probability distribution, inputting the current first opening variation, the current second opening variation, the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence and each sample liquid flow value sequence into the state action value model to obtain the expected reward corresponding to the current moment; updating the parameters of the reboiler opening degree prediction model according to the actual variation probability distribution, each expected reward and the policy gradient algorithm; determining, according to the actual variation probability distribution, a first actual opening variation corresponding to the hot water reboiler at the current moment and a second actual opening variation corresponding to the steam reboiler at the current moment; processing the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence, each sample liquid flow value sequence, the first actual opening variation and the second actual opening variation according to a preset multi-index reward function, determining the reward feedback information corresponding to the refining tower system to be controlled at the current moment, and updating the historical reward feedback information sequence based on the reward feedback information; updating the parameters of the state action value model according to the reward feedback information and the temporal difference algorithm; and ending the training when it is detected that the preset training target corresponding to the reboiler opening degree prediction model has been reached, thereby obtaining the reboiler opening degree prediction model.

The actual variation probability distribution comprises probability information of each first opening variation corresponding to the hot water reboiler and probability information of each second opening variation corresponding to the steam reboiler. The state action value model may be understood as a deep neural network (a state action value network) that takes the state at the current moment and the decision action at the current moment as inputs in order to evaluate the decision action taken in that state. The state action value model may be a neural network that represents a state action value function: its input may be the state and the decision action at the current moment, and its output may be the value of taking that decision action in the current state, i.e., the expected reward corresponding to the decision action at the current moment. The policy gradient algorithm is a gradient-based optimization algorithm for solving reinforcement learning problems, which can help a machine learning model optimize its decisions in a decision environment. The idea of the policy gradient algorithm is to express the policy as a continuous function related to the reward and then to find the optimal policy using an optimization method for continuous functions, where the optimization objective is to maximize that function. The multi-index reward function includes an observation index safety reward function, an observation index stability reward function, and a control index reward function. Optionally, the multi-index reward function may be determined as a weighted sum of the observation index safety reward function, the observation index stability reward function, and the control index reward function.
The observation index comprises the sample tower kettle temperature value, the sample purity value, the first sample opening value, the second sample opening value and the sample liquid flow value. The control index includes the first actual opening variation and the second actual opening variation. The temporal difference (TD) algorithm is an algorithm for estimating the value function of a policy; it adjusts the estimate adaptively by learning from the difference between the value estimated for the current state and the value estimated for the future state. The core of the TD algorithm is the update of the value function. The preset training target may be a preset end condition of the policy network training process. Optionally, the preset training target may include the objective function value corresponding to the policy gradient algorithm reaching its maximum, or the current number of training iterations reaching a preset number.
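The weighted-sum form of the multi-index reward function can be sketched as follows. The source does not give the individual reward functions or the weights, so the three component functions, the temperature bounds, and the default weights below are assumptions chosen only to make the structure concrete; for brevity the sketch uses the tower kettle temperature as the only observation index.

```python
from typing import Sequence, Tuple


def safety_reward(values: Sequence[float], low: float, high: float) -> float:
    """Assumed form: 0 inside [low, high], otherwise minus the size of the violation."""
    latest = values[-1]
    if latest < low:
        return -(low - latest)
    if latest > high:
        return -(latest - high)
    return 0.0


def stability_reward(values: Sequence[float]) -> float:
    """Assumed form: penalize the change between the two most recent values."""
    if len(values) < 2:
        return 0.0
    return -abs(values[-1] - values[-2])


def control_reward(d_hot: float, d_steam: float) -> float:
    """Assumed form: penalize large opening changes of either reboiler."""
    return -(abs(d_hot) + abs(d_steam))


def multi_index_reward(temps: Sequence[float], d_hot: float, d_steam: float,
                       low: float = 75.0, high: float = 85.0,
                       weights: Tuple[float, float, float] = (1.0, 0.5, 0.1)) -> float:
    """Weighted sum of the safety, stability and control rewards."""
    w_safe, w_stab, w_ctrl = weights
    return (w_safe * safety_reward(temps, low, high)
            + w_stab * stability_reward(temps)
            + w_ctrl * control_reward(d_hot, d_steam))
```

In this sketch a steady temperature inside the bounds with no opening change gives a reward of zero, and any bound violation, temperature jump, or large opening change makes the reward negative. A fuller version would apply the safety and stability terms to every observation index listed above.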

In practical application, for each training round, model parameters in the reboiler opening prediction model and the state action value model can be updated based on the above process. Further, a reboiler opening degree prediction model after training is obtained.

In practical application, for each training round, the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence and each sample liquid flow value sequence in the simulated online training sample corresponding to the current training round can be input into the reboiler opening degree prediction model to obtain the actual variation probability distribution corresponding to the hot water reboiler and the steam reboiler at the current moment. Then, for each first opening variation and each second opening variation in the actual variation probability distribution, the current first opening variation, the current second opening variation, the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence and each sample liquid flow value sequence are input into the state action value model to obtain the expected reward corresponding to the current moment. Further, the parameters of the reboiler opening degree prediction model are updated according to the actual variation probability distribution, each expected reward and the policy gradient algorithm.
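One way to picture the parameter update described above is a softmax policy over a small set of candidate opening variations, updated with the exact gradient of the expected reward. This is a simplified sketch under stated assumptions: the policy is reduced to a list of logits rather than a neural network, the expected rewards are supplied directly instead of coming from a state action value model, and the learning rate is arbitrary.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(logits: Sequence[float],
                         expected_rewards: Sequence[float],
                         lr: float = 0.5) -> List[float]:
    """One gradient-ascent step on J = sum_a pi(a) * Q(a) for a softmax policy.

    For a softmax policy, dJ/dtheta_k = pi(k) * (Q(k) - sum_a pi(a) * Q(a)).
    """
    probs = softmax(logits)
    baseline = sum(p * q for p, q in zip(probs, expected_rewards))
    return [theta + lr * p * (q - baseline)
            for theta, p, q in zip(logits, probs, expected_rewards)]
```

Starting from equal logits, a step with expected rewards that favor the first candidate raises the probability of that candidate and lowers the probability of the worst one, which is the behavior the policy gradient update is meant to produce.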

Then, the actual variation probability distribution is sampled, and the first actual opening variation corresponding to the hot water reboiler at the current moment and the second actual opening variation corresponding to the steam reboiler at the current moment are determined from the first opening variations and the second opening variations included in the actual variation probability distribution.
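The sampling step can be illustrated with the standard library. The sketch assumes that the distribution is given as two separate discrete tables, one per reboiler, and that the two opening variations are sampled independently; the source does not state whether the distribution is joint or factorized, and the candidate values are hypothetical.

```python
import random
from typing import Dict, Tuple


def sample_opening_changes(hot_water_probs: Dict[float, float],
                           steam_probs: Dict[float, float],
                           rng: random.Random) -> Tuple[float, float]:
    """Draw one opening variation per reboiler from its discrete distribution."""
    d_hot = rng.choices(list(hot_water_probs.keys()),
                        weights=list(hot_water_probs.values()), k=1)[0]
    d_steam = rng.choices(list(steam_probs.keys()),
                          weights=list(steam_probs.values()), k=1)[0]
    return d_hot, d_steam


# Hypothetical candidate opening variations and their probabilities.
hot_water_probs = {-1.0: 0.1, 0.0: 0.3, 1.0: 0.6}
steam_probs = {-1.0: 0.6, 0.0: 0.3, 1.0: 0.1}
first_change, second_change = sample_opening_changes(
    hot_water_probs, steam_probs, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which can be convenient when comparing training runs.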

Further, the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence, each sample liquid flow value sequence, the first actual opening variation and the second actual opening variation are processed according to the preset multi-index reward function to determine the reward feedback information corresponding to the refining tower system to be controlled at the current moment, and the historical reward feedback information sequence is updated based on the reward feedback information. Then, the parameters of the state action value model are updated according to the reward feedback information and the temporal difference algorithm.
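The temporal difference update of the state action value model can be illustrated in tabular form. This sketch replaces the neural network with a dictionary of values and uses a one-step update based on the next state and next action; the step size `alpha`, the discount factor `gamma`, and the state and action labels are assumptions for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def td_update(q: QTable, s: Hashable, a: Hashable, r: float,
              s_next: Hashable, a_next: Hashable,
              alpha: float = 0.1, gamma: float = 0.99) -> float:
    """Move Q(s, a) toward the TD target r + gamma * Q(s_next, a_next).

    Returns the TD error, i.e. the gap between the target and the old estimate.
    """
    td_target = r + gamma * q[(s_next, a_next)]
    td_error = td_target - q[(s, a)]
    q[(s, a)] += alpha * td_error
    return td_error


q_table: QTable = defaultdict(float)
# Hypothetical labels: state "s0", opening change +1, reward 1.0, next state "s1".
first_error = td_update(q_table, "s0", 1, 1.0, "s1", 0, alpha=0.5, gamma=0.9)
```

With a neural network in place of the table, the same TD error would typically be squared and minimized by gradient descent on the network parameters.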

Finally, the training ends when it is detected that the preset training target corresponding to the reboiler opening degree prediction model has been reached, and the trained reboiler opening degree prediction model is obtained.

S230, acquiring operation process data corresponding to the refining tower system to be controlled in the operation process of the refining tower system to be controlled.

S240, processing operation process data based on a pre-trained reboiler opening prediction model to obtain a first opening change amount corresponding to the hot water reboiler at the current moment and a second opening change amount corresponding to the steam reboiler at the current moment.

S250, processing the first opening degree variation and the second opening degree variation based on a preset opening degree value processing mode.
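Step S250 can be pictured as a dispatch on the preset opening value processing mode. The sketch below is an assumption about how such a dispatch might be organized: the enum, the function name, and the two callbacks (one that drives the valves, one that shows a message on a terminal) are hypothetical and not taken from the source.

```python
from enum import Enum
from typing import Callable


class ProcessingMode(Enum):
    DIRECT_CONTROL = "direct control"
    INDIRECT_CONTROL = "indirect control and visual display"


def process_opening_changes(mode: ProcessingMode, d_hot: float, d_steam: float,
                            apply_change: Callable[[float, float], None],
                            display: Callable[[str], None]) -> None:
    """Either apply the predicted opening changes or only show them to an operator."""
    if mode is ProcessingMode.DIRECT_CONTROL:
        apply_change(d_hot, d_steam)
    else:
        display(f"suggested change: hot water {d_hot:+.1f}, steam {d_steam:+.1f}")
```

Under indirect control the predicted changes are only displayed, so an operator decides whether to apply them; under direct control they are passed straight to the actuator callback.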

Example III

Fig. 3 is a schematic structural diagram of a reboiler switching control device according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes: a process data acquisition module 310, an opening degree variation determination module 320, and an opening degree variation processing module 330.

The process data acquisition module 310 is configured to acquire, during operation of the refining tower system to be controlled, operation process data corresponding to the refining tower system to be controlled, where the operation process data includes a tower kettle temperature value sequence corresponding to the current moment and at least one historical moment before the current moment, a purity value sequence of each separation object, a first opening value sequence of the hot water reboiler, a second opening value sequence of the steam reboiler, and a liquid flow value sequence of each target pipeline; the opening variation determining module 320 is configured to process the operation process data based on a pre-trained reboiler opening degree prediction model to obtain a first opening variation corresponding to the hot water reboiler at the current moment and a second opening variation corresponding to the steam reboiler at the current moment, where the reboiler opening degree prediction model is trained based on a reinforcement learning algorithm; and the opening variation processing module 330 is configured to process the first opening variation and the second opening variation based on a preset opening value processing mode.

Optionally, the preset opening value processing mode includes indirect control and visual display, and the opening variation processing module 330 includes a visual display unit.

And the visual display unit is used for visually displaying the first opening degree variation and the second opening degree variation based on a display interface of the target terminal.

Optionally, the preset opening value processing mode includes direct control, and the opening variation processing module 330 includes a hot water reboiler opening adjusting unit and a steam reboiler opening adjusting unit.

The hot water reboiler opening adjusting unit is used for controlling the hot water reboiler to adjust the opening based on the first opening variation, and obtaining a first opening value corresponding to the hot water reboiler at the next moment;

and the steam reboiler opening adjusting unit is used for controlling the steam reboiler to adjust the opening based on the second opening variation and obtaining a second opening value corresponding to the steam reboiler at the next moment.

Optionally, the apparatus further includes: a first opening value determining module, a second opening value determining module and a processing mode switching module.

The first opening value determining module is used for determining a first opening value corresponding to the hot water reboiler at the current moment based on the first opening value sequence and determining a first opening value corresponding to the hot water reboiler at the next moment based on the first opening variation and the first opening value under the condition that the preset opening value processing mode is direct control;

A second opening value determining module, configured to determine a second opening value corresponding to the steam reboiler at the current moment based on the second opening value sequence, and determine a second opening value corresponding to the steam reboiler at the next moment based on the second opening variation and the second opening value;

and the processing mode switching module is used for switching the processing mode of the preset opening value into indirect control and visual display under the condition that the first opening value corresponding to the next moment reaches a first preset control threshold value and/or the second opening value corresponding to the next moment reaches a second preset control threshold value.
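The cooperation of these three modules can be sketched as two small functions. The sketch assumes that the next opening value is the current value plus the predicted variation, that the hot water reboiler "reaches" its threshold from below, and that the steam reboiler "reaches" its threshold from above; these directions and all numeric values are assumptions, since the source does not define them.

```python
from typing import Sequence


def next_opening(opening_sequence: Sequence[float], change: float) -> float:
    """Next opening value = opening at the current moment + predicted variation."""
    return opening_sequence[-1] + change


def should_switch_to_indirect(next_hot: float, next_steam: float,
                              hot_threshold: float, steam_threshold: float) -> bool:
    """Assumed rule: switch once either reboiler reaches its control threshold.

    The hot water opening is assumed to rise toward its threshold and the
    steam opening is assumed to fall toward its threshold.
    """
    return next_hot >= hot_threshold or next_steam <= steam_threshold
```

When the check returns true, the processing mode would be switched from direct control to indirect control and visual display, so that the final part of the switchover is left to the operator.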

Optionally, the apparatus further includes: a training sample acquisition module and a model training module.

The training sample acquisition module is used for acquiring simulation online training samples corresponding to the refining tower system to be controlled under the current training round for each training round, wherein the simulation online training samples are determined based on a simulation environment model corresponding to the refining tower system to be controlled, and the simulation online training samples comprise sample tower kettle temperature value sequences corresponding to the current moment and at least one historical moment before the current moment, sample purity value sequences of all separation objects, first sample opening value sequences of a hot water reboiler, second sample opening value sequences of a steam reboiler, sample liquid flow value sequences of all target pipelines and historical rewarding feedback information sequences corresponding to the at least one historical moment;

And the model training module is used for training the reboiler opening degree prediction model based on each simulation online training sample and the reinforcement learning algorithm.

Optionally, the model training module includes: a probability distribution determining unit, an expected reward determining unit, a reboiler opening prediction model updating unit, an actual opening change amount determining unit, a reward feedback information determining unit, a state action value model updating unit and a reboiler opening prediction model determining unit.

The probability distribution determining unit is used for inputting a sample tower kettle temperature value sequence, a sample purity value sequence, a first sample opening value sequence, a second sample opening value sequence and a sample liquid flow value sequence in a simulation online training sample corresponding to each training round to the reboiler opening prediction model to obtain actual variable quantity probability distribution corresponding to the hot water reboiler and the steam reboiler at the current moment, wherein the actual variable quantity probability distribution comprises probability information of each first opening variable quantity corresponding to the hot water reboiler and probability information of each second opening variable quantity corresponding to the steam reboiler;

The expected rewards determining unit is used for inputting the current first opening change amount, the current second opening change amount, the sample tower kettle temperature value sequence, the sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence and the sample liquid flow value sequence into the state action value model for each first opening change amount and each second opening change amount in the actual change amount probability distribution, so as to obtain the expected rewards corresponding to the current moment;

the reboiler opening prediction model updating unit is used for updating parameters of the reboiler opening prediction model according to the actual variable probability distribution, the expected rewards and the strategy gradient algorithm;

an actual opening change amount determining unit, configured to determine, according to the actual change amount probability distribution, a first actual opening change amount corresponding to the hot water reboiler at the current time and a second actual opening change amount corresponding to the steam reboiler at the current time;

the rewarding feedback information determining unit is used for processing the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence, each sample liquid flow value sequence, the first actual opening variable quantity and the second actual opening variable quantity according to a preset multi-index rewarding function, determining rewarding feedback information corresponding to the refining tower system to be controlled at the current moment, and updating the historical rewarding feedback information sequence based on the rewarding feedback information;

The state action value model updating unit is used for updating parameters of the state action value model according to the rewarding feedback information and a time sequence difference algorithm;

and the reboiler opening prediction model determining unit is used for obtaining the reboiler opening prediction model after training is finished when the reboiler opening prediction model reaches a preset training target corresponding to the reboiler opening prediction model.

Optionally, the multi-index reward function includes an observation index safety reward function, an observation index stability reward function, and a control index reward function, the observation index includes a sample tower kettle temperature value, a sample purity value, a first sample opening value, a second sample opening value, and a sample liquid flow value, and the control index includes a first actual opening variation and a second actual opening variation.

The reboiler switching control device provided by the embodiment of the invention can execute the reboiler switching control method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Example IV

Fig. 4 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.

As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.

Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the reboiler switching control method.

In some embodiments, the reboiler switching control method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the reboiler switching control method described above may be performed. Alternatively, in other embodiments, processor 11 may be configured to perform the reboiler switching control method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.

The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited in this respect.

The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims

1. A reboiler switching control method, comprising:

2. The method according to claim 1, wherein the preset opening value processing mode includes indirect control and visual display, and the processing the first opening variation and the second opening variation based on the preset opening value processing mode includes:

and visually displaying the first opening variation and the second opening variation based on a display interface of the target terminal.

3. The method according to claim 1, wherein the preset opening value processing mode includes direct control, and the processing the first opening variation and the second opening variation based on the preset opening value processing mode includes:

controlling the hot water reboiler to adjust its opening based on the first opening variation, to obtain a first opening value corresponding to the hot water reboiler at the next moment; and

controlling the steam reboiler to adjust its opening based on the second opening variation, to obtain a second opening value corresponding to the steam reboiler at the next moment.

4. The method according to claim 3, further comprising:

in the case where the preset opening value processing mode is direct control, determining a first opening value corresponding to the hot water reboiler at the current moment based on the first opening value sequence, and determining a first opening value corresponding to the hot water reboiler at the next moment based on the first opening variation and the first opening value at the current moment; and

determining a second opening value corresponding to the steam reboiler at the current moment based on the second opening value sequence, and determining a second opening value corresponding to the steam reboiler at the next moment based on the second opening variation and the second opening value at the current moment;

and when it is detected that the first opening value corresponding to the next moment reaches a first preset control threshold and/or the second opening value corresponding to the next moment reaches a second preset control threshold, switching the preset opening value processing mode to indirect control and visual display.
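As an illustration of the arithmetic in claims 3 and 4, the following is a minimal Python sketch, not the claimed implementation. The function names, the mode strings, and the direction in which each threshold is "reached" (hot water opening rising to its threshold, steam opening falling to its threshold) are assumptions made for the example; the claims do not specify them.

```python
from typing import Sequence, Tuple

DIRECT = "direct_control"                      # hypothetical mode label
INDIRECT = "indirect_control_and_display"      # hypothetical mode label


def next_openings(
    first_opening_seq: Sequence[float],
    second_opening_seq: Sequence[float],
    first_variation: float,
    second_variation: float,
) -> Tuple[float, float]:
    """Next-moment opening = current opening + predicted variation.

    Assumption: the last element of each sequence is the current moment.
    """
    first_now = first_opening_seq[-1]
    second_now = second_opening_seq[-1]
    return first_now + first_variation, second_now + second_variation


def select_mode(
    first_next: float,
    second_next: float,
    first_threshold: float,
    second_threshold: float,
) -> str:
    """Switch to indirect control once either threshold is reached.

    Assumption: the hot water reboiler opens up towards its threshold,
    while the steam reboiler closes down towards its threshold.
    """
    if first_next >= first_threshold or second_next <= second_threshold:
        return INDIRECT
    return DIRECT


if __name__ == "__main__":
    hot_next, steam_next = next_openings([10.0, 20.0], [80.0, 70.0], 5.0, -5.0)
    print(hot_next, steam_next, select_mode(hot_next, steam_next, 90.0, 10.0))
```

In this sketch the mode falls back to indirect control as soon as either valve reaches its threshold, which corresponds to the "and/or" condition of claim 4.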

5. The method as recited in claim 1, further comprising:

for each training round, obtaining a simulated online training sample corresponding to the refining tower system to be controlled under the current training round, wherein the simulated online training sample is determined based on a simulation environment model corresponding to the refining tower system to be controlled, and comprises a sample tower kettle temperature value sequence corresponding to the current moment and at least one historical moment before the current moment, a sample purity value sequence of each separation object, a first sample opening value sequence of a hot water reboiler, a second sample opening value sequence of a steam reboiler, a sample liquid flow value sequence of each target pipeline, and a historical reward feedback information sequence corresponding to the at least one historical moment;

and training the reboiler opening prediction model based on each simulated online training sample and a reinforcement learning algorithm.
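The contents of one training sample listed in claim 5 can be pictured as a small record type. The Python sketch below is only one possible layout: the class name, field names, and the length checks (observation sequences cover the current moment plus at least one historical moment, while the reward history covers the historical moments only) are assumptions read from the claim wording, not a format given in the source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SimulatedOnlineTrainingSample:  # hypothetical name
    kettle_temperature: List[float]       # tower kettle temperature sequence
    purity: Dict[str, List[float]]        # one sequence per separation object
    hot_water_opening: List[float]        # first sample opening value sequence
    steam_opening: List[float]            # second sample opening value sequence
    liquid_flow: Dict[str, List[float]]   # one sequence per target pipeline
    reward_history: List[float]           # rewards at the historical moments

    def observation_sequences(self) -> List[List[float]]:
        """All sequences that are fed to the prediction model."""
        return [
            self.kettle_temperature,
            *self.purity.values(),
            self.hot_water_opening,
            self.steam_opening,
            *self.liquid_flow.values(),
        ]

    def validate(self) -> None:
        """Check the assumed length relations between the sequences."""
        n = len(self.kettle_temperature)
        if n < 2:
            raise ValueError("need the current moment and >= 1 historical moment")
        if any(len(seq) != n for seq in self.observation_sequences()):
            raise ValueError("observation sequences must share one length")
        if len(self.reward_history) != n - 1:
            raise ValueError("reward history must cover the historical moments")

    def model_input(self) -> List[float]:
        """Flatten the observation sequences into one input vector."""
        return [value for seq in self.observation_sequences() for value in seq]
```

The reward history is kept in the sample but left out of `model_input`, mirroring claim 6, where only the observation sequences are input to the reboiler opening prediction model.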

6. The method of claim 5, wherein the training the reboiler opening degree prediction model based on each of the simulated online training samples and a reinforcement learning algorithm comprises:

for each training round, inputting the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence and each sample liquid flow value sequence in the corresponding simulated online training sample under the current training round into the reboiler opening prediction model to obtain an actual variation probability distribution corresponding to the hot water reboiler and the steam reboiler at the current moment, wherein the actual variation probability distribution comprises probability information of each first opening variation corresponding to the hot water reboiler and probability information of each second opening variation corresponding to the steam reboiler;

for each first opening variation and each second opening variation in the actual variation probability distribution, inputting the current first opening variation, the current second opening variation, the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence and each sample liquid flow value sequence into a state-action value model to obtain an expected reward corresponding to the current moment;

updating parameters of the reboiler opening prediction model according to the actual variation probability distribution, each expected reward and a policy gradient algorithm; and

determining a first actual opening change amount corresponding to the hot water reboiler at the current moment and a second actual opening change amount corresponding to the steam reboiler at the current moment according to the actual variation probability distribution;

processing the sample tower kettle temperature value sequence, each sample purity value sequence, the first sample opening value sequence, the second sample opening value sequence, each sample liquid flow value sequence, the first actual opening change amount and the second actual opening change amount according to a preset multi-index reward function, determining reward feedback information corresponding to the refining tower system to be controlled at the current moment, and updating the historical reward feedback information sequence based on the reward feedback information;

updating parameters of the state-action value model according to the reward feedback information and a temporal difference algorithm;

and finishing training when the reboiler opening degree prediction model reaches a preset training target corresponding to the reboiler opening degree prediction model, and obtaining the reboiler opening degree prediction model.
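The training steps of claim 6 follow the general shape of an actor-critic loop. The Python sketch below shows that shape under strong simplifying assumptions: a tabular softmax policy and a tabular state-action value table stand in for the prediction model and the state-action value model, the state is a pair of integer opening levels, `toy_step` is a placeholder environment with a placeholder reward, the critic uses an expected-SARSA style temporal difference target, and training stops after a fixed number of rounds instead of a preset training target. None of these choices come from the source.

```python
import math
import random
from collections import defaultdict

# Discrete (hot water variation, steam variation) pairs; an assumption.
ACTIONS = [(dh, ds) for dh in (-1, 0, 1) for ds in (-1, 0, 1)]


class TabularActorCritic:
    def __init__(self, actor_lr=0.1, critic_lr=0.2, gamma=0.9):
        self.actor_lr, self.critic_lr, self.gamma = actor_lr, critic_lr, gamma
        self.logits = defaultdict(lambda: [0.0] * len(ACTIONS))  # policy
        self.q = defaultdict(lambda: [0.0] * len(ACTIONS))       # Q(s, a)

    def distribution(self, state):
        """Softmax probability of each variation pair in this state."""
        logits = self.logits[state]
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def update_actor(self, state):
        """Policy gradient step on sum_a pi(a|s) * Q(s, a)."""
        probs, q = self.distribution(state), self.q[state]
        baseline = sum(p * v for p, v in zip(probs, q))
        for i in range(len(ACTIONS)):
            # d/d logit_i of the expected reward under a softmax policy
            self.logits[state][i] += self.actor_lr * probs[i] * (q[i] - baseline)

    def sample(self, state, rng):
        """Draw the actual variation pair from the distribution."""
        return rng.choices(range(len(ACTIONS)), weights=self.distribution(state))[0]

    def update_critic(self, state, action, reward, next_state):
        """Temporal difference update of the state-action value table."""
        probs = self.distribution(next_state)
        expected_next = sum(p * v for p, v in zip(probs, self.q[next_state]))
        td_error = reward + self.gamma * expected_next - self.q[state][action]
        self.q[state][action] += self.critic_lr * td_error
        return td_error


def toy_step(state, action):
    """Placeholder simulator: openings are integer levels in 0..4."""
    hot, steam = state
    d_hot, d_steam = ACTIONS[action]
    hot = min(4, max(0, hot + d_hot))
    steam = min(4, max(0, steam + d_steam))
    reward = -float((4 - hot) + steam)  # placeholder, not the patent's reward
    return (hot, steam), reward


def train(rounds=200, steps=20, seed=0):
    rng = random.Random(seed)
    agent = TabularActorCritic()
    for _ in range(rounds):
        state = (0, 4)  # hot water closed, steam fully open
        for _ in range(steps):
            agent.update_actor(state)            # distribution + expected rewards
            action = agent.sample(state, rng)    # actual change amounts
            next_state, reward = toy_step(state, action)
            agent.update_critic(state, action, reward, next_state)
            state = next_state
    return agent
```

In a realistic setting the tables would be replaced by function approximators over the observation sequences, and `toy_step` by the simulation environment model together with the multi-index reward function.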

7. The method of claim 6, wherein the multi-index reward function comprises an observation index safety reward function, an observation index stability reward function, and a control index reward function, the observation index comprising a sample tower kettle temperature value, a sample purity value, a first sample opening value, a second sample opening value, and a sample liquid flow value, the control index comprising a first actual opening change amount and a second actual opening change amount.
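To make the three-part structure of claim 7 concrete, here is a minimal Python sketch of a multi-index reward. The claim names the three component functions but not their form, so the specific penalties (distance outside a safe band, mean absolute step between consecutive observations, magnitude of the opening changes), the weights, and all function and key names are assumptions.

```python
from typing import Dict, Sequence, Tuple


def safety_reward(value: float, low: float, high: float) -> float:
    """0 inside the assumed safe band [low, high], negative distance outside."""
    if value < low:
        return value - low
    if value > high:
        return high - value
    return 0.0


def stability_reward(sequence: Sequence[float]) -> float:
    """Negative mean absolute change between consecutive observations."""
    steps = [abs(b - a) for a, b in zip(sequence, sequence[1:])]
    return -sum(steps) / len(steps) if steps else 0.0


def control_reward(first_change: float, second_change: float) -> float:
    """Penalise large opening changes on either reboiler."""
    return -(abs(first_change) + abs(second_change))


def multi_index_reward(
    observations: Dict[str, Sequence[float]],
    limits: Dict[str, Tuple[float, float]],
    first_change: float,
    second_change: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Weighted sum of the safety, stability and control components."""
    safety = sum(
        safety_reward(seq[-1], *limits[name]) for name, seq in observations.items()
    )
    stability = sum(stability_reward(seq) for seq in observations.values())
    control = control_reward(first_change, second_change)
    w_safety, w_stability, w_control = weights
    return w_safety * safety + w_stability * stability + w_control * control
```

For example, with a kettle temperature sequence `[100.0, 102.0, 104.0]` and an assumed safe band of `(90.0, 103.0)`, the safety component contributes -1.0 and the stability component -2.0 for that index.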

8. A reboiler switching control apparatus comprising:

9. An electronic device, the electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the reboiler switching control method of any one of claims 1-7.

10. A computer readable storage medium storing computer instructions for causing a processor to implement the reboiler switching control method of any one of claims 1-7 when executed.