CN114507881A - Model-free self-learning stability control method for electrolyte temperature in zinc electrolysis process - Google Patents

Model-free self-learning stability control method for electrolyte temperature in zinc electrolysis process

Info

Publication number
CN114507881A
Authority
CN
China
Prior art keywords
temperature
electrolyte
state
model
action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210277803.6A
Other languages
Chinese (zh)
Other versions
CN114507881B (en)
Inventor
阳春华
刘天豪
周灿
朱红求
李勇刚
李繁飙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202210277803.6A
Publication of CN114507881A
Application granted
Publication of CN114507881B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C25: ELECTROLYTIC OR ELECTROPHORETIC PROCESSES; APPARATUS THEREFOR
    • C25C: PROCESSES FOR THE ELECTROLYTIC PRODUCTION, RECOVERY OR REFINING OF METALS; APPARATUS THEREFOR
    • C25C1/00: Electrolytic production, recovery or refining of metals by electrolysis of solutions
    • C25C1/16: Electrolytic production, recovery or refining of metals by electrolysis of solutions of zinc, cadmium or mercury
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70: Machine learning, data mining or chemometrics

Landscapes

  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Electrochemistry (AREA)
  • Materials Engineering (AREA)
  • Metallurgy (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Electrolytic Production Of Non-Metals, Compounds, Apparatuses Therefor (AREA)
  • Electrolytic Production Of Metals (AREA)

Abstract

The embodiment of the disclosure provides a model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process, which belongs to the technical field of chemistry and specifically comprises the following steps: establishing an environment interaction model, a reward mechanism and a Q table for a Q-learning algorithm, setting the target interval within which the electrolyte temperature is to be controlled, and initializing the parameters required for updating the Q table; defining the state space and the action space of the electrolyte in the zinc electrolysis process; defining the Q table, wherein the horizontal axis represents the selectable actions and the vertical axis represents the states of the state space; updating the Q table according to data generated by interaction between the agent and the environment interaction model; and obtaining, from the updated Q table, a stability control model for the electrolyte temperature in the zinc electrolysis process, and outputting, according to the stability control model, the optimal cooling tower fan frequency for the current electrolyte state. With the disclosed scheme, the temperature of the zinc electrolyte is automatically kept within the range required by the process, and the control efficiency, adaptability and stability of zinc electrolysis production are improved.

Description

Model-free self-learning stability control method for electrolyte temperature in zinc electrolysis process
Technical Field
The embodiment of the disclosure relates to the technical field of chemistry, in particular to a model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process.
Background
Zinc electrolysis is an important step of zinc hydrometallurgy. It is the process in which, under the action of direct current, zinc ions in the electrolyte move from the anode plate to the cathode plate and are deposited on the cathode plate as elemental zinc. Because it consumes a large amount of electric energy, it largely determines the production cost of a zinc smelter. The electrolyte temperature is one of the key parameters of the electrolytic process and strongly affects the efficiency and quality of the deposited zinc. To ensure efficient zinc deposition, the outlet temperature of the electrolyte must stay within a suitable range. An excessively high electrolyte temperature lowers the hydrogen evolution overpotential, which intensifies re-dissolution of the deposited zinc and reduces current efficiency; an excessively low electrolyte temperature also reduces zinc deposition efficiency and degrades the electrolysis.
The electrolyte circulation system is characterized by a large circulating flow and uninterrupted circulation. For a zinc smelter with an annual output of 300,000 tons, the volume of electrolyte in the electrolytic cells reaches thousands of cubic meters and the solution circulates 24 hours a day, so the system is a typical large-scale circulating flow system. Moreover, because the electrolyte is held in an open-air environment, the ambient temperature strongly affects the electrolyte temperature; the pronounced day-night and seasonal changes in ambient temperature make accurate control of the electrolyte temperature across different time periods very difficult.
In order to keep the temperature of the electrolyte within a proper range, the current practice of enterprises is to install a mechanical ventilation type cooling tower in an electrolyte circulation system, and cool the temperature of the electrolyte by blowing low-temperature air outside a workshop into an electrolyte circulation pipeline. According to whether the frequency conversion device is installed on the cooling tower fan or not, the control strategy of the cooling tower fan can be divided into a frequency conversion strategy and a non-frequency conversion strategy. For the variable-frequency cooling tower, the cooling performance of the cooling tower can be adjusted by adjusting the frequency of the fan; for a non-variable frequency cooling tower, the cooling performance of the cooling tower can be adjusted by adjusting the opening time of a fan of the cooling tower.
According to the industrial standard, the suitable range of the electrolyte temperature is 35 ℃ to 40 ℃. Taking a variable-frequency cooling tower as an example, in a high-temperature environment the fan frequency of the cooling tower needs to be increased to raise its cooling performance; in a low-temperature environment the fan frequency needs to be reduced to lower its cooling performance. If the cooling tower has no frequency conversion device, then, analogously, its operating time is increased in high-temperature seasons and reduced in low-temperature seasons, thereby keeping the electrolyte temperature within the suitable range. The present method targets the variable-frequency cooling tower: it adjusts the cooling performance of the cooling tower by changing the fan frequency and thereby regulates the electrolyte temperature.
At present, the cooling tower control strategy of zinc electrolysis enterprises relies on manual experience. When the day-night temperature difference is large, the fan frequency of the cooling tower must be adjusted frequently by hand, which is labor-intensive. Because the electrolyte temperature is measured manually, the temperature feedback lags, so electrolyte temperature control based on manual experience suffers from serious lag and instability.
Therefore, a model-free self-learning stability control method for the electrolyte temperature in the zinc electrolysis process is needed, which can automatically and effectively realize the stability control of the electrolyte temperature in the zinc electrolysis process all the time, ensure that the electrolyte temperature is always in the process requirement range, ensure that zinc ions in the electrolyte can be efficiently separated out at the cathode, and reduce the labor intensity of workers.
Disclosure of Invention
In view of this, the disclosed embodiments provide a model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process, which at least partially solves the problems of poor automaticity, adaptability, control efficiency and stability in the prior art.
The embodiment of the disclosure provides a model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process, which comprises the following steps:
step 1, establishing an environment interaction model, a reward mechanism and a Q table for a Q-learning algorithm, setting the target interval within which the electrolyte temperature is to be controlled, and initializing the parameters required for updating the Q table, wherein the parameters comprise a discount factor, a learning rate and a random factor;
step 2, defining a state space and an action space of the electrolyte in the zinc electrolysis process, wherein the action space is the fan frequency of a cooling tower;
step 3, defining the Q table, wherein the horizontal axis represents the selectable actions and the vertical axis represents the states of the state space, wherein the state space comprises four variables, namely ambient dry-bulb temperature, ambient wet-bulb temperature, ambient relative humidity and electrolyte temperature, and the number of states is the number of combinations of the four variables;
step 4, updating the Q table according to data generated by interaction between the agent and the environment interaction model;
step 5, obtaining, from the updated Q table, a stability control model for the electrolyte temperature in the zinc electrolysis process, and outputting, according to the stability control model, the optimal cooling tower fan frequency for the current electrolyte state.
According to a specific implementation manner of the embodiment of the disclosure, the environment interaction model is built with a BP neural network. The input parameters are the fan frequency $f_{tower}$ at time t, the ambient dry-bulb temperature $T_{dry}$ at time t, the ambient wet-bulb temperature $T_{wet}$ at time t, the ambient relative humidity RH at time t and the electrolyte temperature $T_{elec}$ at time t; the output parameter is the electrolyte temperature $T_{elec}$ at time t+1.
According to a specific implementation manner of the embodiment of the present disclosure, before the step 4, the method further includes:
defining the upper limit and the lower limit of the controlled temperature as a preset interval;
and setting the control target of the electrolyte temperature in the interaction process of the intelligent agent and the environment interaction model in the preset interval.
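According to a specific implementation manner of the embodiment of the present disclosure, the reward mechanism is calculated as
$$r = \begin{cases} 0, & T_{min} \le T' \le T_{max} \\ -1, & \text{otherwise} \end{cases}$$
where T' is the electrolyte temperature at the next moment.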
According to a specific implementation manner of the embodiment of the present disclosure, the step 4 specifically includes:
step 4.1, setting an initial state s, with step = 0;
step 4.2, action selection: drawing a random value for the random factor rand; if rand > 0, selecting the action a with the maximum Q value in state s; if rand = 0, randomly selecting one state from all states and selecting the action a with the maximum Q value in that state;
step 4.3, the agent inputs the action a into the environment interaction model env to obtain the new state s' at the next moment;
step 4.4, examining all actions in s' and taking the action with the maximum Q value in s' as a';
step 4.5, judging whether the iteration is qualified: obtaining the electrolyte temperature T' at the next moment from the new state s'; if $T_{min} \le T' \le T_{max}$ is satisfied, the iteration is qualified, the reward is 0, and the method continues with step 4.6; if it is not satisfied, the iteration fails, the reward is -1, and the method returns to step 4.1;
step 4.6, updating the Q value of the current action a with the following formula:
$$Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma Q(s',a') - Q(s,a) \right]$$
step 4.7, setting s = s', a = a' and step = step + 1, returning to step 4.1 and continuing the loop, wherein steps 4.3 to 4.7 are defined as one step, and the interaction between the agent and the environment interaction model env is a process of updating the Q table $step_{max}$ times;
step 4.8, when step has accumulated to the user-specified value $step_{max}$, the Q table update is finished.
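The update of step 4.6 can be illustrated with a minimal Python sketch. It assumes the standard tabular Q-learning rule, with a' taken as the greedy action in s' as in step 4.4; the function name `q_update`, the table size and all numeric values are illustrative assumptions and do not come from the disclosure.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Apply one tabular Q-learning update in place and return the new Q[s, a]."""
    a_next = int(np.argmax(Q[s_next]))           # step 4.4: greedy action a' in s'
    td_target = r + gamma * Q[s_next, a_next]    # reward plus discounted next value
    Q[s, a] += alpha * (td_target - Q[s, a])     # step 4.6: move Q(s, a) toward the target
    return Q[s, a]

# Hypothetical 3-state, 2-action table with illustrative numbers.
Q = np.zeros((3, 2))
Q[1] = [-0.2, -0.5]
new_value = q_update(Q, s=0, a=1, r=0.0, s_next=1, alpha=0.1, gamma=0.9)
```

With these numbers the target is 0 + 0.9 × (-0.2) = -0.18, so Q(0, 1) moves from 0 to 0.1 × (-0.18) = -0.018.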
The model-free self-learning stability control scheme for the electrolyte temperature in the zinc electrolysis process in the embodiment of the disclosure comprises the following steps: step 1, establishing an environment interaction model, a reward mechanism and a Q table for a Q-learning algorithm, setting the target interval within which the electrolyte temperature is to be controlled, and initializing the parameters required for updating the Q table, wherein the parameters comprise a discount factor, a learning rate and a random factor; step 2, defining a state space and an action space of the electrolyte in the zinc electrolysis process, wherein the action space is the fan frequency of a cooling tower; step 3, defining the Q table, wherein the horizontal axis represents the selectable actions and the vertical axis represents the states of the state space, wherein the state space comprises four variables, namely ambient dry-bulb temperature, ambient wet-bulb temperature, ambient relative humidity and electrolyte temperature, and the number of states is the number of combinations of the four variables; step 4, updating the Q table according to data generated by interaction between the agent and the environment interaction model; and step 5, obtaining, from the updated Q table, a stability control model for the electrolyte temperature in the zinc electrolysis process, and outputting, according to the stability control model, the optimal cooling tower fan frequency for the current electrolyte state.
The beneficial effects of the embodiment of the disclosure are as follows: the scheme takes the electrolyte temperature, the ambient dry-bulb temperature, the ambient wet-bulb temperature and the ambient relative humidity as the state space of the electrolyte temperature in the zinc electrolysis process, takes the cooling tower fan frequency as the action space, updates the Q table according to the data generated by the interaction between the agent and the environment interaction model, and finally obtains a stability control model for the electrolyte temperature in the zinc electrolysis process. This solves the problem of unstable electrolyte temperature control in the prior art, which is caused by the large, uninterrupted circulation flow of the electrolyte and the changeable ambient temperature, so that the temperature of the zinc electrolyte always stays within the range required by the process and the stability of zinc electrolysis production is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an electrolyte environment interaction model based on BP network fitting according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the interaction between the agent and the environment interaction model according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of the model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process according to a first embodiment of the present disclosure;
FIG. 5 is a flow chart of the model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process according to a second embodiment of the present disclosure;
FIG. 6 is a schematic diagram comparing the effect of the method with that of manual experience in the second embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the disclosure provides a model-free self-learning stability control method for electrolyte temperature in a zinc electrolysis process, which can be applied to the electrolyte temperature stability control process in the zinc electrolysis process in chemical and metallurgical scenes.
Referring to fig. 1, a schematic flow chart of a model-free self-learning stabilization method for electrolyte temperature in a zinc electrolysis process according to an embodiment of the present disclosure is provided. As shown in fig. 1, the method mainly comprises the following steps:
step 1, establishing an environment interaction model, a reward mechanism and a Q table for a Q-learning algorithm, setting the target interval within which the electrolyte temperature is to be controlled, and initializing the parameters required for updating the Q table, wherein the parameters comprise a discount factor, a learning rate and a random factor;
optionally, as shown in fig. 2, the environment interaction model may be built with a BP neural network; the input parameters are the fan frequency $f_{tower}$ at time t, the ambient dry-bulb temperature $T_{dry}$ at time t, the ambient wet-bulb temperature $T_{wet}$ at time t, the ambient relative humidity RH at time t and the electrolyte temperature $T_{elec}$ at time t, and the output parameter is the electrolyte temperature $T_{elec}$ at time t+1.
Optionally, the reward mechanism is calculated as
$$r = \begin{cases} 0, & T_{min} \le T' \le T_{max} \\ -1, & \text{otherwise} \end{cases}$$
In specific implementation, the environment interaction model env and the reward mechanism r used by the Q-learning control algorithm are established, the target interval $C = [T_{min}, T_{max}]$ within which the electrolyte temperature is to be controlled is set, and the parameters required for updating the Q table, such as the discount factor γ, the learning rate α and the random factor rand, are initialized. Specifically, the environment interaction model env is built with a BP neural network. The input parameters are: the fan frequency $f_{tower}$ at time t, the ambient dry-bulb temperature $T_{dry}$ at time t, the ambient wet-bulb temperature $T_{wet}$ at time t, the ambient relative humidity RH at time t and the electrolyte temperature $T_{elec}$ at time t. The output parameter is: the electrolyte temperature $T_{elec}$ at time t+1.
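The input and output interface of env described above can be sketched in Python as a small feedforward network. This is only an illustration under stated assumptions: the single hidden layer, its width, the tanh activation and the names `BPEnvModel` and `predict` are hypothetical, and the weights are untrained placeholders, so the returned number carries no physical meaning until the network has been fitted by backpropagation on recorded plant data.

```python
import numpy as np

class BPEnvModel:
    """One-hidden-layer feedforward network: 5 inputs -> hidden layer -> 1 output."""

    def __init__(self, n_hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # Untrained placeholder weights (assumption); a real model would be
        # fitted by backpropagation on logged plant data.
        self.W1 = rng.normal(scale=0.1, size=(5, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
        self.b2 = np.zeros(1)

    def predict(self, f_tower, t_dry, t_wet, rh, t_elec):
        """Return the predicted electrolyte temperature at time t+1."""
        x = np.array([f_tower, t_dry, t_wet, rh, t_elec], dtype=float)
        h = np.tanh(x @ self.W1 + self.b1)   # hidden layer
        y = h @ self.W2 + self.b2            # linear output layer
        return float(y[0])

env = BPEnvModel()
t_next = env.predict(f_tower=30.0, t_dry=25.0, t_wet=20.0, rh=0.6, t_elec=38.0)
```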
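The reward mechanism r is calculated as follows:
$$r = \begin{cases} 0, & T_{min} \le T' \le T_{max} \\ -1, & \text{otherwise} \end{cases}$$
where T' is the electrolyte temperature at the next moment.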
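As a minimal sketch, the reward can be written as a Python function that returns 0 inside the target interval and -1 outside it, following the qualification test of step 4.5. The function name `reward` is hypothetical, and the default limits of 35 ℃ and 40 ℃ are taken from the industrial range quoted in the background section as an assumed target interval.

```python
def reward(t_next, t_min=35.0, t_max=40.0):
    """Return 0 if the next electrolyte temperature lies in [t_min, t_max], else -1."""
    return 0 if t_min <= t_next <= t_max else -1

in_range = reward(37.2)     # inside the interval
too_hot = reward(40.1)      # above the upper limit
```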
step 2, defining a state space and an action space of electrolyte in the zinc electrolysis process, wherein the action space is the fan frequency of a cooling tower;
in specific implementation, the state space and the action space of the electrolyte in the zinc electrolysis process need to be defined, wherein the state space comprises four variables, namely the ambient dry-bulb temperature $T_{dry}$, the ambient wet-bulb temperature $T_{wet}$, the ambient relative humidity RH and the electrolyte temperature $T_{elec}$, and the action space is the cooling tower fan frequency $f_{tower}$.
Step 3, defining the Q table, wherein the horizontal axis represents the selectable actions and the vertical axis represents the states of the state space, wherein the state space comprises four variables, namely ambient dry-bulb temperature, ambient wet-bulb temperature, ambient relative humidity and electrolyte temperature, and the number of states is the number of combinations of the four variables;
in one embodiment, the Q table is an m × n matrix, where the horizontal axis represents the selectable actions (cooling tower fan frequencies) $[a_1, a_2, a_3, ..., a_n]$, n being the number of selectable actions, and the vertical axis represents the different states $[S_1, S_2, S_3, ..., S_m]$. The number m depends on the number of combinations of the four variables ambient dry-bulb temperature $T_{dry}$, ambient wet-bulb temperature $T_{wet}$, ambient relative humidity RH and electrolyte temperature $T_{elec}$: the ambient dry-bulb temperature is divided into $m_1$ cases, the ambient wet-bulb temperature into $m_2$ cases, the relative humidity into $m_3$ cases and the electrolyte temperature into $m_4$ cases, and the last state is the one in which the electrolyte is out of specification, so the number of states is $m = m_1 \times m_2 \times m_3 \times m_4 + 1$.
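The state count above can be made concrete with a short Python sketch that maps the four measured variables to a row of the Q table. The bin edges, the number of actions and the function name `state_index` are hypothetical choices for illustration, since the text does not give the discretization; only the formula for m and the extra out-of-specification state come from the description.

```python
import numpy as np

# Hypothetical bin edges (assumption); the discretization is not specified in the text.
DRY_EDGES = np.array([10.0, 20.0, 30.0])     # m1 = 4 dry-bulb cases
WET_EDGES = np.array([10.0, 20.0])           # m2 = 3 wet-bulb cases
RH_EDGES = np.array([0.5])                   # m3 = 2 humidity cases
T_MIN, T_MAX, M4 = 35.0, 40.0, 5             # m4 = 5 electrolyte cases inside C

M1, M2, M3 = len(DRY_EDGES) + 1, len(WET_EDGES) + 1, len(RH_EDGES) + 1
N_STATES = M1 * M2 * M3 * M4 + 1             # m = m1 * m2 * m3 * m4 + 1
OUT_OF_SPEC = N_STATES - 1                   # last row: electrolyte out of specification

def state_index(t_dry, t_wet, rh, t_elec):
    """Map the four measured variables to a row index of the Q table."""
    if not (T_MIN <= t_elec <= T_MAX):
        return OUT_OF_SPEC
    i1 = int(np.searchsorted(DRY_EDGES, t_dry, side="right"))
    i2 = int(np.searchsorted(WET_EDGES, t_wet, side="right"))
    i3 = int(np.searchsorted(RH_EDGES, rh, side="right"))
    i4 = min(int((t_elec - T_MIN) / (T_MAX - T_MIN) * M4), M4 - 1)
    return int(np.ravel_multi_index((i1, i2, i3, i4), (M1, M2, M3, M4)))

n_actions = 6                                # hypothetical number of fan frequencies
Q = np.zeros((N_STATES, n_actions))          # m x n table, all zeros initially
```

With these assumed bins, m = 4 × 3 × 2 × 5 + 1 = 121, so the table has 121 rows and 6 columns.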
Step 4, updating the Q table according to data generated by interaction of the agent and the environment interaction model;
optionally, before step 4, the method further includes:
defining the upper limit and the lower limit of the controlled temperature as a preset interval;
and setting the control target of the electrolyte temperature in the interaction process of the intelligent agent and the environment interaction model in the preset interval.
Further, the step 4 specifically includes:
step 4.1, setting an initial state s, with step = 0;
step 4.2, action selection: drawing a random value for the random factor rand; if rand > 0, selecting the action a with the maximum Q value in state s; if rand = 0, randomly selecting one state from all states and selecting the action a with the maximum Q value in that state;
step 4.3, the agent inputs the action a into the environment interaction model env to obtain the new state s' at the next moment;
step 4.4, examining all actions in s' and taking the action with the maximum Q value in s' as a';
step 4.5, judging whether the iteration is qualified: obtaining the electrolyte temperature T' at the next moment from the new state s'; if $T_{min} \le T' \le T_{max}$ is satisfied, the iteration is qualified, the reward is 0, and the method continues with step 4.6; if it is not satisfied, the iteration fails, the reward is -1, and the method returns to step 4.1;
step 4.6, updating the Q value of the current action a with the following formula:
$$Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma Q(s',a') - Q(s,a) \right]$$
step 4.7, setting s = s', a = a' and step = step + 1, returning to step 4.1 and continuing the loop, wherein steps 4.3 to 4.7 are defined as one step, and the interaction between the agent and the environment interaction model env is a process of updating the Q table $step_{max}$ times;
step 4.8, when step has accumulated to the user-specified value $step_{max}$, the Q table update is finished.
In specific implementation, as shown in fig. 3, the agent interacts continuously with the zinc electrolyte environment interaction model env to generate data $[s_t, a_t, r_t, s_{t+1}]$, where $s_t$ is the zinc electrolysis state at time t, i.e. an element of the state space above, $a_t$ is the cooling tower fan frequency at time t, i.e. an element of the action space above, $r_t$ is the reward at time t, and $s_{t+1}$ is the zinc electrolysis state at time t+1. The agent takes the reward $r_t$ and the state $s_t$ at the current time t as input and outputs the action $a_{t+1}$ for the next moment; the environment interaction model env takes the action $a_t$ and the state $s_t$ at the current time t as input and outputs the state $s_{t+1}$ at the next moment. The training process of the agent is in fact the process in which the agent interacts continuously with the environment interaction model to generate data $[s_t, a_t, r_t, s_{t+1}]$ and thereby updates the contents of the Q table. Once the training requirement is met, the contents of the Q table are fixed and the agent has learned an electrolyte temperature control method that meets the requirement. Specifically, the agent itself does not need to be modeled; training is the process in which the agent and the environment interaction model env interact continuously and the Q table is corrected, and this process requires no manual control.
Meanwhile, before the agent interacts with the zinc electrolyte environment interaction model env to generate the data $[s_t, a_t, r_t, s_{t+1}]$ for training the Q table, the upper and lower limits of the controlled temperature need to be defined, that is, the required electrolyte temperature must lie within a preset interval, defined as $C = [T_{min}, T_{max}]$. During the interaction between the agent and the zinc electrolyte environment interaction model env, the control target is to keep the electrolyte temperature within the interval C at all times, that is, not lower than $T_{min}$ and not higher than $T_{max}$.
The goal of the Q-learning algorithm is to obtain the value function of each state-action pair, denoted Q(s, a), where s is the state and a is the action. The learning target is the m × n Q values of the Q table. In the learning process, the agent interacts continuously with the environment model env to obtain the environment reward r, which forms the Q value of the corresponding state-action pair in the Q table, and the Q values in the table are modified iteratively through the Q value update rule. The update rule of the Q value is as follows:
(1) since the Q table is initialized, the horizontal axis of Q represents n candidate operations, and the vertical axis represents m states, the table has m × n Q values, and all Q values are 0 in the initial stage.
(2) Defining a random factor Rand with a random number value between 0 and 1, wherein the Rand is updated every time step (7) is executed.
(3) Setting a discount factor gamma, wherein the value of the discount factor gamma ranges from 0 to 1, and specifying a specific value by a user.
(4) Setting a learning rate alpha, wherein the value is between 0 and 1, and specifying a specific value by a user.
(5) Setting the maximum step of the iteration stepmaxThe specific value is specified by the user.
(6) The calculation mode of the reward r is set.
(7) An initial state s is set, step being 0.
(8) Selecting actions: if rand >0, selecting action a with the maximum Q value in the state s; and if rand is 0, randomly selecting one state from all the states, and selecting the action a with the maximum Q value in the state.
(9) And the agent of the agent inputs the action a into the environment interaction model env to obtain a new state s' at the next moment.
(10) All actions in s ' are checked to see which action has the largest Q value under s ', and the action is assumed to be a '.
(11) Judging whether the iteration process is qualified: obtaining the temperature T 'of the electrolyte at the next moment according to the new state s' at the next moment, if T is metmin≤T'≤TmaxIf yes, the iteration process is qualified, the reward is 0, and the step (12) is continued; otherwise, if T is not satisfiedmin≤T'≤TmaxIf the result is positive, the iteration process fails, the reward is-1, and the step (7) is returned again.
(12) Updating the Q value of the current action a by using the following formula:
(13)
Figure BDA0003556712830000101
(14) Let s = s', a = a' and step = step + 1, then return to step (9) and continue the loop. Steps (9) to (14) are defined as one step, and the interaction between the agent and the environment model env is the process of updating the Q table step_max times.
(15) When step has accumulated to the user-specified value step_max, the Q table update ends.
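The formula image for the Q-value update is not reproduced in this text. For orientation only, the standard tabular Q-learning update, written with the symbols defined in the steps above (learning rate α, discount factor γ, reward r, next state s', greedy next action a'), is given below. That the patent's formula has exactly this form is an assumption.

```latex
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]
```

Under this form, a failed iteration (r = -1) lowers the Q value of the action that caused it, so the greedy selection in step (8) gradually moves away from actions that drive the temperature out of the target interval.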
Step 5: obtain the stability control model for the electrolyte temperature in the zinc electrolysis process from the updated Q table, and output the optimal cooling tower fan frequency for the current electrolyte state according to the stability control model.
In a specific implementation, once the Q table has been updated, the stability control model for the electrolyte temperature in the zinc electrolysis process can be obtained from it, and the optimal cooling tower fan frequency for the current electrolyte state is output according to that model, completing the stability control process.
In the model-free self-learning stability control method for electrolyte temperature in the zinc electrolysis process provided by this embodiment, the electrolyte temperature, the ambient dry bulb temperature, the ambient wet bulb temperature and the ambient relative humidity form the state space, the cooling tower fan frequency is set as the action space, the Q table is updated from the data generated while the agent interacts with the environment interaction model, and the electrolyte temperature stability control model for the zinc electrolysis process is finally obtained.
This solution will be explained below with reference to two embodiments.
Example one
Referring to FIG. 4, the model-free self-learning stability control method for electrolyte temperature in the zinc electrolysis process provided by the first embodiment of the invention comprises the following steps:
Step S101: establish the environment interaction model env and the reward r mechanism used by the Q-learning control algorithm, set the target interval C = [T_min, T_max] within which the electrolyte temperature is to be controlled, and initialize the parameters required for updating the Q table, such as the discount factor γ, the learning rate α and the random factor rand.
Step S102: define the state space and the action space of the electrolyte in the zinc electrolysis process. The state space comprises four variables: the electrolyte temperature T_elec, the ambient dry bulb temperature T_dry, the ambient wet bulb temperature T_wet and the ambient relative humidity RH, i.e. S = [T_dry, T_wet, RH, T_elec]. The action space is the cooling tower fan frequency f_tower.
Step S103: define the Q table, in which the horizontal axis represents the selectable actions and the vertical axis represents the kinds of state. The number of states m depends on the number of combinations of the four variables (ambient dry bulb temperature, ambient wet bulb temperature, ambient relative humidity and electrolyte temperature): the ambient dry bulb temperature is divided into m1 cases, the ambient wet bulb temperature into m2 cases, the relative humidity into m3 cases and the electrolyte temperature into m4 cases. The last state is the one in which the electrolyte is out of specification, so the number of states is m = m1 × m2 × m3 × m4 + 1.
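The state count in Step S103 can be checked with a short Python sketch. It maps one bucket index per variable to a single Q-table row using a mixed-radix encoding and reserves the last row for the out-of-specification state. The function names, the zero-based indexing and the row ordering are illustrative assumptions, not taken from the patent.

```python
def count_states(m1, m2, m3, m4):
    """Number of Q-table rows: every bucket combination plus one failure state."""
    return m1 * m2 * m3 * m4 + 1


def state_index(dry, wet, rh, elec, sizes):
    """Map zero-based bucket indices to a zero-based Q-table row (mixed radix).

    sizes is (m1, m2, m3, m4) in the order dry bulb, wet bulb, RH, electrolyte.
    """
    m1, m2, m3, m4 = sizes
    for value, size in zip((dry, wet, rh, elec), sizes):
        if not 0 <= value < size:
            raise ValueError("bucket index out of range")
    return ((dry * m2 + wet) * m3 + rh) * m4 + elec


def failure_index(sizes):
    """Row reserved for the out-of-specification state (the last row)."""
    m1, m2, m3, m4 = sizes
    return m1 * m2 * m3 * m4


# Illustrative bucket counts only: 3 dry-bulb, 4 wet-bulb, 3 RH, 3 electrolyte levels.
SIZES = (3, 4, 3, 3)
TOTAL_STATES = count_states(*SIZES)
```

With these illustrative sizes the regular states occupy rows 0 to 107 and the failure state occupies row 108, the 109th row.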
Step S104: update the Q table from the data generated by the interaction between the agent and the environment interaction model, thereby obtaining the electrolyte temperature stability control model for the zinc electrolysis process. The update process comprises the following steps:
(1) Set an initial state s, with step = 0.
(2) Select an action: randomly take a value of the random factor rand. If rand > 0, select the action a with the maximum Q value in state s; if rand = 0, randomly select one state from all states and select the action a with the maximum Q value in that state.
(3) The agent inputs the action a into the environment interaction model env to obtain the new state s' at the next moment.
(4) Check all actions in s' to see which has the maximum Q value under s', and denote that action a'.
(5) Judge whether the iteration is qualified: obtain the electrolyte temperature T' at the next moment from the new state s'. If T_min ≤ T' ≤ T_max is satisfied, the iteration is qualified, the reward is 0, and the process continues to step (6); otherwise the iteration fails, the reward is -1, and the process returns to step (1).
(6) Update the Q value of the current action a using the following formula:
(7)
Figure BDA0003556712830000111
(8) Let s = s', a = a' and step = step + 1, then return to step (3) and continue the loop. Steps (3) to (8) are defined as one step, and the interaction between the agent and the environment model env is the process of updating the Q table step_max times.
(9) When step has accumulated to the user-specified value step_max, the Q table update ends.
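The update loop in steps (1) to (9) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the update uses the standard tabular Q-learning rule (the patent's formula image is not reproduced in this text), the update is also applied on a failed iteration before the state is reset, every iteration counts toward step_max, and ties in the greedy choice go to the lowest action index. The names `q_learning`, `env_step` and `toy_env` are hypothetical.

```python
import random


def q_learning(env_step, initial_state, failure_state, n_states, n_actions,
               alpha=0.5, gamma=0.9, step_max=1000, rng=None):
    """Tabular Q-learning loop following steps (1) to (9).

    env_step(state, action) must return the index of the next state.
    """
    rng = rng or random.Random(0)
    q = [[0.0] * n_actions for _ in range(n_states)]   # all Q values start at 0
    s = initial_state                                   # step (1)
    for _ in range(step_max):                           # step (9): stop at step_max
        # Step (2): greedy in s when rand > 0, greedy in a random state when rand == 0.
        rand = rng.random()
        ref = s if rand > 0 else rng.randrange(n_states)
        a = max(range(n_actions), key=lambda j: q[ref][j])
        s_next = env_step(s, a)                         # step (3)
        best_next = max(q[s_next])                      # step (4): value of a'
        failed = s_next == failure_state                # step (5)
        r = -1.0 if failed else 0.0
        # Step (6): standard Q-learning update (assumed form).
        q[s][a] += alpha * (r + gamma * best_next - q[s][a])
        # Step (8), or the reset of step (5) after a failure.
        s = initial_state if failed else s_next
    return q


def toy_env(state, action):
    """Hypothetical environment: action 0 always fails (state 2), action 1 is safe."""
    return 2 if action == 0 else 0


Q_TOY = q_learning(toy_env, initial_state=0, failure_state=2,
                   n_states=3, n_actions=2, step_max=50)
```

In the toy run the failing action receives a negative Q value after its first trial, so the greedy choice in state 0 switches to the safe action for the remaining steps.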
Step S105: obtain the electrolyte temperature stability control model for the zinc electrolysis process from the updated Q table, and output the optimal cooling tower fan frequency corresponding to the current electrolyte state.
In the model-free self-learning stability control method for electrolyte temperature in the zinc electrolysis process provided by the invention, the electrolyte temperature, the ambient dry bulb temperature, the ambient wet bulb temperature and the ambient relative humidity form the state space, the cooling tower fan frequency is set as the action space, the Q table is updated from the data generated while the agent interacts with the environment interaction model, and the electrolyte temperature stability control model for the zinc electrolysis process is finally obtained.
Specifically, the embodiment of the invention introduces the idea of the Q-learning reinforcement learning algorithm. The zinc electrolyte state space is defined as the electrolyte temperature, the ambient dry bulb temperature, the ambient wet bulb temperature and the relative humidity, and the action space is defined as the cooling tower fan frequency. A BP-network-based environment interaction model of the chemical process is established, and data are obtained continuously through interaction between the agent and this model in order to update the Q table. The agent can thus learn a new electrolyte temperature stability control strategy autonomously and without a process model, which keeps the zinc electrolyte temperature within the range required by the process and improves the production stability of the zinc electrolysis process.
Example two
Referring to FIG. 5, the model-free self-learning stability control method for electrolyte temperature in the zinc electrolysis process according to the second embodiment of the invention comprises:
Step S101: establish the environment interaction model, the Q table and the reward mechanism used by the Q-learning control algorithm, set the target interval C = [T_min, T_max] within which the electrolyte temperature is to be controlled, and initialize the parameters required for updating the Q table, such as the discount factor γ, the learning rate α and the random factor rand. The specific contents are as follows:
Establish the environment interaction model env used by the Q-learning algorithm. The model is built with a BP network. Its input parameters are the electrolyte temperature, the ambient dry bulb temperature, the ambient wet bulb temperature, the ambient relative humidity and the cooling tower fan frequency (a value given by the agent), all at time t. The output parameter is the electrolyte temperature at time t + 1. The number of network hidden layers is 20.
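The input and output layout of such an environment model can be sketched in plain Python. The sketch below shows only the forward pass of an untrained network, so its predictions are meaningless until the weights are fitted by backpropagation on plant data. It assumes a single hidden layer with 20 tanh units, which is only one possible reading of the figure of 20 given above; the class name, weight initialization and activation are likewise assumptions.

```python
import math
import random


class BPNetworkSketch:
    """Feed-forward network: 5 inputs -> one hidden layer -> 1 output (untrained)."""

    def __init__(self, n_in=5, n_hidden=20, seed=0):
        rng = random.Random(seed)
        # Small random weights; a real model would learn these by backpropagation.
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def predict(self, t_elec, t_dry, t_wet, rh, f_tower):
        """Return the (untrained) estimate of the electrolyte temperature at t + 1."""
        x = (t_elec, t_dry, t_wet, rh, f_tower)
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2


# Illustrative call with made-up inputs at time t.
NET = BPNetworkSketch()
T_NEXT_ESTIMATE = NET.predict(t_elec=38.0, t_dry=25.0, t_wet=20.0, rh=60.0, f_tower=40.0)
```

In practice the inputs would be normalized before training, since the raw temperature, humidity and frequency values saturate the tanh units.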
Define the reward mechanism: the value of the reward r is determined from the state s_{t+1} output by the environment interaction model env, specifically:
Figure BDA0003556712830000131
Define the target interval C = [T_min, T_max] within which the electrolyte temperature is to be controlled, where T_min = 37 and T_max = 40.
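The reward rule stated in the update steps (reward 0 when the next electrolyte temperature lies inside the target interval, reward -1 otherwise) can be written as a short Python function with the interval limits of this embodiment. The function and constant names are illustrative.

```python
T_MIN = 37.0  # lower limit of the target interval C
T_MAX = 40.0  # upper limit of the target interval C


def reward(t_next, t_min=T_MIN, t_max=T_MAX):
    """Return 0 if the next electrolyte temperature is inside [t_min, t_max], else -1."""
    return 0 if t_min <= t_next <= t_max else -1
```

Both interval limits count as qualified, matching the non-strict inequalities T_min ≤ T' ≤ T_max used in the update steps.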
Step S102: define the state space and the action space of the electrolyte in the zinc electrolysis process. The state space comprises four variables: the electrolyte temperature T_elec, the ambient dry bulb temperature T_dry, the ambient wet bulb temperature T_wet and the ambient relative humidity RH, i.e. S = [T_dry, T_wet, RH, T_elec]. The action space is the cooling tower fan frequency f_tower.
Step S103: establish the Q table. The horizontal axis is the selectable cooling tower fan frequency, divided into 14 values: 24 Hz, 26 Hz, 28 Hz, 30 Hz, 32 Hz, 34 Hz, 36 Hz, 38 Hz, 40 Hz, 42 Hz, 44 Hz, 46 Hz, 48 Hz and 50 Hz. The vertical axis represents the states of the electrolyte state space, formed by combining the different values of four parameters (electrolyte temperature, ambient dry bulb temperature, ambient wet bulb temperature and relative humidity), giving 3 × 3 × 3 × 4 + 1 = 109 states in total. They are:
(1) If the electrolyte temperature exceeds 40 °C or falls below 37 °C, the 109th state is entered, which also indicates that the step has failed; the agent needs to clear all contents of the Q table and restart the updating of the Q values. Here T_elec is the electrolyte temperature and box is the state name of the 109th state.
if (T_elec < 37 || T_elec > 40)
    box = 109;
else
(2) The electrolyte temperature is divided into three levels, 3 cases in total, where T_elec is the electrolyte temperature and T_elec_Bucket is the identifier of the electrolyte temperature state.
Figure BDA0003556712830000132
(3) The ambient dry bulb temperature is divided into three levels, 3 cases in total, where T_dry is the ambient dry bulb temperature and T_dry_Bucket is the identifier of the ambient dry bulb temperature state.
Figure BDA0003556712830000141
(4) The relative humidity is divided into three levels, 3 cases in total, where RH is the ambient relative humidity and RH_Bucket is the identifier of the ambient relative humidity state.
Figure BDA0003556712830000142
(5) The ambient wet bulb temperature is divided into four levels, 4 cases in total, where T_wet is the ambient wet bulb temperature and T_wet_Bucket is the identifier of the ambient wet bulb temperature state.
Figure BDA0003556712830000143
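The level boundaries for the four variables are given in formula images that are not reproduced in this text. The Python sketch below therefore uses placeholder bucket edges, which are hypothetical and are not the patent's values, purely to show how the out-of-specification test and the per-variable bucketing fit together. The names `bucket`, `classify` and `EDGES` are also illustrative.

```python
from bisect import bisect_right

# HYPOTHETICAL interior bucket edges, for illustration only.
EDGES = {
    "elec": [38.0, 39.0],        # 3 electrolyte temperature levels within 37..40
    "dry": [15.0, 25.0],         # 3 ambient dry bulb temperature levels
    "rh": [40.0, 70.0],          # 3 relative humidity levels
    "wet": [10.0, 15.0, 20.0],   # 4 ambient wet bulb temperature levels
}


def bucket(value, edges):
    """Return the zero-based level of value given ascending interior edges."""
    return bisect_right(edges, value)


def classify(t_elec, t_dry, t_wet, rh, edges=EDGES):
    """Return 109 for the failure state, else the four bucket identifiers.

    The tuple order is (T_elec_Bucket, T_dry_Bucket, RH_Bucket, T_wet_Bucket).
    """
    if t_elec < 37 or t_elec > 40:
        return 109  # box = 109: electrolyte temperature out of specification
    return (bucket(t_elec, edges["elec"]), bucket(t_dry, edges["dry"]),
            bucket(rh, edges["rh"]), bucket(t_wet, edges["wet"]))
```

With 3 × 3 × 3 × 4 bucket combinations plus the failure state, this classification yields the 109 states of the Q table.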
Initialize the parameters required for updating the Q table: for example, the discount factor γ = 0.9 and the learning rate α = 0.5. The random factor rand is updated at each step and ranges from 0 to 1.
Step S104: update the Q table from the data generated by the interaction between the agent and the environment interaction model, thereby obtaining the electrolyte temperature stability control model for the zinc electrolysis process. The update process comprises the following steps:
(1) Set an initial state s, with step = 0.
(2) Select an action: randomly take a value of the random factor rand. If rand > 0, select the action a with the maximum Q value in state s; if rand = 0, randomly select one state from all states and select the action a with the maximum Q value in that state.
(3) The agent inputs the action a into the environment interaction model env to obtain the new state s' at the next moment.
(4) Check all actions in s' to see which has the maximum Q value under s', and denote that action a'.
(5) Judge whether the iteration is qualified: obtain the electrolyte temperature T' at the next moment from the new state s'. If T_min ≤ T' ≤ T_max is satisfied, the iteration is qualified, the reward is 0, and the process continues to step (6); otherwise the iteration fails, the reward is -1, and the process returns to step (1).
(6) Update the Q value of the current action a using the following formula:
(7)
Figure BDA0003556712830000151
(8) Let s = s', a = a' and step = step + 1, then return to step (3) and continue the loop. Steps (3) to (8) are defined as one step, and the interaction between the agent and the environment model env is the process of updating the Q table step_max times.
(9) When step has accumulated to the user-specified value step_max, the Q table update ends.
Step S105: obtain the electrolyte temperature stability control model for the zinc electrolysis process from the updated Q table, and output the optimal cooling tower fan frequency corresponding to the current electrolyte state.
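Reading the control output from the updated Q table amounts to a greedy lookup: take the row for the current state and return the fan frequency whose Q value is largest. The Python sketch below uses the 14 frequencies and 109 states of this embodiment; the function name and the zero-based row index are illustrative assumptions.

```python
# The 14 selectable cooling tower fan frequencies: 24, 26, ..., 50 Hz.
FAN_FREQUENCIES_HZ = list(range(24, 51, 2))


def best_fan_frequency(q_table, state_row):
    """Return the fan frequency (Hz) with the largest Q value in the given row."""
    row = q_table[state_row]
    best_action = max(range(len(row)), key=lambda j: row[j])
    return FAN_FREQUENCIES_HZ[best_action]


# Illustrative Q table: 109 rows by 14 columns, with one entry raised by hand.
Q_DEMO = [[0.0] * len(FAN_FREQUENCIES_HZ) for _ in range(109)]
Q_DEMO[5][7] = 1.0
```

When several actions share the largest Q value, this lookup returns the lowest frequency among them, which is a design choice of the sketch.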
Existing electrolyte temperature control relies on manual experience: workers set the cooling tower fan frequency according to the ambient temperature of the day, which is labor-intensive. Moreover, the electrolyte temperature in the electrolysis plant is still measured manually, i.e. offline, so manual temperature control has a certain hysteresis.
Compared with the prior art, the model-free self-learning electrolyte temperature stability control method does not require a dedicated model of the electrolyte temperature control process. It only requires collecting on-site ambient temperature data and electrolyte temperature data, building the BP-network-based environment interaction model, and obtaining the updated Q table through continuous interaction between the agent and that model. This achieves model-free adaptive stability control of the electrolyte temperature, omits the modeling step, and can achieve a good electrolyte temperature stability control effect.
FIG. 6 compares the temperature control effect of on-site manual control with that of the present method. FIG. 6(a) is the electrolyte temperature curve of the electrolysis plant under manual, experience-based control: the horizontal axis is time at 2-hour intervals, the vertical axis is the electrolyte temperature, and the two horizontal lines mark the interval limits of the minimum temperature 37 and the maximum temperature 40. FIG. 6(b) is the electrolyte temperature curve under control by the present method, with the same axes and interval limits. By calculation, the mean temperature under manual control is 37.797 with a standard deviation of 1.16, while the mean temperature under the present method is 38.584 with a standard deviation of 0.545. The temperature curve under the present method meets the process requirement of 37 to 40 and, compared with manual control, has a smaller standard deviation, indicating smoother and more stable temperature control.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present disclosure should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (5)

1. A model-free self-learning stability control method for the temperature of electrolyte in the zinc electrolysis process is characterized by comprising the following steps:
step 1, establishing an environment interaction model, a reward mechanism and a Q table corresponding to a Q learning algorithm, setting a target interval of which the electrolyte temperature needs to be controlled, and initializing parameters needed by updating the Q table, wherein the parameters comprise a discount factor, a learning rate and a random factor;
step 2, defining a state space and an action space of electrolyte in the zinc electrolysis process, wherein the action space is the fan frequency of a cooling tower;
step 3, defining the Q table, wherein the horizontal axis represents optional actions, and the vertical axis represents the types of state spaces, wherein the state spaces comprise four variables of ambient dry bulb temperature, ambient wet bulb temperature, ambient relative humidity and electrolyte temperature, and the number of the types is the combined number of the four variables;
step 4, updating the Q table according to data generated by interaction of the agent and the environment interaction model;
step 5, obtaining a stability control model corresponding to the temperature of the electrolyte in the zinc electrolysis process according to the updated Q table, and outputting the optimal cooling tower fan frequency corresponding to the current electrolyte state according to the stability control model.
2. The method according to claim 1, wherein the environment interaction model is built using a BP neural network, the input parameters are the fan frequency f_tower at time t, the ambient dry bulb temperature T_dry at time t, the ambient wet bulb temperature T_wet at time t, the ambient relative humidity RH at time t and the electrolyte temperature T_elec at time t, and the output parameter is the electrolyte temperature T_elec at time t + 1.
3. The method of claim 1, wherein prior to step 4, the method further comprises:
defining the upper limit and the lower limit of the controlled temperature as a preset interval;
and setting the control target of the electrolyte temperature, during the interaction between the agent and the environment interaction model, to lie within the preset interval.
4. The method of claim 1, wherein the reward mechanism is calculated as
Figure FDA0003556712820000011
5. The method according to claim 1, wherein the step 4 specifically comprises:
step 4.1, setting an initial state s, with step = 0;
step 4.2, selecting an action: randomly taking a value of the random factor rand; if rand > 0, selecting the action a with the maximum Q value in state s; if rand = 0, randomly selecting one state from all states and selecting the action a with the maximum Q value in that state;
step 4.3, the agent inputting the action a into the environment interaction model env to obtain a new state s' at the next moment;
step 4.4, checking all actions in s' and taking the action with the maximum Q value in s' as a';
step 4.5, judging whether the iteration is qualified: obtaining the electrolyte temperature T' at the next moment from the new state s'; if T_min ≤ T' ≤ T_max is satisfied, the iteration is qualified, the reward is 0, and the method continues to step 4.6; if T_min ≤ T' ≤ T_max is not satisfied, the iteration fails, the reward is -1, and the method returns to step 4.1;
step 4.6, updating the Q value of the current action a using the following formula:
Figure FDA0003556712820000021
step 4.7, letting s = s', a = a' and step = step + 1, returning to step 4.3 and continuing the loop, wherein steps 4.3 to 4.7 are defined as one step, and the interaction between the agent and the environment interaction model env is the process of updating the Q table step_max times;
step 4.8, when step has accumulated to the user-specified value step_max, ending the Q table update.
CN202210277803.6A 2022-03-21 2022-03-21 Model-free self-learning stable control method for electrolyte temperature in zinc electrolysis process Active CN114507881B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210277803.6A CN114507881B (en) 2022-03-21 2022-03-21 Model-free self-learning stable control method for electrolyte temperature in zinc electrolysis process

Publications (2)

Publication Number Publication Date
CN114507881A true CN114507881A (en) 2022-05-17
CN114507881B CN114507881B (en) 2024-01-02





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant