CN116819974B - Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning - Google Patents
- Publication number
- CN116819974B (application number CN202311102920.XA)
- Authority
- CN
- China
- Prior art keywords
- water quality
- index
- gate
- state
- value
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- E—FIXED CONSTRUCTIONS
- E03—WATER SUPPLY; SEWERAGE
- E03F—SEWERS; CESSPOOLS
- E03F1/00—Methods, systems, or installations for draining-off sewage or storm water
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/18—Water
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A20/00—Water conservation; Efficient water supply; Efficient water use
- Y02A20/152—Water filtration
Abstract
The invention relates to the technical field of overflow monitoring for urban drainage pipe networks, and in particular to an intelligent drainage method and system for the tail end of a drainage pipe network based on deep reinforcement learning. The method comprises the following steps: S1, a water quality acquisition terminal is installed in advance at the discharge port at the tail end of the drainage pipe network, and real-time water quality data are acquired; S2, the collected real-time water quality data are analyzed by a pre-trained DQN model, and a gate at the discharge port is controlled to open or close according to the analysis result; S3, the real-time water quality data and the gate state are displayed visually. The invention automatically adjusts the opening and closing of the sewage interception gate in real time according to the sewage state, which reduces manual labor and gives managers the real-time state of the discharge port.
Description
Technical Field
The invention relates to the technical field of overflow monitoring for urban drainage pipe networks, and in particular to an intelligent drainage method and system for the tail end of a drainage pipe network based on deep reinforcement learning.
Background
At present, urban drainage pipe networks handle end-of-pipe discharge by interception. However, the interception ratio of a traditional intercepting combined sewer network is only 1, meaning that it is designed for twice the dry-season flow rate. Such a network cannot intercept heavily polluted combined sewage during rainfall, because the design flow of the interception pipe is far below the peak flow during rain. The traditional interception mode can only control the total annual runoff pollution load and has difficulty controlling the pollutants in each individual runoff event; pollutant overflow is especially severe when rainfall intensity is high and the rainfall amount is small. In addition, the interception mode requires manual operation: delayed operation easily leads to pollutant overflow, the work is labor-intensive, and operating at the river during the rainy season is hazardous.
Disclosure of Invention
In view of the above, the invention provides an intelligent drainage method and system for the tail end of a drainage pipe network based on deep reinforcement learning, which automatically adjusts the opening and closing of a sewage interception gate in real time according to the sewage state, reduces manual labor, and provides managers with the real-time state of the discharge port.
To achieve the above purpose, the invention adopts the following technical solution:
In a first aspect, the invention provides an intelligent drainage method for the tail end of a drainage pipe network based on deep reinforcement learning, comprising the following steps:
S1, a water quality acquisition terminal is installed in advance at the discharge port at the tail end of the drainage pipe network, and real-time water quality data are acquired;
S2, the collected real-time water quality data are analyzed by a pre-trained DQN model, and a gate at the discharge port is controlled to open or close according to the analysis result;
S3, the real-time water quality data and the gate state are displayed visually.
Further, the water quality data includes: ammonia nitrogen content, total phosphorus content, COD data and TDS data.
Further, in S2, the training process for the DQN model includes:
S201: initialize the weight $\theta$ of the Q neural network of the DQN model, the iteration-count threshold, the experience pool, and the iteration count;
S202: read the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establish the initial state value $s_t$ of the discharge port at the tail end of the current drainage pipe network;
S203: evaluate the current state value $s_t$ as follows: if any index is greater than or equal to 90% of the emission standard value for that index, continue to step S204; otherwise, repeat steps S202-S203;
S204: calculate a comprehensive pollution index; if the comprehensive pollution index is higher than or equal to the emission standard, take the Q value under that index as the optimal threshold for controlling gate opening, set the gate action $a_t$ to open, and continue to step S205; if the comprehensive pollution index is lower than the emission standard, set the gate action $a_t$ to close and continue with steps S202-S204;
S205: collect water quality data at the discharge port at the tail end of the drainage pipe network at the next time step to obtain the state $s_{t+1}$ and the feedback value $r_t$, and store $(s_t, a_t, r_t, s_{t+1})$ in the experience pool;
S206: determine whether the experience pool is full; if not, repeat S202-S205; if it is full, train the Q neural network, repeat S202-S206, and update the weight $\theta$ of the Q neural network until a preset training target is met.
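The sample-collection part of this loop (S202, S205 and the fullness check of S206) can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the callables `read_state`, `choose_action` and `reward_fn` are hypothetical stand-ins for the sensor read, the gate decision and the feedback calculation, and the 90% screening of S203 and the index check of S204 are deliberately left out.

```python
def collect_until_full(read_state, choose_action, reward_fn, capacity):
    """Gather (s_t, a_t, r_t, s_{t+1}) tuples until the pool is full.

    All three callables are hypothetical placeholders:
      read_state()            -> current water quality state (S202 / S205)
      choose_action(state)    -> gate action, e.g. 0 = close, 1 = open
      reward_fn(s, a, s_next) -> feedback value r_t
    """
    pool = []
    state = read_state()                      # S202: read current state
    while len(pool) < capacity:               # S206: is the pool full yet?
        action = choose_action(state)
        next_state = read_state()             # S205: state at next time step
        reward = reward_fn(state, action, next_state)
        pool.append((state, action, reward, next_state))
        state = next_state
    return pool


# Usage with deterministic stand-ins for the sensor and the policy.
counter = iter(range(100))
demo_pool = collect_until_full(
    read_state=lambda: next(counter),
    choose_action=lambda s: s % 2,
    reward_fn=lambda s, a, s_next: 1.0,
    capacity=3,
)
```

Once the pool is full, training of the Q neural network would begin; that part is not shown here.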
Further, in S206, the weight $\theta$ is updated by applying a gradient descent method to the loss function, according to the Q value calculated by the DQN algorithm;
the loss function is: $L_i(\theta_i) = E_{s,a\sim\rho(s,a)}\left[(y_i - Q(s,a;\theta_i))^2\right]$;
wherein $Q(s,a;\theta_i)$ represents the estimated value of $Q(s,a)$; $E$ represents the expectation; $y_i$ represents $Q(s,a)$; $i$ is the number of iterations; $\theta_i$ represents the weight of the Q neural network at the $i$-th iteration; $s$ represents the current water quality state; $a$ represents the current instruction for controlling the opening of the gate.
Further, in S206, $Q(s,a)$ obtained from the Bellman equation is denoted $y_i$, and $y_i$ is calculated as: $y_i = E_{s'}\left[r + \gamma \max_{a'} Q_i(s',a';\theta_{i-1}) \mid s,a\right]$;
wherein $E_{s'}$ represents the expectation with respect to the water quality state; $r$ represents the value obtained after performing action $a$; $\max_{a'}$ denotes the maximum Q value over all actions $a'$; $Q_i(s',a';\theta_{i-1})$ represents the Q value of each action $a'$ in the next state $s'$ after action $a$ is executed; $\mid s,a$ denotes conditioning on $(s,a)$; $\gamma$ represents the attenuation coefficient; $s'$ represents the water quality state after the gate is opened; $a'$ represents the action threshold for the next gate-opening control.
Further, in S204, an optimal threshold selection function is defined as $Q(s,a)$, which is calculated as: $Q(s,a) = \max_\pi E\left[r_t \mid s_t = s, a_t = a, \pi\right]$;
wherein $E$ represents the expectation; $s$ represents the water quality state; $a$ is the instruction for controlling the opening of the gate in that state; $\pi$ represents the mapping between states and actions; $s_t$ is the water quality state at time step $t$; $a_t$ is the gate opening instruction transmitted in state $s_t$; $r_t$ is the feedback value obtained by transmitting the gate opening instruction $a$ in water quality state $s_t$.
Further, in S204, the gate opening instruction with the largest Q value is selected according to the ε-greedy rule: the action with the largest Q value is selected with probability 1-ε, and a random action is selected with probability ε in order to explore unknown regions of the state space.
Further, in S205, when the water quality state is $s$ and a gate opening instruction $a$ is transmitted, a feedback value $r_t$ is obtained. The calculation formula for $r_t$ (an image in the original publication that is not reproduced in this text) is defined in terms of the following quantities:
$t$ represents the current time step; $t'$ represents the time step at which the gate is opened; $\mu^{t'-t}$ represents the discount factor from the time step at which the gate is opened to the current time step; $r_{t'}$ represents the reward value at the time step at which the gate is opened; $r_t$ is the feedback value at time step $t$; $\mu$ represents the discount factor.
In a second aspect, the invention provides an intelligent drainage system for the tail end of a drainage pipe network based on deep reinforcement learning, comprising: a water quality acquisition terminal, an intelligent drainage device, a remote control terminal and a visual platform;
the water quality acquisition terminal is used for acquiring real-time water quality data;
the remote control terminal is used for analyzing the collected real-time water quality data based on a pre-trained DQN model and controlling a gate of the intelligent drainage device to execute opening or closing actions according to analysis results;
the visual platform is used for visually displaying real-time water quality data and gate states.
Compared with the prior art, the invention has the following beneficial effects:
According to the invention, an intelligent discharge port is arranged at the tail end of the urban drainage pipe network, and a water quality acquisition terminal acquires the water quality state at the discharge port in real time. The real-time water quality state is analyzed by the pre-trained DQN (deep Q-network, a deep reinforcement learning model) to judge whether the current water quality meets the emission standard, the gate is remotely controlled to close or open accordingly, and the real-time water quality data and the gate state are displayed visually. The whole process requires no on-site human involvement, and the opening and closing of the sewage interception gate are adjusted automatically in real time according to the sewage state, which reduces both manual labor and danger.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the intelligent drainage method for the tail end of a drainage pipe network based on deep reinforcement learning;
FIG. 2 is a schematic structural diagram of the intelligent drainage system for the tail end of a drainage pipe network based on deep reinforcement learning.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the embodiment of the invention discloses an intelligent drainage method for the tail end of a drainage pipe network based on deep reinforcement learning, which comprises the following steps:
S1, a water quality acquisition terminal is installed in advance at the discharge port at the tail end of the drainage pipe network, and real-time water quality data are acquired; the water quality data include ammonia nitrogen content, total phosphorus content, COD data and TDS data;
S2, the collected real-time water quality data are analyzed by a pre-trained DQN model, and a gate at the discharge port is controlled to open or close according to the analysis result; each discharge outlet has a corresponding representative index: if the representative index is greater than or equal to 90% of the emission standard for that index, the gate is opened and the water flows through the sewage pipeline to the sewage treatment plant; if the representative index is less than 90% of the emission standard for that index, the water is judged to meet the standard, the gate is closed, and the water flows to the river channel;
S3, the real-time water quality data and the gate state are displayed visually.
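The 90% rule in S2 can be illustrated with a short Python sketch. The numeric limits in `EMISSION_STANDARD` and the key names are assumptions made only for this example; the patent does not state the standard values or any data format.

```python
# Hypothetical emission limits in mg/L; the patent gives no numeric values.
EMISSION_STANDARD = {
    "ammonia_nitrogen": 5.0,
    "total_phosphorus": 0.5,
    "cod": 50.0,
    "tds": 1000.0,
}


def gate_action(reading, representative_index,
                standard=EMISSION_STANDARD, ratio=0.9):
    """Apply the 90% rule to the outlet's representative index.

    Returns "open"  (divert to the sewage pipeline) when the reading is at
    or above ratio * standard, and "close" (discharge to the river) otherwise.
    """
    limit = standard[representative_index]
    if reading[representative_index] >= ratio * limit:
        return "open"
    return "close"


# Usage: COD is assumed to be this outlet's representative index.
polluted = gate_action({"cod": 48.0}, "cod")   # 48 >= 0.9 * 50
clean = gate_action({"cod": 20.0}, "cod")      # 20 <  0.9 * 50
```

In the patent this decision is driven by the trained DQN model; the fixed comparison above only mirrors the stated threshold behavior.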
In a specific embodiment, in S2, the training process for the DQN model includes:
S201: initialize the weight $\theta$ of the Q neural network of the DQN model, the iteration-count threshold, the experience pool, and the iteration count.
S202: read the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establish the initial state value $s_t$ of the discharge port at the tail end of the current drainage pipe network.
S203: evaluate the current state value $s_t$ as follows: if any index is greater than or equal to 90% of the emission standard value for that index, continue to step S204; otherwise, repeat steps S202-S203.
S204: calculate a comprehensive pollution index; if the comprehensive pollution index is higher than or equal to the emission standard, take the Q value under that index as the optimal threshold for controlling gate opening, set the gate action $a_t$ to open, and continue to step S205; if the comprehensive pollution index is lower than the emission standard, set the gate action $a_t$ to close and continue with steps S202-S204.
Because the sewage quality differs at each discharge outlet, the typical pollutants differ as well. According to the DQN algorithm, the index with the highest linear correlation with the comprehensive pollution index is iteratively selected from the four water quality indexes as the representative index of the corresponding outlet, so the representative index may differ from outlet to outlet.
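As an illustration of "highest linear correlation", the sketch below picks a representative index with a plain Pearson correlation. Two things here are assumptions rather than content of the patent: the comprehensive pollution index is taken to be the mean of each index's ratio to its standard (the patent does not define it), and a direct correlation ranking is substituted for the iterative DQN-based selection the text describes.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def comprehensive_index(sample, standard):
    """ASSUMED definition: mean ratio of each index to its standard."""
    return sum(sample[k] / standard[k] for k in standard) / len(standard)


def representative_index(samples, standard):
    """Return the index whose series correlates most strongly with the
    (assumed) comprehensive pollution index across the samples."""
    cpi = [comprehensive_index(s, standard) for s in samples]
    scores = {
        k: abs(pearson([s[k] for s in samples], cpi)) for k in standard
    }
    return max(scores, key=scores.get)


# Usage with hypothetical standards and readings: only COD varies.
standard = {"ammonia_nitrogen": 5.0, "total_phosphorus": 0.5,
            "cod": 50.0, "tds": 1000.0}
samples = [
    {"ammonia_nitrogen": 1.0, "total_phosphorus": 0.1, "cod": c, "tds": 500.0}
    for c in (10.0, 20.0, 40.0)
]
chosen = representative_index(samples, standard)
```

Because only COD changes across the hypothetical samples, it is the index selected for this outlet.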
In step S204, an optimal threshold selection function is defined as $Q(s,a)$, and the Q value is calculated from this function as: $Q(s,a) = \max_\pi E\left[r_t \mid s_t = s, a_t = a, \pi\right]$;
The equation gives the maximum expected cumulative return for taking action $a$ in state $s$ over all policies $\pi$. $E$ represents the expectation; $s$ represents the water quality state; $a$ is the instruction for controlling the opening of the gate in that state; $\pi$ represents the mapping between states and actions; $s_t$ is the water quality state at time step $t$; $a_t$ is the gate opening instruction transmitted in state $s_t$; $r_t$ is the feedback value obtained by transmitting the gate opening instruction $a$ in water quality state $s_t$.
The gate opening instruction with the largest Q value is selected according to the ε-greedy rule: the action with the largest Q value is selected with probability 1-ε, and a random action is selected with probability ε in order to explore unknown regions of the state space.
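The ε-greedy rule can be sketched in a few lines of Python. The action encoding (index 0 = close, index 1 = open) and the function name are assumptions for illustration only.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q values.

    With probability epsilon a uniformly random action is chosen
    (exploration); otherwise the action with the largest Q value is
    chosen (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Usage: assumed encoding 0 = close gate, 1 = open gate.
greedy_choice = epsilon_greedy([0.1, 0.7], epsilon=0.0)
explored = epsilon_greedy([0.1, 0.7], epsilon=1.0)
```

With ε = 0 the choice is always the greedy action; with ε = 1 it is always random. Training setups commonly start with a larger ε and reduce it over time, although the patent does not specify a schedule.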
S205: collect water quality data at the discharge port at the tail end of the drainage pipe network at the next time step to obtain the state $s_{t+1}$ and the feedback value $r_t$, and store $(s_t, a_t, r_t, s_{t+1})$ in the experience pool.
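An experience pool of this kind is commonly implemented as a fixed-capacity buffer. The following Python sketch is one such arrangement under assumed names (`ExperiencePool`, `store`, `is_full`, `sample`); the patent does not prescribe a data structure, a capacity, or a sampling scheme.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity store of (s_t, a_t, r_t, s_{t+1}) transitions."""

    def __init__(self, capacity):
        self.capacity = capacity
        # A deque with maxlen silently drops the oldest entry when full.
        self.buffer = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def is_full(self):
        return len(self.buffer) == self.capacity

    def sample(self, batch_size):
        """Draw a random mini-batch without replacement."""
        return random.sample(list(self.buffer), batch_size)


# Usage with toy states (tuples of four readings are an assumption).
pool = ExperiencePool(capacity=2)
pool.store((1.0, 0.1, 30.0, 400.0), 0, 0.0, (1.2, 0.1, 35.0, 410.0))
pool.store((1.2, 0.1, 35.0, 410.0), 1, 1.0, (0.9, 0.1, 28.0, 390.0))
```

Uniform random sampling is the choice made in the original DQN formulation; whether the patent's system samples uniformly is not stated.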
In step S205, when the water quality state is $s$ and a gate opening instruction $a$ is transmitted, a feedback value $r_t$ is obtained. The calculation formula for $r_t$ (an image in the original publication that is not reproduced in this text) is defined in terms of the following quantities:
$t$ represents the current time step; $t'$ represents the time step at which the gate is opened; $\mu^{t'-t}$ represents the discount factor from the time step at which the gate is opened to the current time step; $r_{t'}$ represents the reward value at the time step at which the gate is opened; $r_t$ is the feedback value at time step $t$; $\mu$ represents the discount factor.
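The symbols defined here match the usual discounted-return form used in DQN. One reading that is consistent with these definitions, offered as an assumption rather than a reproduction of the patent's formula, is shown below. The upper limit $T$ (a final time step) is hypothetical and is not named anywhere in the text.

```latex
r_t = \sum_{t'=t}^{T} \mu^{\,t'-t}\, r_{t'}
```

Under this reading, a reward received at a later step $t'$ is weighted by $\mu^{t'-t}$, so with $0 < \mu < 1$ the contribution of a reward shrinks the further it lies from the current step.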
S206: determine whether the experience pool is full; if not, repeat S202-S205 and continue collecting samples; if it is full, train the Q neural network, repeat S202-S206, and update the weight $\theta$ of the Q neural network until the preset training target is met.
In step S206, the weight $\theta$ is updated by applying a gradient descent method to the loss function, according to the Q value calculated by the DQN algorithm.
The loss function is: $L_i(\theta_i) = E_{s,a\sim\rho(s,a)}\left[(y_i - Q(s,a;\theta_i))^2\right]$;
wherein $Q(s,a;\theta_i)$ represents the Q neural network's estimate of $Q(s,a)$; the subscript $s,a\sim\rho(s,a)$ of $E$ denotes the probability distribution over the water quality state $s$ and the gate action $a$; $y_i$ represents the Q value obtained at the $i$-th iteration; $i$ is the number of iterations; $\theta_i$ represents the weight of the Q neural network at the $i$-th iteration; $s$ represents the current water quality state; $a$ represents the current instruction for controlling the opening of the gate.
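To make the loss and the gradient descent update concrete, the sketch below uses a linear Q function, $Q(s,a;\theta) = \theta_a \cdot s$, in place of the Q neural network. That substitution, the function names, and the learning rate are assumptions chosen to keep the example dependency-free; the patent trains a neural network and does not give these details.

```python
def q_value(theta, state, action):
    """Linear stand-in for the Q network: dot product of theta[action] and state."""
    return sum(w * x for w, x in zip(theta[action], state))


def dqn_loss(theta, batch):
    """Mean squared error between targets y_i and Q(s, a; theta).

    batch is a list of (state, action, target) triples.
    """
    errors = [(y - q_value(theta, s, a)) ** 2 for s, a, y in batch]
    return sum(errors) / len(errors)


def sgd_step(theta, state, action, target, lr):
    """One gradient descent step on (target - Q)^2 for a single sample.

    d/d(theta_a) of (y - theta_a . s)^2 is -2 * (y - Q) * s, so descending
    the gradient adds lr * 2 * (y - Q) * s to theta[action].
    """
    err = target - q_value(theta, state, action)
    theta[action] = [w + lr * 2.0 * err * x
                     for w, x in zip(theta[action], state)]
    return theta


# Usage: two actions (assumed 0 = close, 1 = open), two state features.
theta = [[0.0, 0.0], [0.0, 0.0]]
batch = [([1.0, 2.0], 1, 1.0)]
loss_before = dqn_loss(theta, batch)
theta = sgd_step(theta, [1.0, 2.0], 1, 1.0, lr=0.1)
loss_after = dqn_loss(theta, batch)
```

The target is held fixed while differentiating, which is why only the $Q(s,a;\theta_i)$ term contributes to the gradient.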
$Q(s,a)$ derived from the Bellman equation is denoted $y_i$;
The Bellman equation is: $V(s) = \max_a \left(R(s,a) + \gamma V(s')\right)$;
wherein $R$ is the reward function; $s$ is the water quality state at a specific time point; $a$ is the action taken in the current state; $V(s')$ is the value function of the subsequent state, which is discounted by $\gamma$; $\gamma$ is the attenuation coefficient; $s'$ represents the subsequent water quality state; $V(s)$ represents the value function in state $s$ at a specific time point.
$y_i$ is calculated as: $y_i = E_{s'}\left[r + \gamma \max_{a'} Q_i(s',a';\theta_{i-1}) \mid s,a\right]$;
wherein $E_{s'}$ represents the expectation with respect to the water quality state; $r$ represents the value obtained after performing action $a$; $\max_{a'}$ denotes the maximum Q value over all actions $a'$; $Q_i(s',a';\theta_{i-1})$ represents the Q value of each action $a'$ in the next state $s'$ after action $a$ is executed; $\mid s,a$ denotes conditioning on $(s,a)$; $\gamma$ represents the attenuation coefficient; $s'$ represents the water quality state after the gate is opened; $a'$ represents the action threshold for the next gate-opening control.
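For a single stored transition, the expectation in the target reduces to one sample, so $y_i$ can be computed directly from the reward and the Q values of the next state. The Python sketch below shows that single-sample form; the function name and the example numbers are assumptions.

```python
def td_target(reward, next_q_values, gamma):
    """Single-sample target: y = r + gamma * max_a' Q(s', a'; theta_{i-1}).

    next_q_values holds the Q value of every action a' in the next state s',
    evaluated with the weights from the previous iteration.
    """
    return reward + gamma * max(next_q_values)


# Usage: reward 1.0, next-state Q values for (close, open), gamma = 0.9.
y = td_target(1.0, [0.5, 2.0], 0.9)
```

Here the larger next-state Q value (2.0) is selected, giving a target of roughly 1.0 + 0.9 × 2.0 = 2.8.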
In other embodiments, as shown in FIG. 2, the invention further provides an intelligent drainage system for the tail end of a drainage pipe network based on deep reinforcement learning, comprising: a water quality acquisition terminal, an intelligent drainage device, a remote control terminal and a visual platform;
the water quality acquisition terminal is used for acquiring real-time water quality data;
the remote control terminal is used for analyzing the collected real-time water quality data based on a pre-trained DQN model and controlling a gate of the intelligent drainage device to execute opening or closing actions according to an analysis result;
the visual platform is used for visually displaying real-time water quality data and gate states.
Specifically, the remote control terminal comprises a platform communication unit, a data processing unit, a data experience pool, a model server, a PLC controller and a gate starter. The water quality acquisition terminal communicates with the platform communication unit through an RTU (remote terminal unit); the water quality data are transmitted to the data processing unit for operations such as de-duplication and screening; the model server is then used to train the DQN model; and a gate opening or closing instruction is sent to the PLC controller according to the model's decision, whereupon the PLC controller controls the open or closed state of the gate.
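The de-duplication and screening performed by the data processing unit might look like the following Python sketch. The record layout (a `timestamp` key plus one key per index) and the plausibility ranges are assumptions; the patent names the operations but not their rules.

```python
def clean_readings(records, valid_range):
    """Drop duplicate timestamps and out-of-range readings.

    records:     list of dicts with a "timestamp" key and one key per index
    valid_range: {index_name: (low, high)} plausibility bounds (assumed)
    """
    seen = set()
    cleaned = []
    for rec in records:
        if rec["timestamp"] in seen:
            continue  # de-duplication: keep the first record per timestamp
        out_of_range = any(
            not (low <= rec[name] <= high)
            for name, (low, high) in valid_range.items()
        )
        if out_of_range:
            continue  # screening: discard physically implausible values
        seen.add(rec["timestamp"])
        cleaned.append(rec)
    return cleaned


# Usage with hypothetical COD readings.
records = [
    {"timestamp": 1, "cod": 30.0},
    {"timestamp": 1, "cod": 30.0},   # duplicate upload
    {"timestamp": 2, "cod": -5.0},   # sensor glitch
    {"timestamp": 3, "cod": 40.0},
]
kept = clean_readings(records, {"cod": (0.0, 1000.0)})
```

Only the cleaned records would then be passed on to the model server.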
In addition, the water quality detection terminal is also integrated with a liquid level sensor for realizing liquid level acquisition at the tail end discharge port of the drainage pipe network, and simultaneously, the real-time power consumption of the water quality detection terminal can be monitored.
In this specification, the embodiments are described progressively: each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be understood by cross-reference. Because the device disclosed in an embodiment corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (8)
1. An intelligent drainage method for the tail end of a drainage pipe network based on deep reinforcement learning, characterized by comprising the following steps:
S1, a water quality acquisition terminal is installed in advance at the discharge port at the tail end of the drainage pipe network, and real-time water quality data are acquired;
S2, the collected real-time water quality data are analyzed by a pre-trained DQN model, and a gate at the discharge port is controlled to open or close according to the analysis result; each discharge outlet has a corresponding representative index; if the representative index is greater than or equal to 90% of the emission standard for that index, the gate is opened; if the representative index is less than 90% of the emission standard for that index, the water is judged to meet the standard and the gate is closed;
S3, the real-time water quality data and the gate state are displayed visually;
in S2, the training process for the DQN model includes:
S201: initializing the weight $\theta$ of the Q neural network of the DQN model, the iteration-count threshold, the experience pool, and the iteration count;
S202: reading the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establishing the initial state value $s_t$ of the discharge port at the tail end of the current drainage pipe network;
S203: evaluating the current state value $s_t$ as follows: if any index is greater than or equal to 90% of the emission standard value for that index, continuing to step S204; otherwise, repeating steps S202-S203;
S204: calculating a comprehensive pollution index; if the comprehensive pollution index is higher than or equal to the emission standard, taking the Q value under that index as the optimal threshold for controlling gate opening, setting the gate action $a_t$ to open, and continuing to step S205; if the comprehensive pollution index is lower than the emission standard, setting the gate action $a_t$ to close and continuing with steps S202-S204;
iteratively selecting, according to the DQN algorithm, the index with the highest linear correlation with the comprehensive pollution index from the four water quality indexes as the representative index of the corresponding discharge outlet;
S205: collecting water quality data at the discharge port at the tail end of the drainage pipe network at the next time step to obtain the state $s_{t+1}$ and the feedback value $r_t$, and storing $(s_t, a_t, r_t, s_{t+1})$ in the experience pool;
S206: determining whether the experience pool is full; if not, repeating S202-S205; if it is full, training the Q neural network, repeating S202-S206, and updating the weight $\theta$ of the Q neural network until a preset training target is met.
2. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein the water quality data comprises: ammonia nitrogen content, total phosphorus content, COD data and TDS data.
3. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein in S206, the weight $\theta$ is updated by applying a gradient descent method to the loss function, according to the Q value calculated by the DQN algorithm;
the loss function is: $L_i(\theta_i) = E_{s,a\sim\rho(s,a)}\left[(y_i - Q(s,a;\theta_i))^2\right]$;
wherein $Q(s,a;\theta_i)$ represents the estimated value of $Q(s,a)$; the subscript $s,a\sim\rho(s,a)$ of $E$ denotes the probability distribution over the water quality state $s$ and the gate action $a$; $y_i$ represents $Q(s,a)$; $i$ is the number of iterations; $\theta_i$ represents the weight of the Q neural network at the $i$-th iteration; $s$ represents the current water quality state; $a$ represents the current instruction for controlling the opening of the gate.
4. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 3, wherein in S206, $Q(s,a)$ obtained from the Bellman equation is denoted $y_i$, and $y_i$ is calculated as: $y_i = E_{s'}\left[r + \gamma \max_{a'} Q_i(s',a';\theta_{i-1}) \mid s,a\right]$;
wherein $E_{s'}$ represents the expectation with respect to the water quality state; $r$ represents the value obtained after performing action $a$; $\max_{a'}$ denotes the maximum Q value over all actions $a'$; $Q_i(s',a';\theta_{i-1})$ represents the Q value of each action $a'$ in the next state $s'$ after action $a$ is executed; $\mid s,a$ denotes conditioning on $(s,a)$; $\gamma$ represents the attenuation coefficient; $s'$ represents the water quality state after the gate is opened; $a'$ represents the action threshold for the next gate-opening control.
5. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein in S204, an optimal threshold selection function is defined as $Q(s, a)$, with the formula: $Q(s, a) = \max_{\pi} E\left[r_t \mid s_t = s, a_t = a, \pi\right]$;
wherein E represents the expectation; s represents the water quality state; a is the instruction for controlling the opening of the gate in that state; π represents the mapping between actions and states; $s_t$ is the water quality state at time step t; $a_t$ is the gate opening instruction issued in state $s_t$; $r_t$ is the feedback value obtained by issuing gate opening instruction a in water quality state $s_t$.
6. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein in S204, the gate opening instruction with the largest Q value is selected according to the ε-greedy rule; under the ε-greedy rule, the action with the largest Q value is selected with probability 1-ε, and a random action is selected with probability ε so as to randomly explore the unknown state space.
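The ε-greedy rule of claim 6 can be sketched as follows. This is an illustration under assumptions: the function name `epsilon_greedy`, the optional `rng` parameter and the encoding of gate instructions as list indices are hypothetical and are not specified in the patent.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(
    q_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Return the index of the chosen gate instruction.

    With probability epsilon a uniformly random action is returned (exploration);
    otherwise the action with the largest Q value is returned (exploitation).
    """
    rng = rng if rng is not None else random.Random()
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Index 0 = keep gate closed, index 1 = open gate (hypothetical encoding).
action = epsilon_greedy([0.1, 0.9], epsilon=0.1)
```

Passing a seeded `random.Random` instance makes the exploration reproducible, which is convenient when comparing training runs.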
7. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein in S205, when the water quality state is s and a gate opening instruction a is issued, a feedback value $r_t$ is obtained, and the feedback value $r_t$ is calculated by the following formula:
;
wherein t represents the current time step; t' represents the time step at which the gate is opened; $\mu^{t'-t}$ represents the discount factor from the time step at which the gate is opened to the current time step; $r_{t'}$ represents the reward value at the time step at which the gate is opened; $r_t$ is the feedback value at time step t; μ represents the discount factor.
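The formula of claim 7 is not reproduced in this text. One common form that is consistent with the symbols listed above is a discounted sum, $\sum_{t'=t}^{T} \mu^{t'-t} r_{t'}$; this form, the final step T and the function name `discounted_feedback` below are assumptions, not a statement of the patented formula.

```python
from typing import Sequence


def discounted_feedback(rewards: Sequence[float], mu: float, t: int = 0) -> float:
    """Assumed form: sum of mu**(t' - t) * rewards[t'] for t' = t .. len(rewards) - 1.

    `rewards[t']` is the reward value at time step t'; `mu` is the discount factor.
    The upper bound (the last recorded step) is an assumption of this sketch.
    """
    return sum(
        mu ** (t_prime - t) * rewards[t_prime]
        for t_prime in range(t, len(rewards))
    )


# Illustrative rewards only (not taken from the patent).
feedback = discounted_feedback([1.0, 1.0, 1.0], mu=0.5)
```

With a discount factor below 1, rewards received further from the current time step contribute less to the feedback value.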
8. An intelligent drainage system at the tail end of a drainage pipe network based on deep reinforcement learning, characterized by comprising: a water quality acquisition terminal, an intelligent drainage device, a remote control terminal and a visual platform;
the water quality acquisition terminal is used for acquiring real-time water quality data;
the remote control terminal is used for analyzing the collected real-time water quality data based on a pre-trained DQN model and controlling the gate of the intelligent drainage device to open or close according to the analysis result; each water outlet has a corresponding representative index: if the representative index is greater than or equal to 90% of the emission standard for that index, the gate is opened; if the representative index is less than 90% of the emission standard for that index, the water is judged to meet the standard and the gate is closed;
the training process of the DQN model comprises:
S201: initializing the weight θ of the Q neural network of the DQN model, the iteration count threshold, the experience pool and the iteration count;
S202: reading the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establishing the initial state value $s_t$ of the current discharge at the tail end of the drainage pipe network;
S203: judging the current state value $s_t$ as follows: if any index is greater than or equal to 90% of the emission standard value for that index, continuing to step S204; if no index is greater than or equal to 90% of its emission standard value, repeating steps S202-S203;
S204: calculating a comprehensive pollution index; if the comprehensive pollution index is higher than or equal to the emission standard, taking the Q value under the index as the optimal threshold for controlling the opening of the gate, setting the gate action $a_t$ to open, and continuing to step S205; if the comprehensive pollution index is lower than the emission standard, setting the gate action $a_t$ to closed and repeating steps S202-S204;
iterating, according to the DQN algorithm, the index having the highest linear correlation with the comprehensive pollution index among the four water quality indexes, to serve as the representative index of the corresponding water outlet;
S205: collecting water quality data at the tail end discharge port of the drainage pipe network at the next time step to obtain the state $s_{t+1}$ and the feedback value $r_t$, and storing $(s_t, a_t, r_t, s_{t+1})$ in the experience pool;
S206: judging whether the experience pool is full; if not, repeating S202-S205; if full, repeatedly training the Q neural network by repeatedly executing S202-S206 and updating the weight θ of the Q neural network until the preset training target is met;
the visual platform is used for visually displaying real-time water quality data and gate states.
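The 90% judgement of S203 and the experience pool of S205-S206 can be sketched as below. All identifiers (`INDEXES`, `any_index_near_limit`, `ExperiencePool`) are hypothetical, and the numeric emission standards in the example are placeholders chosen for illustration, not real regulatory limits or values from the patent.

```python
from collections import deque
from typing import Deque, Dict, Tuple

# The four water quality indexes named in the claims (key names are assumptions).
INDEXES = ("ammonia_nitrogen", "total_phosphorus", "cod", "tds")

Transition = Tuple[tuple, int, float, tuple]


def any_index_near_limit(
    reading: Dict[str, float], standard: Dict[str, float], ratio: float = 0.9
) -> bool:
    """S203: True if any index is at or above `ratio` of its emission standard."""
    return any(reading[name] >= ratio * standard[name] for name in INDEXES)


class ExperiencePool:
    """Fixed-capacity store for (s_t, a_t, r_t, s_{t+1}) tuples (S205-S206)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Once the pool is full, deque(maxlen=...) drops the oldest entry on append.
        self._items: Deque[Transition] = deque(maxlen=capacity)

    def store(self, s: tuple, a: int, r: float, s_next: tuple) -> None:
        self._items.append((s, a, r, s_next))

    def is_full(self) -> bool:
        return len(self._items) == self.capacity

    def __len__(self) -> int:
        return len(self._items)


# Placeholder standards and reading, for illustration only.
standard = {"ammonia_nitrogen": 10.0, "total_phosphorus": 1.0, "cod": 50.0, "tds": 1000.0}
reading = {"ammonia_nitrogen": 1.0, "total_phosphorus": 0.1, "cod": 46.0, "tds": 100.0}
proceed_to_s204 = any_index_near_limit(reading, standard)  # COD exceeds 90% of 50.0

pool = ExperiencePool(capacity=2)
pool.store((1.0, 0.1, 46.0, 100.0), 1, 0.5, (0.8, 0.1, 30.0, 90.0))
```

Training of the Q neural network would begin only once `pool.is_full()` returns True, matching the check described in S206.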
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311102920.XA CN116819974B (en) | 2023-08-30 | 2023-08-30 | Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116819974A (en) | 2023-09-29 |
CN116819974B (en) | 2023-11-03 |
Family
ID=88120686
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311102920.XA Active CN116819974B (en) | 2023-08-30 | 2023-08-30 | Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116819974B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111489052A (en) * | 2020-03-10 | 2020-08-04 | 上海水顿智能科技有限公司 | Method for carrying out intercepting drainage scheduling by utilizing water quality and water quantity |
CN113850692A (en) * | 2021-09-26 | 2021-12-28 | 天津大学 | Urban water supply system gate pump group optimal scheduling method based on deep learning |
CN115761316A (en) * | 2022-11-08 | 2023-03-07 | 中国长江电力股份有限公司 | Hydropower station flat gate opening and closing method based on YOLO automatic identification |
CN116187208A (en) * | 2023-04-27 | 2023-05-30 | 深圳市广汇源环境水务有限公司 | Drainage basin water quantity and quality joint scheduling method based on constraint reinforcement learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220328189A1 (en) * | 2021-04-09 | 2022-10-13 | Arizona Board Of Regents On Behalf Of Arizona State University | Systems, methods, and apparatuses for implementing advancements towards annotation efficient deep learning in computer-aided diagnosis |
Also Published As
Publication number | Publication date |
---|---|
CN116819974A (en) | 2023-09-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||