CN116819974B - Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning - Google Patents

Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning

Info

Publication number
CN116819974B
Authority
CN
China
Prior art keywords
water quality
index
gate
state
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311102920.XA
Other languages
Chinese (zh)
Other versions
CN116819974A (en)
Inventor
袁冬海
李雷
王旻昊
王辉
申宇洋
王家卓
寇莹莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Civil Engineering and Architecture
Original Assignee
Beijing University of Civil Engineering and Architecture
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Civil Engineering and Architecture filed Critical Beijing University of Civil Engineering and Architecture
Priority to CN202311102920.XA priority Critical patent/CN116819974B/en
Publication of CN116819974A publication Critical patent/CN116819974A/en
Application granted granted Critical
Publication of CN116819974B publication Critical patent/CN116819974B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • E FIXED CONSTRUCTIONS
    • E03 WATER SUPPLY; SEWERAGE
    • E03F SEWERS; CESSPOOLS
    • E03F1/00 Methods, systems, or installations for draining-off sewage or storm water
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/18 Water
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Abstract

The invention relates to the technical field of urban pipe network drainage overflow monitoring, and in particular to an intelligent drainage method and system for the tail end of a drainage pipe network based on deep reinforcement learning. The method comprises the following steps: S1, a water quality acquisition terminal is arranged in advance at the discharge outlet at the tail end of the drainage pipe network and acquires real-time water quality data; S2, the collected real-time water quality data are analyzed by a pre-trained DQN model, and the gate at the discharge outlet is controlled to open or close according to the analysis result; S3, the real-time water quality data and the gate state are displayed visually. With the invention, the opening and closing of the sewage interception gate can be automatically adjusted in real time according to the sewage state, which reduces manpower consumption and provides managers with the real-time state of the discharge outlet.

Description

Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning
Technical Field
The invention relates to the technical field of urban pipe network drainage overflow monitoring, in particular to an intelligent drainage method and system for the tail end of a drainage pipe network based on deep reinforcement learning.
Background
At present, the drainage at the tail end of urban drainage pipe networks is handled by interception. However, the interception multiple of a traditional combined sewer network is only 1, i.e. the interceptor is designed for twice the dry-weather flow rate, so the heavily polluted combined sewage produced during rainfall cannot be fully intercepted, and the design flow of the interception pipe is far smaller than the peak storm flow. Traditional interception can only control the total annual runoff pollution load and can hardly control the pollutants of each individual runoff event; pollutant overflow is especially serious when the rainfall intensity is high but the rainfall amount is small. In addition, interception requires manual operation, which is often not timely and easily leads to pollutant overflow; it also consumes manpower, and riverside operation in the rainy season carries a certain degree of danger.
Disclosure of Invention
In view of the above, the invention provides an intelligent drainage method and system for the tail end of a drainage pipe network based on deep reinforcement learning, which can automatically adjust the opening and closing of the sewage interception gate in real time according to the sewage state, reduce manpower consumption, and provide managers with the real-time state of the discharge outlet.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
in a first aspect, the present invention provides a deep reinforcement learning-based intelligent drainage method for a drainage pipe network end, including the following steps:
s1, a water quality acquisition terminal is arranged at a drainage port at the tail end of a drainage pipe network in advance, and real-time water quality data are acquired;
s2, analyzing the collected real-time water quality data based on a pre-trained DQN model, and controlling a gate at the discharge port to execute opening or closing actions according to an analysis result;
and S3, visually displaying the real-time water quality data and the gate state.
Further, the water quality data includes: ammonia nitrogen content, total phosphorus content, COD data and TDS data.
Further, in S2, the training process for the DQN model includes:
s201: initializing the weight theta, iteration times threshold, experience pool and iteration times of a Q neural network of the DQN model;
s202: reading the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establishing the initial state value s_t of the discharge outlet at the tail end of the current drainage pipe network;
S203: judging the current state value s_t as follows: if any index is not less than 90% of its discharge standard value, continue to step S204; otherwise, repeat steps S202-S203;
s204: calculating a comprehensive pollution index, and if the comprehensive pollution index is higher than or equal to the emission standard, taking the Q value under the index as an optimal threshold for controlling the opening of the gate and controlling the gate action a t For on, continue step S205; if the comprehensive pollution index is lower than the emission standard, controlling the gate action a t Closing and continuing the steps S202-S204;
s205: collecting water quality data of the tail end discharge port of the water discharge pipe network at the next time step to obtain a state s t+1 Feedback value r t Will(s) t , a t , r t , s t+1 ) Storing into an experience pool;
s206: judging whether the experience pool is full, if not, repeating S202-S205; if the training is full, the Q neural network is trained repeatedly, S202-S206 are executed repeatedly, and the weight theta of the Q neural network is updated until a preset training target is met.
Further, in S206, the weight θ is updated by applying a gradient descent method to the loss function, according to the Q value calculated by the DQN algorithm;
the loss function is: L_i(θ_i) = E_{s,a∼ρ(s,a)}[(y_i − Q(s,a; θ_i))²];
wherein Q(s,a; θ_i) represents the estimated value of Q(s,a); E represents the expectation; y_i represents Q(s,a); i is the iteration number; θ_i represents the weight of the Q neural network at the i-th iteration; s represents the current water quality state; a represents the current command for controlling the gate opening.
Further, in S206, Q(s,a) obtained from the Bellman equation is denoted as y_i; the specific calculation formula of y_i is: y_i = E_{s'}[r + γ·max_{a'} Q_i(s', a'; θ_{i−1}) | s, a];
wherein E_{s'} represents the expectation with respect to the next water quality state s'; r represents the reward value obtained after executing action a; max_{a'} represents the maximum Q value over all actions a'; Q_i(s', a'; θ_{i−1}) represents the Q value of each action a' in the next state s' after action a is executed; the condition is (s, a); γ represents the attenuation coefficient; s' represents the water quality state after the gate is opened; a' represents the action threshold of the next gate opening control.
Further, in S204, an optimal threshold selection function is defined as Q(s,a), and Q(s,a) is calculated by this function as follows: Q(s,a) = max_π E[r_t | s_t = s, a_t = a, π];
wherein E represents the expectation; s represents the water quality state; a is the command for controlling the gate opening in that state; π represents the mapping from states to actions; s_t is the water quality state at time step t; a_t is the gate opening command transmitted in state s_t; r_t is the feedback value obtained after transmitting the gate opening command a in water quality state s_t.
Further, in S204, a gate opening instruction with the largest Q value is selected according to the epsilon-greedy rule; the epsilon-greedy rule is to select the action with the largest Q value according to the probability of 1-epsilon, randomly select the action according to the probability of epsilon, and randomly explore an unknown state space.
Further, in S205, when the water quality state is s, a gate opening command a is transmitted and the feedback value r_t is obtained; the calculation formula of the feedback value r_t is: r_t = Σ_{t'≥t} μ^(t'−t)·r_{t'};
wherein t represents the current time step; t' represents the time step at which the gate is opened; μ^(t'−t) represents the discount factor from the gate-opening time step to the current time step; r_{t'} represents the reward value at the gate-opening time step; r_t is the feedback value at time step t; μ represents the discount factor.
In a second aspect, the present invention provides an intelligent drainage system for the tail end of a drainage pipe network based on deep reinforcement learning, including: a water quality acquisition terminal, an intelligent drainage device, a remote control terminal and a visual platform;
the water quality acquisition terminal is used for acquiring real-time water quality data;
the remote control terminal is used for analyzing the collected real-time water quality data based on a pre-trained DQN model and controlling a gate of the intelligent drainage device to execute opening or closing actions according to analysis results;
the visual platform is used for visually displaying real-time water quality data and gate states.
Compared with the prior art, the invention has the following beneficial effects:
according to the invention, the intelligent drainage port is arranged at the tail end of the urban drainage pipe network, and the water quality acquisition terminal is arranged to acquire the water quality state at the drainage port in real time, the real-time water quality state is analyzed and judged according to the pre-trained DQN (deep reinforcement learning) model, whether the current water quality meets the emission standard is judged, the remote control gate is in a closed or open state, and the real-time water quality data and the gate state are visually displayed. The whole process does not need to participate in a human field, and the opening and closing of the sewage interception gate can be automatically adjusted in real time according to the sewage state, so that the manpower consumption is reduced, and the danger is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an intelligent drainage method at the tail end of a drainage pipe network based on deep reinforcement learning;
fig. 2 is a schematic structural diagram of an intelligent drainage system at the tail end of a drainage pipe network based on deep reinforcement learning.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the embodiment of the invention discloses an intelligent drainage method for the tail end of a drainage pipe network based on deep reinforcement learning, which comprises the following steps:
s1, a water quality acquisition terminal is arranged at a drainage port at the tail end of a drainage pipe network in advance, and real-time water quality data are acquired; the water quality data includes: ammonia nitrogen content, total phosphorus content, COD data and TDS data;
S2, analyzing the collected real-time water quality data based on a pre-trained DQN model, and controlling the gate at the discharge outlet to execute an opening or closing action according to the analysis result. Because each discharge outlet has a corresponding representative index, if the representative index is greater than or equal to 90% of its emission standard, the gate is opened and the water flows through the sewage pipeline to the sewage treatment plant; if the representative index is less than 90% of the emission standard, the water is judged to reach the standard, the gate is closed, and the water flows to the river channel (a minimal sketch of this decision is given after step S3 below).
And S3, visually displaying the real-time water quality data and the gate state.
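The following Python sketch illustrates the 90%-of-standard gate rule described in S2, here applied to each monitored index as in the judging step S203. It is illustrative only: the discharge standard values, index names and function name are placeholder assumptions, not the limits or interfaces used by the invention.

```python
# Hypothetical discharge standards (assumed units: mg/L); placeholders, not the
# actual limits applied by the patent.
DISCHARGE_STANDARDS = {
    "ammonia_nitrogen": 8.0,
    "total_phosphorus": 1.0,
    "cod": 50.0,
    "tds": 1000.0,
}

def decide_gate(sample: dict, threshold_ratio: float = 0.9) -> str:
    """Return 'open' (divert to the sewage pipeline) if any monitored index
    reaches 90% of its discharge standard, otherwise 'closed' (discharge to river)."""
    for index, standard in DISCHARGE_STANDARDS.items():
        if sample.get(index, 0.0) >= threshold_ratio * standard:
            return "open"
    return "closed"

# Example: COD of 47 mg/L exceeds 0.9 * 50, so the interception gate opens.
print(decide_gate({"ammonia_nitrogen": 2.1, "total_phosphorus": 0.3,
                   "cod": 47.0, "tds": 600.0}))  # -> "open"
```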
In a specific embodiment, in S2, the training process for the DQN model includes:
s201: initializing the weight theta, the iteration number threshold, the experience pool and the iteration number of the Q neural network of the DQN model.
S202: reading the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establishing the initial state value s_t of the discharge outlet at the tail end of the current drainage pipe network.
S203: judging the current state value s_t as follows: if any index is greater than or equal to 90% of its discharge standard value, continue with step S204; otherwise, repeat steps S202-S203.
S204: calculating a comprehensive pollution index; if the comprehensive pollution index is higher than or equal to the emission standard, taking the Q value under this index as the optimal threshold for controlling the gate opening, setting the gate action a_t to open, and continuing to step S205; if the comprehensive pollution index is lower than the emission standard, setting the gate action a_t to closed, and continuing steps S202-S204.
Because the sewage quality and the typical pollutants differ at each discharge outlet, the index with the highest linear correlation with the comprehensive pollution index is iteratively selected from the four water quality data indexes according to the DQN algorithm and serves as the representative index of the corresponding outlet; the representative index may therefore differ from outlet to outlet. A minimal sketch of this selection is given below.
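As a hedged illustration of the representative-index selection, the sketch below picks, among historical readings of the four indices, the one most linearly correlated with a supplied comprehensive pollution index series. The function name, data layout and example values are assumptions, not part of the invention.

```python
import numpy as np

# Minimal sketch: choose the water quality index whose historical series is
# most linearly correlated with the comprehensive pollution index series.
def pick_representative_index(history: dict, composite: np.ndarray) -> str:
    """history maps index name -> sequence of past readings, all aligned with
    the comprehensive pollution index series `composite`."""
    best_name, best_corr = None, -1.0
    for name, series in history.items():
        corr = abs(np.corrcoef(np.asarray(series, dtype=float), composite)[0, 1])
        if corr > best_corr:
            best_name, best_corr = name, corr
    return best_name

# Hypothetical usage with short illustrative series:
history = {"ammonia_nitrogen": [2.0, 3.5, 3.0], "total_phosphorus": [0.2, 0.3, 0.2],
           "cod": [30.0, 45.0, 60.0], "tds": [500.0, 520.0, 510.0]}
print(pick_representative_index(history, np.array([1.0, 1.5, 2.0])))  # -> "cod"
```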
In this step, an optimal threshold selection function is defined as Q(s,a), and the Q value is calculated from this function; the formula of Q(s,a) is: Q(s,a) = max_π E[r_t | s_t = s, a_t = a, π];
this equation represents the largest expected cumulative reward achievable over all policies π when taking action a in state s; E represents the expectation; s represents the water quality state; a is the command for controlling the gate opening in that state; π represents the mapping from states to actions; s_t is the water quality state at time step t; a_t is the gate opening command transmitted in state s_t; r_t is the feedback value obtained after transmitting the gate opening command a in water quality state s_t.
Selecting a gate opening instruction with the maximum Q value according to an epsilon-greedy rule; the epsilon-greedy rule is to select the action with the largest Q value according to the probability of 1-epsilon, randomly select the action according to the probability of epsilon, and randomly explore an unknown state space.
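The ε-greedy selection above can be sketched as follows. The action encoding (0 = keep gate closed, 1 = open gate) and the ε value are assumptions for illustration.

```python
import random

ACTIONS = [0, 1]  # assumed encoding: 0 = keep gate closed, 1 = open gate

def epsilon_greedy(q_values, epsilon: float = 0.1) -> int:
    """With probability 1 - epsilon exploit the action with the largest Q value;
    with probability epsilon explore a random action (unknown state space)."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values[a])

# Hypothetical usage: Q values indexed by action.
print(epsilon_greedy({0: 0.12, 1: 0.87}))  # usually 1 (open the gate)
```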
S205: collecting the water quality data at the tail end discharge outlet of the drainage pipe network at the next time step to obtain the state s_{t+1} and the feedback value r_t, and storing (s_t, a_t, r_t, s_{t+1}) in the experience pool.
In this step, when the water quality state is s, a gate opening command a is transmitted and the feedback value r_t is obtained; the calculation formula of the feedback value r_t is: r_t = Σ_{t'≥t} μ^(t'−t)·r_{t'};
wherein t represents the current time step; t' represents the time step at which the gate is opened; μ^(t'−t) represents the discount factor from the gate-opening time step to the current time step; r_{t'} represents the reward value at the gate-opening time step; r_t is the feedback value at time step t; μ represents the discount factor.
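Under the assumption that the feedback value is the μ-discounted sum of rewards from the current step onward (the formula above is reconstructed on that assumption), a minimal sketch is:

```python
# Sketch of the discounted feedback value r_t: rewards from step t onward are
# summed with discount factor mu raised to the step offset. The reward list and
# the value of mu are illustrative assumptions.
def feedback_value(rewards, t: int, mu: float = 0.95) -> float:
    """rewards[k] is the reward observed at time step k (k >= 0)."""
    return sum(mu ** (k - t) * r for k, r in enumerate(rewards) if k >= t)

print(feedback_value([0.0, 1.0, 0.5, 0.2], t=1))  # 1.0 + 0.95*0.5 + 0.95**2*0.2
```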
S206: judging whether the experience pool is full; if not, repeating S202-S205 and continuing to collect samples; if full, training the Q neural network, repeatedly executing S202-S206, and updating the weight θ of the Q neural network until the preset training target is met.
In this step, the weight θ is updated by applying a gradient descent method to the loss function, according to the Q value calculated by the DQN algorithm.
The loss function is: L_i(θ_i) = E_{s,a∼ρ(s,a)}[(y_i − Q(s,a; θ_i))²];
wherein Q(s,a; θ_i) represents the estimated value output by the Q neural network for Q(s,a); the subscript s,a∼ρ(s,a) of the expectation E denotes the probability distribution over the water quality state s and the gate action a; y_i represents the target Q value of the i-th iteration; i is the iteration number; θ_i represents the weight of the Q neural network at the i-th iteration; s represents the current water quality state; a represents the current command for controlling the gate opening.
Q(s,a) derived from the Bellman equation is denoted as y_i.
The Bellman equation is: V(s) = max_a (R(s,a) + γ·V(s'));
wherein R is the reward function; s is the water quality state at a specific time point; a is the action taken after evaluating the current state; V(s') is the discounted value function of the subsequent state; γ is the attenuation coefficient; s' represents the subsequent water quality state; V(s) represents the value function in state s at the specific time point.
The specific calculation formula of y_i is: y_i = E_{s'}[r + γ·max_{a'} Q_i(s', a'; θ_{i−1}) | s, a];
wherein E_{s'} represents the expectation with respect to the next water quality state s'; r represents the reward value obtained after executing action a; max_{a'} represents the maximum Q value over all actions a'; Q_i(s', a'; θ_{i−1}) represents the Q value of each action a' in the next state s' after action a is executed; the condition is (s, a); γ represents the attenuation coefficient; s' represents the water quality state after the gate is opened; a' represents the action threshold of the next gate opening control.
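A minimal PyTorch sketch of this update step is given below. The network sizes, optimizer, learning rate, batch size and target-network arrangement are assumptions for illustration; the sketch only shows how the Bellman target y_i and the squared-error loss L_i(θ_i) drive the gradient-descent update of θ.

```python
import random
import torch
import torch.nn as nn

# Assumed sketch: 4 water quality indices as the state, 2 actions (close/open).
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())      # frozen copy, plays the role of theta_{i-1}
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.95

def train_step(experience_pool, batch_size=32):
    """experience_pool holds (state_list, action_int, reward_float, next_state_list) tuples."""
    batch = random.sample(experience_pool, batch_size)
    s, a, r, s_next = (torch.tensor(x, dtype=torch.float32) for x in zip(*batch))
    q_sa = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)   # Q(s, a; theta_i)
    with torch.no_grad():                                         # Bellman target y_i
        y = r + gamma * target_net(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, y)                        # L_i(theta_i)
    optimizer.zero_grad()
    loss.backward()                                               # gradient descent on theta
    optimizer.step()
    return loss.item()
```

In practice the frozen network's weights would be refreshed from the Q network periodically, which corresponds to evaluating the target y_i with the previous iteration's weights θ_{i−1}.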
In other embodiments, as shown in fig. 2, the present invention further provides an intelligent drainage system at the end of a drainage pipe network based on deep reinforcement learning, including: the intelligent water quality monitoring system comprises a water quality acquisition terminal, an intelligent drainage device, a remote control terminal and a visual platform;
the water quality acquisition terminal is used for acquiring real-time water quality data;
the remote control terminal is used for analyzing the collected real-time water quality data based on a pre-trained DQN model and controlling a gate of the intelligent drainage device to execute opening or closing actions according to an analysis result;
the visual platform is used for visually displaying real-time water quality data and gate states.
Specifically, the remote control terminal comprises a platform communication unit, a data processing unit, a data experience pool, a model server, a PLC controller and a gate starter. The water quality acquisition terminal communicates with the platform communication unit through the RTU (remote terminal unit); the water quality data are transmitted to the data processing unit for de-duplication, screening and similar operations; the model server is then used to train the DQN model, a gate opening or closing instruction is sent to the PLC according to the model's decision, and the PLC controls the open or closed state of the gate (a hypothetical outline of this control loop is sketched after this section).
In addition, the water quality detection terminal is also integrated with a liquid level sensor for realizing liquid level acquisition at the tail end discharge port of the drainage pipe network, and simultaneously, the real-time power consumption of the water quality detection terminal can be monitored.
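The following is a hypothetical outline of the remote control terminal's loop: poll the water quality terminal over the RTU link, clean the data, query the trained DQN model, and forward the command to the PLC. All interfaces (read_rtu, dqn_decide, plc_write) and the polling period are placeholders for illustration, not an actual driver API of the invention.

```python
import time

def control_loop(read_rtu, dqn_decide, plc_write, period_s: float = 60.0):
    """read_rtu() returns a dict of the four water quality readings;
    dqn_decide(sample) returns 'open' or 'closed'; plc_write(cmd) drives the gate."""
    seen = set()
    while True:
        sample = read_rtu()                    # ammonia N, TP, COD, TDS reading
        key = tuple(sorted(sample.items()))
        if key not in seen:                    # simple de-duplication step
            seen.add(key)
            plc_write(dqn_decide(sample))      # forward gate command to the PLC
        time.sleep(period_s)
```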
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. The intelligent drainage method for the tail end of the drainage pipe network based on deep reinforcement learning is characterized by comprising the following steps of:
s1, a water quality acquisition terminal is arranged at a drainage port at the tail end of a drainage pipe network in advance, and real-time water quality data are acquired;
s2, analyzing the collected real-time water quality data based on a pre-trained DQN model, and controlling a gate at the discharge port to execute opening or closing actions according to an analysis result; each water outlet is provided with a corresponding representative index, if the representative index is more than or equal to 90% of the index emission standard, the gate is opened, and if the representative index is less than 90% of the index emission standard, the water is judged to reach the standard, and the gate is closed;
s3, visually displaying the real-time water quality data and the gate state;
in S2, the training process for the DQN model includes:
s201: initializing the weight theta, iteration times threshold, experience pool and iteration times of a Q neural network of the DQN model;
s202: reading the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establishing the initial state value s_t of the discharge outlet at the tail end of the current drainage pipe network;
S203: judging the current state value s_t as follows: if any index is greater than or equal to 90% of its discharge standard value, continue to step S204; otherwise, repeat steps S202-S203;
s204: calculating a comprehensive pollution index, and if the comprehensive pollution index is higher than or equal to the emission standard, taking the Q value under the index as an optimal threshold for controlling the opening of the gate and controlling the gate action a t For on, continue step S205; if the comprehensive pollution index is lower than the emission standard, controlling the gate action a t Closing and continuing the steps S202-S204;
iterating an index with the highest linear correlation with the comprehensive pollution index from the four water quality data indexes according to the DQN algorithm to serve as a representative index of the corresponding water outlet;
s205: collecting water quality data of the tail end discharge port of the water discharge pipe network at the next time step to obtain a state s t+1 Feedback value r t Will(s) t , a t , r t , s t+1 ) Storing into an experience pool;
s206: judging whether the experience pool is full, if not, repeating S202-S205; if the training is full, the Q neural network is trained repeatedly, S202-S206 are executed repeatedly, and the weight theta of the Q neural network is updated until a preset training target is met.
2. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein the water quality data comprises: ammonia nitrogen content, total phosphorus content, COD data and TDS data.
3. The deep reinforcement learning-based intelligent drainage method at the end of a drainage pipe network according to claim 1, wherein in S206, the weight θ is updated by using a gradient descent method on the loss function according to the Q value calculated by the DQN algorithm;
the loss function is: L_i(θ_i) = E_{s,a∼ρ(s,a)}[(y_i − Q(s,a; θ_i))²];
wherein Q(s,a; θ_i) represents the estimated value of Q(s,a); the subscript s,a∼ρ(s,a) of the expectation E denotes the probability distribution over the water quality state s and the gate action a; y_i represents Q(s,a); i is the iteration number; θ_i represents the weight of the Q neural network at the i-th iteration; s represents the current water quality state; a represents the current command for controlling the gate opening.
4. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 3, wherein in S206, Q(s,a) obtained according to the Bellman equation is denoted as y_i, and the specific calculation formula of y_i is: y_i = E_{s'}[r + γ·max_{a'} Q_i(s', a'; θ_{i−1}) | s, a];
wherein E_{s'} represents the expectation with respect to the next water quality state s'; r represents the reward value obtained after executing action a; max_{a'} represents the maximum Q value over all actions a'; Q_i(s', a'; θ_{i−1}) represents the Q value of each action a' in the next state s' after action a is executed; the condition is (s, a); γ represents the attenuation coefficient; s' represents the water quality state after the gate is opened; a' represents the action threshold of the next gate opening control.
5. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein in S204, an optimal threshold selection function is defined as Q(s,a), and the formula of Q(s,a) is: Q(s,a) = max_π E[r_t | s_t = s, a_t = a, π];
wherein E represents the expectation; s represents the water quality state; a is the command for controlling the gate opening in that state; π represents the mapping from states to actions; s_t is the water quality state at time step t; a_t is the gate opening command transmitted in state s_t; r_t is the feedback value obtained after transmitting the gate opening command a in water quality state s_t.
6. The intelligent drainage method of the drainage pipe network end based on the deep reinforcement learning according to claim 1, wherein in S204, a gate opening instruction with the largest Q value is selected according to epsilon-greedy rule; the epsilon-greedy rule is to select the action with the largest Q value according to the probability of 1-epsilon, randomly select the action according to the probability of epsilon, and randomly explore an unknown state space.
7. The deep reinforcement learning-based intelligent drainage method at the tail end of a drainage pipe network according to claim 1, wherein in S205, when the water quality state is s, a gate opening command a is transmitted and the feedback value r_t is obtained; the calculation formula of the feedback value r_t is: r_t = Σ_{t'≥t} μ^(t'−t)·r_{t'};
wherein t represents the current time step; t' represents the time step at which the gate is opened; μ^(t'−t) represents the discount factor from the gate-opening time step to the current time step; r_{t'} represents the reward value at the gate-opening time step; r_t is the feedback value at time step t; μ represents the discount factor.
8. Intelligent drainage system at end of drainage pipe network based on degree of depth reinforcement study, its characterized in that includes: the intelligent water quality monitoring system comprises a water quality acquisition terminal, an intelligent drainage device, a remote control terminal and a visual platform;
the water quality acquisition terminal is used for acquiring real-time water quality data;
the remote control terminal is used for analyzing the collected real-time water quality data based on a pre-trained DQN model and controlling a gate of the intelligent drainage device to execute opening or closing actions according to analysis results; each water outlet is provided with a corresponding representative index, if the representative index is more than or equal to 90% of the index emission standard, the gate is opened, and if the representative index is less than 90% of the index emission standard, the water is judged to reach the standard, and the gate is closed;
the training process for the DQN model comprises:
s201: initializing the weight theta, iteration times threshold, experience pool and iteration times of a Q neural network of the DQN model;
s202: reading the current water quality data, including ammonia nitrogen, total phosphorus, COD and TDS data, and establishing the initial state value s_t of the discharge outlet at the tail end of the current drainage pipe network;
S203: judging the current state value s_t as follows: if any index is greater than or equal to 90% of its discharge standard value, continue to step S204; otherwise, repeat steps S202-S203;
s204: calculating a comprehensive pollution index, and if the comprehensive pollution index is higher than or equal to the emission standard, taking the Q value under the index as an optimal threshold for controlling the opening of the gate and controlling the gate action a t For on, continue step S205; if the comprehensive pollution index is lower than the emission standard, controlling the gate action a t Closing and continuing the steps S202-S204;
iterating an index with the highest linear correlation with the comprehensive pollution index from the four water quality data indexes according to the DQN algorithm to serve as a representative index of the corresponding water outlet;
s205: collecting water quality data of the tail end discharge port of the water discharge pipe network at the next time step to obtain a state s t+1 Feedback value r t Will(s) t , a t , r t , s t+1 ) Storing into an experience pool;
s206: judging whether the experience pool is full, if not, repeating S202-S205; if the training is full, repeatedly training the Q neural network, repeatedly executing S202-S206, and updating the weight theta of the Q neural network until the preset training target is met;
the visual platform is used for visually displaying real-time water quality data and gate states.
CN202311102920.XA 2023-08-30 2023-08-30 Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning Active CN116819974B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311102920.XA CN116819974B (en) 2023-08-30 2023-08-30 Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311102920.XA CN116819974B (en) 2023-08-30 2023-08-30 Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN116819974A CN116819974A (en) 2023-09-29
CN116819974B true CN116819974B (en) 2023-11-03

Family

ID=88120686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311102920.XA Active CN116819974B (en) 2023-08-30 2023-08-30 Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN116819974B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111489052A (en) * 2020-03-10 2020-08-04 上海水顿智能科技有限公司 Method for carrying out intercepting drainage scheduling by utilizing water quality and water quantity
CN113850692A (en) * 2021-09-26 2021-12-28 天津大学 Urban water supply system gate pump group optimal scheduling method based on deep learning
CN115761316A (en) * 2022-11-08 2023-03-07 中国长江电力股份有限公司 Hydropower station flat gate opening and closing method based on YOLO automatic identification
CN116187208A (en) * 2023-04-27 2023-05-30 深圳市广汇源环境水务有限公司 Drainage basin water quantity and quality joint scheduling method based on constraint reinforcement learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220328189A1 (en) * 2021-04-09 2022-10-13 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing advancements towards annotation efficient deep learning in computer-aided diagnosis

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111489052A (en) * 2020-03-10 2020-08-04 上海水顿智能科技有限公司 Method for carrying out intercepting drainage scheduling by utilizing water quality and water quantity
CN113850692A (en) * 2021-09-26 2021-12-28 天津大学 Urban water supply system gate pump group optimal scheduling method based on deep learning
CN115761316A (en) * 2022-11-08 2023-03-07 中国长江电力股份有限公司 Hydropower station flat gate opening and closing method based on YOLO automatic identification
CN116187208A (en) * 2023-04-27 2023-05-30 深圳市广汇源环境水务有限公司 Drainage basin water quantity and quality joint scheduling method based on constraint reinforcement learning

Also Published As

Publication number Publication date
CN116819974A (en) 2023-09-29

Similar Documents

Publication Publication Date Title
CN111222698B (en) Internet of things-oriented ponding water level prediction method based on long-time and short-time memory network
CN103903452B (en) Forecasting Approach for Short-term Traffic Flow
CN105512767B (en) A kind of Flood Forecasting Method of more leading times
CN104318325B (en) Many basin real-time intelligent water quality prediction methods and system
CN108898215B (en) Intelligent sludge bulking identification method based on two-type fuzzy neural network
CN107480775A (en) A kind of dissolved oxygen in fish pond Forecasting Methodology based on data reparation
CN104899653B (en) Lake storehouse blue-green alga bloom Forecasting Methodology based on expert system and blue algae growth mechanism temporal model
CN108876021B (en) Medium-and-long-term runoff forecasting method and system
CN111652425A (en) River water quality prediction method based on rough set and long and short term memory network
CN106200381B (en) A method of according to the operation of processing water control by stages water factory
CN111695290A (en) Short-term runoff intelligent forecasting hybrid model method suitable for variable environment
CN112330065A (en) Runoff forecasting method based on basic flow segmentation and artificial neural network model
CN107229970B (en) The adaptive dynamic self study on-line monitoring system of shared direct drinking water quality
CN108710964A (en) A kind of prediction technique of Fuzzy time sequence aquaculture water quality environmental data
CN114942948A (en) Drainage pipe network diagnosis and management method
CN116819974B (en) Intelligent drainage method and system for tail end of drainage pipe network based on deep reinforcement learning
KR101585545B1 (en) A method of Wavelet-based autoregressive fuzzy modeling for forecasting algal blooms
CN108894308B (en) Reservoir and water quality monitoring method thereof
CN103955743A (en) Ultrahigh-pressure water jet road mark line cleaning effect forecasting method and device
CN107192802B (en) Shared direct drinking on-line water quality monitoring method and system
CN115618720A (en) Soil salinization prediction method and system based on altitude
CN115204688A (en) Comprehensive evaluation method for health of drainage system
WO2022032873A1 (en) Adversarial neural network-based hydrological parameter calibration method for data-lacking region
CN113723708A (en) Urban daily water consumption prediction method based on machine learning
CN109313416B (en) Method, computer program product and system for dynamically controlling a fluid network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant