CN115423224B - Secondary water supply amount prediction method and device based on big data and storage medium - Google Patents
- Publication number: CN115423224B
- Application number: CN202211373554.7A
- Authority: CN (China)
- Prior art keywords: water consumption; water; residents; resident; time period
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The application discloses a big data-based secondary water supply amount prediction method, equipment and storage medium, and belongs to the technical field of big data prediction algorithms. The method comprises the following steps: establishing a single-family resident water consumption prediction model, which specifically comprises: acquiring water consumption data of single-family residents to generate a data set; dividing all the resident water consumption into 5 grades according to a normal distribution, and grade-marking all the acquired resident water consumption data; and training according to a decision tree algorithm to obtain the single-family resident water consumption prediction model; predicting the water consumption of single-family residents based on the single-family resident water consumption prediction model, and calculating the predicted water consumption of the secondary water supply water consumption point from the water consumption of the single-family residents; and, if the predicted water consumption value and the actual water consumption value of a single-family resident do not belong to the same water consumption grade, correcting the single-family resident water consumption prediction model. The method addresses the overly simple model input parameters and the lack of a feedback optimization mechanism in conventional secondary water supply amount prediction models.
Description
Technical Field
The invention relates to the technical field of big data prediction algorithms, in particular to a big data-based secondary water supply amount prediction method, equipment and a storage medium.
Background
Secondary water supply is a form of supply in which a unit or an individual stores and pressurizes water from the urban public supply or from self-built facilities and delivers it to users, or for its own use, through pipelines. It is established mainly to compensate for insufficient pressure in the municipal water supply pipeline and to better ensure the water supply of residents living on higher floors. The water quality, water pressure and supply safety of secondary water supply are closely related to people's normal and stable life.
There are 2 main forms of secondary water supply. The first is secondary water supply without an underground water tank and a pressurizing water pump, such as a roof water tank; this form is suitable for low-rise buildings. The second is secondary water supply with an underground water tank and a pressurizing water pump, in which the water is pressurized and then passes through a roof water tank, an air pressure tank and a variable-frequency speed-regulating water pump, namely a combined "water tank - water pump - water tank" supply mode; this form is currently the more widely used.
Compared with raw water supply, secondary water supply is more prone to water quality contamination. During storage in a pool or a water tank, poor management, overlong storage time and similar causes increase the turbidity of the water in the pool, the tank and the peripheral water, and raise the content of substances harmful to humans such as bacteria, E. coli, iron, manganese, carbon tetrachloride and nitrite. Therefore, the secondary water supply amount needs to be strictly controlled, so that neither insufficient water pressure caused by an insufficient supply amount nor water pollution caused by increased water age (from overlong storage due to an excessive supply amount) occurs. Meanwhile, sanitation management and supervision need to be strengthened, and cleaning and disinfection carried out regularly, to ensure the drinking water health of residents.
Disclosure of Invention
The invention aims to provide a secondary water supply amount prediction method, equipment and a storage medium based on big data, which can solve the problems of simple model input parameters and lack of a feedback optimization mechanism in the secondary water supply demand amount prediction technology of the existing water consumption site, and ensure that the water consumption of residents is predicted more accurately and meticulously.
The invention provides a secondary water supply amount prediction method based on big data, which comprises the following steps:
establishing a single-family resident water consumption prediction model; the method specifically comprises the following steps:
acquiring water consumption data of single-family residents to generate a data set;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on all the acquired resident water consumption data;
training according to a decision tree algorithm to obtain a single-family resident water consumption prediction model;
predicting the water consumption of single-user residents based on the single-user resident water consumption prediction model, and calculating according to the water consumption of the single-user residents to obtain the water consumption of all residents, namely the water consumption prediction value of the secondary water supply water consumption point;
and if the predicted water consumption value of the single-user residents and the actual water consumption value of the single-user residents do not belong to the same water consumption grade, correcting the single-user resident water consumption prediction model.
Preferably, the acquiring the water consumption data of the single household to generate the data set specifically includes:
the data to be collected includes: the time period $t$, the average temperature $temp$, the weather, the season, the date $d$, the water usage slope of the same time period on the previous day, and the water consumption in the current time period;
the time of day is divided into 4 time periods of 6 hours each, namely (00:00, 06:00], (06:00, 12:00], (12:00, 18:00] and (18:00, 24:00];
the water usage slope and the water consumption of each single-family resident in each time period are calculated by the following formulas:
$$k_t = \frac{w'_t}{\Delta t}, \qquad w_t = r_t - r_{t-1}$$
wherein $k_t$ is the water usage slope value of time period $t$ of the previous day, $w_t$ is the water consumption in time period $t$, $w'_t$ is the water consumption in time period $t$ of the previous day, $\Delta t$ is the time difference of a time period, $r_t$ is the water meter reading of the single-family resident at the last time node of time period $t$, $t-1$ is the previous time period of $t$, and $r_{t-1}$ is the water meter reading of the single-family resident at the last time node of the previous time period.
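As a rough illustration of the per-period calculation described above, the following Python sketch derives period consumption from successive water meter readings. The function names, the sample readings, the 6-hour period length, and the reading of the slope as previous-day consumption divided by the period length are assumptions made for illustration, not details confirmed by the text.

```python
PERIOD_HOURS = 6.0  # assumed length of one time period (4 periods per day)

def period_consumption(readings):
    """Consumption per period from meter readings taken at the end of each period.

    readings[0] is the reading at the end of the period preceding the first one,
    so len(readings) - 1 consumption values are returned.
    """
    return [curr - prev for prev, curr in zip(readings, readings[1:])]

def usage_slope(prev_day_consumption, period_hours=PERIOD_HOURS):
    """Assumed slope: previous-day consumption of the same period per hour."""
    return prev_day_consumption / period_hours

# Hypothetical meter readings (m^3): end of the previous day, then end of each period.
readings = [100.0, 100.25, 100.75, 101.0, 101.5]
consumption = period_consumption(readings)
# Slopes that would be fed to the model as features on the following day.
slopes = [usage_slope(w) for w in consumption]
```

Keeping the raw readings and deriving consumption by differencing means a missed reading only affects the two periods adjacent to it.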
Preferably, the dividing the total domestic water consumption into 5 grades according to the normal distribution and performing grade marking on the obtained total domestic water consumption data comprises:
dividing all the resident water consumption into 5 intervals, which respectively correspond to the 5 grades of all the resident water consumption, the probability of each grade being 0.2, wherein the probability density function of the normal distribution is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$
wherein $\mu$ is the mean water consumption of all residents, $N$ is the total number of samples in the data set $D$, $\sigma^2$ is the variance of the water consumption of all residents, $x_i$ is the water consumption of the $i$-th sample, $e$ is the base of the natural logarithm, $f(x)$ is the probability density function of the water consumption of all residents, and $x$ is the water consumption of a sample;
the probability of each grade of all the resident water consumption is 0.2, that is, the integral of the probability density function of all the resident water consumption over each grade interval is 0.2, and the integral is calculated as:
$$P(a < x \le b) = \int_a^b f(x)\,dx$$
wherein $a$ and $b$ represent the lower and upper limits of the interval, and $P(a < x \le b)$ indicates the probability of the interval $(a, b]$;
since the integral over the first grade interval equals 0.2, i.e. $\int_{-\infty}^{x_1} f(x)\,dx = 0.2$, the first grade node $x_1$ of all the resident water consumption can be obtained, and so on to obtain the values of the grade nodes $x_2$, $x_3$ and $x_4$, whereby the 5 grade intervals of all the resident water consumption are obtained, respectively: $(-\infty, x_1]$, $(x_1, x_2]$, $(x_2, x_3]$, $(x_3, x_4]$ and $(x_4, +\infty)$.
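The equal-probability grading described above amounts to taking the 20%, 40%, 60% and 80% quantiles of the fitted normal distribution. A minimal Python sketch follows, using only the standard library; the sample values, the function names and the choice of right-closed intervals are illustrative assumptions.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev

def grade_nodes(samples, n_grades=5):
    """Cut points splitting a fitted normal distribution into n_grades
    intervals of equal probability (0.2 each when n_grades is 5)."""
    # pstdev is the population standard deviation (divides by N).
    dist = NormalDist(mu=fmean(samples), sigma=pstdev(samples))
    return [dist.inv_cdf(k / n_grades) for k in range(1, n_grades)]

def grade_of(value, nodes):
    """Grade 1..len(nodes)+1; a value equal to a node falls in the lower grade."""
    return bisect_left(nodes, value) + 1

# Hypothetical per-period consumption samples (m^3).
samples = [0.21, 0.35, 0.40, 0.48, 0.52, 0.60, 0.33, 0.45, 0.55, 0.38]
nodes = grade_nodes(samples)
labels = [grade_of(w, nodes) for w in samples]
```

Because the cut points come from the fitted distribution rather than from empirical quantiles, the observed share of samples per grade is only approximately 0.2 on small data sets.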
Preferably, the training according to the decision tree algorithm to obtain the single-family residential water prediction model comprises:
the time period $t$, the average temperature $temp$, the weather, the season, the date $d$ and the water usage slope of the same time period of the previous day are respectively used as attributes of the decision tree; the splitting attribute of the current decision tree node is determined according to the splitting information, the information gain and the information gain rate of each attribute, which are respectively calculated as follows:
$$SplitInfo_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2 \frac{|D_j|}{|D|}$$
wherein attribute $A$ divides the training data set $D$ into $v$ sub-data sets, $|D_j|$ is the number of samples in the $j$-th subset, $|D|$ is the number of samples in the training data set $D$ before division, and $SplitInfo_A(D)$ represents the splitting information of attribute $A$;
$$Info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i, \qquad Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, Info(D_j), \qquad Gain(A) = Info(D) - Info_A(D)$$
wherein $C_i$ is a classification label, namely a water consumption grade of all residents, $p_i$ is the probability that classification label $C_i$ appears in the training data set $D$, $m$ is the number of classification labels and is the constant 5, namely the 5 water consumption grades; $Info(D)$ is the information entropy of the training data set $D$, $Info_A(D)$ is the information entropy after splitting by attribute $A$, and $Gain(A)$ is the information gain after attribute $A$ splits the training data set $D$;
$$GainRatio(A) = \frac{Gain(A)}{SplitInfo_A(D)}$$
wherein $GainRatio(A)$ is the information gain rate after attribute $A$ splits the training data set $D$; attribute $A$ ranges over each of the above attributes, and the information gain rate of each attribute is obtained in sequence through the above calculation;
selecting the attribute with the maximum information gain rate as the splitting attribute of the node of the current decision tree, and gradually selecting the node splitting attribute of the next level of the decision tree by a recursive method until the construction of a decision tree model is completed, namely generating an initial single-family resident water prediction model;
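The attribute selection step can be sketched in Python as follows. This is a minimal illustration of the standard C4.5 gain ratio, assuming every attribute has already been discretized (continuous attributes such as temperature would need thresholds first); the function names and the toy rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute.

    values[i] is the attribute value of sample i, labels[i] its grade.
    """
    n = len(labels)
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    info_after = sum(len(s) / n * entropy(s) for s in subsets.values())
    split_info = entropy(values)  # entropy of the partition sizes
    gain = entropy(labels) - info_after
    return gain / split_info if split_info > 0 else 0.0

def best_attribute(rows, labels):
    """rows: list of dicts (attribute -> value); returns the max gain-ratio attribute."""
    return max(rows[0].keys(),
               key=lambda a: gain_ratio([r[a] for r in rows], labels))

# Toy data: "season" separates the grades perfectly, "weather" not at all.
rows = [
    {"season": "summer", "weather": "sunny"},
    {"season": "summer", "weather": "rain"},
    {"season": "winter", "weather": "sunny"},
    {"season": "winter", "weather": "rain"},
]
labels = [4, 4, 2, 2]
chosen = best_attribute(rows, labels)
```

Returning 0.0 when the splitting information is zero avoids a division by zero for attributes that take a single value at the current node.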
solving the overfitting problem of the decision tree algorithm by the PEP (pessimistic error pruning) method; first, the misjudgment rate $e_T$ of the subtree to be pruned needs to be obtained, the specific calculation formula being:
$$e_T = \frac{\sum_{i=1}^{L} E_i + \alpha L}{\sum_{i=1}^{L} N_i}$$
wherein $\alpha$ is a penalty factor and is an adjustable parameter, $i$ is a leaf node of the subtree, $L$ is the number of leaf nodes of the subtree, $E_i$ is the number of misjudged samples of leaf node $i$, and $N_i$ is the number of samples of leaf node $i$;
the mean $\mu_T$ and standard deviation $\sigma_T$ of the number of misjudgments of the subtree are:
$$\mu_T = e_T \sum_{i=1}^{L} N_i, \qquad \sigma_T = \sqrt{e_T\,(1 - e_T) \sum_{i=1}^{L} N_i}$$
assuming that the subtree is replaced by a leaf node, namely, the root node of the subtree is used as a leaf node and the remaining nodes of the subtree are deleted, the misjudgment rate of the leaf node is:
the pruning conditions are as follows:
and when the pruning condition is met, executing pruning operation, taking the root node of the subtree as a leaf node, deleting the other nodes of the subtree, and performing pruning calculation to obtain a complete single-family resident water use prediction model.
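The pruning decision can be sketched as below. The exact formulas of this step are not fully legible here, so the sketch follows the commonly cited form of pessimistic error pruning: a per-leaf penalty (0.5 assumed as the default), a binomial standard deviation on the subtree error count, and pruning when the penalized error count of the replacement leaf does not exceed the subtree's penalized error count plus one standard deviation. The function name and the sample counts are hypothetical.

```python
from math import sqrt

def should_prune(leaf_errors, leaf_counts, root_errors, penalty=0.5):
    """Decide whether to replace a subtree by a single leaf.

    leaf_errors[i], leaf_counts[i]: misjudged and total samples of leaf i.
    root_errors: misjudged samples if the subtree root became a leaf.
    penalty: per-leaf penalty factor (0.5 is an assumed default).
    """
    n = sum(leaf_counts)
    subtree_rate = (sum(leaf_errors) + penalty * len(leaf_errors)) / n
    subtree_mean = n * subtree_rate
    subtree_std = sqrt(n * subtree_rate * (1 - subtree_rate))
    # Expected errors of the single replacement leaf: n * (root_errors + penalty) / n.
    leaf_mean = root_errors + penalty
    return leaf_mean <= subtree_mean + subtree_std

# Hypothetical subtree with three leaves of 10 samples and 1 error each.
prune_cheap = should_prune([1, 1, 1], [10, 10, 10], root_errors=4)
prune_costly = should_prune([1, 1, 1], [10, 10, 10], root_errors=10)
```

A larger penalty makes subtrees with many leaves look worse and therefore prunes more aggressively.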
Preferably, the water consumption of the single-user residents is predicted based on the single-user resident water prediction model, and the water consumption of all the residents, namely the water consumption prediction value of the secondary water supply water consumption point, is obtained through calculation according to the water consumption of the single-user residents; the method comprises the following steps:
the time period $t$, the average temperature $temp$, the weather, the season, the date $d$ and the water usage slope of the same time period on the previous day are input into the single-family resident water consumption prediction model as input parameters, and the water consumption of a single-family resident in the 4 time periods of one day is predicted;
and accumulating the water consumption of all residents in all time periods to obtain the water consumption of all residents, namely the daily water consumption prediction value of the secondary water supply water consumption point.
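The accumulation step is a plain double sum over households and time periods. The short Python sketch below assumes the per-period predictions are already expressed as quantities in cubic metres (how a predicted grade is converted to a quantity is not specified here); the function name and the numbers are illustrative.

```python
def daily_supply_forecast(predictions):
    """predictions[h][p]: predicted consumption of household h in period p (m^3)."""
    return sum(sum(periods) for periods in predictions)

# Hypothetical predictions for 3 households x 4 time periods.
predictions = [
    [0.25, 0.5, 0.25, 0.5],
    [0.5, 0.5, 0.5, 0.5],
    [0.0, 0.25, 0.25, 0.5],
]
total = daily_supply_forecast(predictions)
```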
Preferably, if the predicted value of the water consumption of the single-user residents and the actual value of the water consumption of the single-user residents do not belong to the same water consumption level, the modifying the prediction model of the water consumption of the single-user residents comprises the following steps:
obtaining the true water consumption values of the single-family resident in one day, the matrix of true water consumption values of the single-family resident in one day being:
$$W^{real} = \begin{bmatrix} w^{real}_{t_1} & w^{real}_{t_2} & w^{real}_{t_3} & w^{real}_{t_4} \end{bmatrix}$$
wherein $w^{real}_{t_1}$, $w^{real}_{t_2}$, $w^{real}_{t_3}$ and $w^{real}_{t_4}$ are the true water consumption values of the single-family resident in time periods $t_1$, $t_2$, $t_3$ and $t_4$ respectively; the predicted water consumption values of the single-family resident in one day and the true water consumption values of the single-family resident in one day are marked with water consumption grades, and it is compared whether the water consumption in the corresponding time periods of the predicted values and the true values belongs to the same water consumption grade; when the predicted value and the true value of all the corresponding time periods belong to the same water consumption grade, the single-family resident water consumption prediction model reaches the use standard;
and if the use standard is not met, the step of establishing the single-family resident water use prediction model is executed again.
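The feedback check above can be sketched as a grade-by-grade comparison over the 4 time periods. In this Python illustration the grade nodes, the function names and the sample values are hypothetical; the only behaviour taken from the text is that every period must fall in the same grade for the model to pass.

```python
from bisect import bisect_left

def grade_of(value, nodes):
    """Grade 1..len(nodes)+1 for a consumption value, given sorted grade nodes."""
    return bisect_left(nodes, value) + 1

def meets_use_standard(predicted, actual, nodes):
    """True only if every period's predicted and actual values share a grade."""
    return all(grade_of(p, nodes) == grade_of(a, nodes)
               for p, a in zip(predicted, actual))

nodes = [0.2, 0.4, 0.6, 0.8]          # assumed grade nodes (m^3)
predicted = [0.1, 0.3, 0.5, 0.9]      # one household, 4 periods
actual_ok = [0.15, 0.35, 0.55, 0.85]
actual_off = [0.15, 0.45, 0.55, 0.85]  # second period falls in another grade

passed = meets_use_standard(predicted, actual_ok, nodes)
needs_retraining = not meets_use_standard(predicted, actual_off, nodes)
```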
The second purpose of the invention can be achieved by adopting the following technical scheme:
a computer device comprising a processor and a memory for storing a processor executable program, the processor implementing the above-described big data based secondary water supply amount prediction method when executing the program stored in the memory.
The third purpose of the invention can be achieved by adopting the following technical scheme:
a storage medium storing a program which, when executed by a processor, implements the above-described secondary water supply amount prediction method based on big data.
Compared with the prior art, which predicts the daily water consumption of the secondary water supply water consumption point as a whole, the method predicts the water consumption of each household from the bottom layer up; the stored data are more detailed and can cope with more complex scenes. Compared with the prior art, in which the water use prediction model is trained only once, the method adopts a feedback optimization mechanism of random spot checks to periodically check the accuracy and robustness of the model, and feeds the check results back to the secondary water supply control center to iteratively optimize and update the existing water use prediction model.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
FIG. 1 is a flow chart of a big data based secondary water supply prediction method according to the present invention;
FIG. 2 is a schematic diagram of a secondary water supply system according to the present invention;
FIG. 3 is a diagram illustrating a four-dimensional structure of a data set D according to a big data-based method for predicting secondary water supply according to the present invention;
FIG. 4 is a flow chart of establishing the single-family resident water consumption prediction model in the big data-based secondary water supply amount prediction method according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that all directional indicators (such as up, down, left, right, front, back ...) in the embodiments of the present invention are only used to explain the relative positional relationship between the components, the motion situation, etc. in a specific posture (as shown in the attached drawings), and if the specific posture is changed, the directional indicator is changed accordingly.
In addition, the descriptions related to "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
The invention discloses a secondary water supply amount prediction method based on big data, which comprises the following steps with reference to figures 1-4:
step 102, dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on all the acquired resident water consumption data;
step 103, training according to a decision tree algorithm to obtain a single-family resident water consumption prediction model;
and step 300, if the predicted water consumption value of the single-user residents and the actual water consumption value of the single-user residents do not belong to the same water consumption grade, correcting the single-user resident water consumption prediction model.
Compared with the prior art, which predicts the daily water consumption of the secondary water supply water consumption point as a whole, the method predicts the water consumption of each household from the bottom layer up; the stored data are more detailed and can cope with more complex scenes. Compared with the prior art, in which the water use prediction model is trained only once, the method adopts a feedback optimization mechanism of random spot checks to periodically check the accuracy and robustness of the model, and feeds the check results back to the secondary water supply control center to iteratively optimize and update the existing water use prediction model. When the predicted water consumption is small, the water supply amount is reduced, which further reduces the users' water consumption; if the water consumption under this condition were used as an index, the prediction model would fall into a cycle of continuously decreasing water supply. Using the local minimum average water consumption as the minimum water level threshold solves the problem of the prediction model falling into this vicious cycle.
The secondary water supply system structure of the invention is shown in figure 2, which comprises 4 nodes of a secondary water supply point, a secondary water supply water consumption control center, a secondary water supply water consumption point and a water using resident, wherein the secondary water supply point and the secondary water supply water consumption control center are positioned at the same physical position; the secondary water supply point is mainly responsible for water supply tasks and can adjust the water supply amount through an automatic control valve; the secondary water supply control center is used as an intelligent brain of the water supply system, is mainly responsible for the functions of information interaction, data storage and algorithm calculation, and intelligently controls the automatic control valve to adjust the water supply amount; the secondary water supply water consumption point provides secondary water supply for residents using water, such as communities, gardens and the like, and water level data can be sent to a secondary water supply water consumption control center through a water level sensor; the water consumption residents automatically recognize the reading of the water meter through the water meter reading robot and send the reading data to the secondary water supply water control center.
the data to be collected includes: the time period $t$, the average temperature $temp$, the weather, the season, the date $d$, the water usage slope of the same time period on the previous day, and the water consumption in the current time period;
the time of day is divided into 4 time periods of 6 hours each, namely (00:00, 06:00], (06:00, 12:00], (12:00, 18:00] and (18:00, 24:00];
the water usage slope and the water consumption of single-family residents in each time period are calculated by the following formulas:
$$k_t = \frac{w'_t}{\Delta t}, \qquad w_t = r_t - r_{t-1}$$
wherein $k_t$ is the water usage slope value of time period $t$ of the previous day, $w_t$ is the water consumption in time period $t$, $w'_t$ is the water consumption in time period $t$ of the previous day, $\Delta t$ is the time difference of a time period, $r_t$ is the water meter reading of the single-family resident at the last time node of time period $t$, $t-1$ is the previous time period of $t$, and $r_{t-1}$ is the water meter reading of the single-family resident at the last time node of the previous time period.
After one week of raw data acquisition and accumulation, a data set D is obtained. FIG. 3 is an example diagram of the four-dimensional structure of the data set D, wherein $t_1$, $t_2$, $t_3$ and $t_4$ respectively represent the time periods (00:00, 06:00], (06:00, 12:00], (12:00, 18:00] and (18:00, 24:00]; for each time period $t$, the recorded parameters are the average temperature, the weather condition, the season, the date, the water usage slope of the same time period on the previous day, and the water consumption of the single-family resident. The parameter matrix of a single time period of a single-family resident is a one-dimensional matrix; the parameter matrix of the 4 time periods of one day of a single-family resident is a two-dimensional matrix; the parameter matrix of the 4 time periods of one day of all users in an area is a three-dimensional matrix; and the parameter matrix of the 4 time periods of all users in the area over a cycle is a four-dimensional matrix. Taking the time period, the average temperature, the weather, the season, the date and the water usage slope of the same time period on the previous day as input, and the resident water consumption in the current time period as the prediction result, a secondary water supply water consumption prediction model is trained.
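One way to picture the four-dimensional structure is as nested containers indexed by day, household and time period, with the parameter record as the innermost axis. The Python sketch below is only an illustration of that layout; the field names, the 7-day cycle, the household count and the sample record are assumptions.

```python
FEATURES = ("temp", "weather", "season", "date", "prev_day_slope", "consumption")
PERIODS = ("(00:00,06:00]", "(06:00,12:00]", "(12:00,18:00]", "(18:00,24:00]")

def empty_dataset(n_days, n_households):
    """D[day][household][period] -> dict of parameters (the fourth axis)."""
    return [[[dict.fromkeys(FEATURES) for _ in PERIODS]
             for _ in range(n_households)]
            for _ in range(n_days)]

D = empty_dataset(n_days=7, n_households=3)
# Hypothetical record: day 0, household 1, third period of the day.
D[0][1][2].update(temp=21.5, weather="sunny", season="autumn",
                  date="2022-11-01", prev_day_slope=0.05, consumption=0.3)
shape = (len(D), len(D[0]), len(D[0][0]), len(D[0][0][0]))
```

Fixing the day and household yields the two-dimensional matrix for one household's day, and fixing only the day yields the three-dimensional matrix for the whole area.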
Step 102, dividing all the resident water consumption into 5 grades according to normal distribution, and performing grade marking on all the acquired resident water consumption data, which specifically comprises the following steps:
The normal distribution is a very common continuous probability distribution. When a quantity under study has a reference value and fluctuates around it with a certain amplitude, so that values are dense in the middle and sparse on both sides, it follows a normal distribution. Residents' domestic water use habits are basically maintained around a stable reference point, and domestic water use differs with factors such as time spent at home and dietary differences, so residential water consumption satisfies the characteristics of a normal distribution.
Dividing all the resident water consumption into 5 intervals, corresponding to 5 grades of resident water consumption, wherein the probability of each grade is 0.2, and the probability density function of the normal distribution is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\mu)^2$$
wherein $\mu$ is the mean water consumption of all residents, $n$ is the total number of samples in data set D, $\sigma^2$ is the variance of the water consumption of all residents, $x_i$ is the water consumption of sample $i$, $e$ is the base of the natural logarithm, $f(x)$ is the probability density function of the water consumption of all residents, and $x$ is the water consumption;
The probability of each grade of resident water consumption is 0.2, that is, the integral of the probability density function of the water consumption of all residents over each grade interval is 0.2, and the integral is calculated as:
$$P(a < x \le b) = \int_{a}^{b} f(x)\,dx = 0.2$$
wherein $a$ and $b$ represent the lower and upper limits of the interval, and $P(a < x \le b)$ indicates the probability of the interval $(a, b]$;
setting the integral over the first grade interval equal to 0.2, that is, requiring the integral of the probability density function up to the first grade node to be 0.2, yields the first grade node of the water consumption of all residents; the remaining grade nodes are found in the same way, which gives the 5 grade intervals of the water consumption of all residents.
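The equal-probability grading can be sketched with Python's standard library. This is a minimal sketch under assumptions: the function names and sample values are hypothetical, the standard deviation is the population form (dividing by n), and each grade interval is treated as closed on the right.

```python
from statistics import NormalDist, fmean, pstdev

def grade_boundaries(samples, n_grades=5):
    """Return the n_grades - 1 cut points that give each grade equal probability."""
    mu = fmean(samples)
    sigma = pstdev(samples)  # population standard deviation (divides by n)
    dist = NormalDist(mu, sigma)
    # The k-th cut point is where the cumulative probability reaches k / n_grades.
    return [dist.inv_cdf(k / n_grades) for k in range(1, n_grades)]

def grade_of(x, boundaries):
    """1-based grade of x; intervals are treated as closed on the right."""
    for grade, upper in enumerate(boundaries, start=1):
        if x <= upper:
            return grade
    return len(boundaries) + 1

samples = [0.08, 0.10, 0.12, 0.14, 0.16]  # made-up usage values
cuts = grade_boundaries(samples)
labels = [grade_of(x, cuts) for x in samples]
```

Using the inverse cumulative distribution function avoids solving each integral numerically: the node that closes the first interval is the point where the cumulative probability equals 0.2, the next is at 0.4, and so on.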
Preferably, in step 103, training according to the decision tree algorithm to obtain the single-household resident water prediction model includes: the time period t, the average temperature temp, the weather, the season, the date d, and the water usage slope of the same time period on the previous day are each used as an attribute of the decision tree, and the splitting attribute of the current decision tree node is determined from the splitting information, the information gain, and the information gain rate of each attribute, which are calculated as follows:
$$\mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|}\log_2\frac{|D_j|}{|D|}$$
wherein attribute $A$ divides the training data set $D$ into $v$ sub-data sets, $|D_j|$ is the number of samples in the $j$-th subset, $|D|$ is the number of samples in the training data set before division, and $\mathrm{SplitInfo}_A(D)$ represents the splitting information of attribute $A$;
$$\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i, \qquad \mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\,\mathrm{Info}(D_j), \qquad \mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D)$$
wherein the classification labels are the water consumption grades of all residents, $p_i$ is the frequency with which classification label $i$ appears in the training data set $D$, $m$ is the number of classification labels, a constant 5, i.e. the 5 water consumption grades of all residents; $\mathrm{Info}(D)$ is the information entropy of the training data set $D$, $\mathrm{Info}_A(D)$ is the information entropy after splitting by attribute $A$, and $\mathrm{Gain}(A)$ is the information gain after attribute $A$ splits the training data set $D$;
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}_A(D)}$$
wherein $\mathrm{GainRatio}(A)$ is the information gain rate after attribute $A$ splits the training data set $D$; attribute $A$ is taken in turn as the time period, the average temperature, the weather, the season, the date, and the water usage slope of the same time period on the previous day, and the information gain rate of each attribute is obtained through the calculation in sequence;
selecting the attribute with the maximum information gain rate as the splitting attribute of the node of the current decision tree, and gradually selecting the node splitting attribute of the next level of the decision tree by a recursive method until the construction of a decision tree model is completed, namely generating an initial single-family resident water prediction model;
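The attribute selection by information gain rate can be illustrated with a short Python sketch. It assumes categorical attribute values (a continuous attribute such as average temperature would first need to be discretized); the function names and the toy rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Information entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """Information gain rate of splitting (rows, labels) on attribute attr."""
    n = len(rows)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    # Entropy remaining after the split, weighted by subset size.
    info_after = sum(len(s) / n * entropy(s) for s in subsets.values())
    # Splitting information penalizes attributes with many small subsets.
    split_info = -sum(len(s) / n * log2(len(s) / n) for s in subsets.values())
    gain = entropy(labels) - info_after
    return gain / split_info if split_info > 0 else 0.0

def best_attribute(rows, labels, attrs):
    """Attribute with the largest information gain rate."""
    return max(attrs, key=lambda a: gain_ratio(rows, labels, a))

# Toy data: the labels are water consumption grades (made-up values).
rows = [
    {"season": "summer", "weather": "sunny"},
    {"season": "summer", "weather": "rain"},
    {"season": "winter", "weather": "sunny"},
    {"season": "winter", "weather": "rain"},
]
grades = [5, 5, 1, 1]
chosen = best_attribute(rows, grades, ["season", "weather"])
```

In this toy set, splitting on season separates the grades completely while weather carries no information, so season is chosen as the splitting attribute. Applying `best_attribute` recursively to each subset would grow the tree level by level.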
solving the overfitting problem of the decision tree algorithm by the PEP (pessimistic error pruning) method: first, the misjudgment rate of the subtree to be pruned is obtained, and the specific calculation formula is as follows:
$$e_T = \frac{\sum_{i=1}^{L} E_i + \alpha L}{\sum_{i=1}^{L} N_i}$$
wherein $\alpha$ is an adjustable parameter serving as a penalty factor, $i$ indexes the leaf nodes of the subtree, $L$ is the number of leaf nodes of the subtree, $E_i$ is the number of misjudged samples at leaf node $i$ of the subtree, and $N_i$ is the number of samples at leaf node $i$;
the mean value $\bar{E}_T$ and the standard deviation $\sigma_T$ of the number of misjudgments of the subtree are:
$$\bar{E}_T = N e_T, \qquad \sigma_T = \sqrt{N e_T (1 - e_T)}, \qquad N = \sum_{i=1}^{L} N_i$$
assuming that the subtree is replaced by a leaf node, namely, the root node of the subtree is used as a leaf node and the remaining nodes of the subtree are deleted, the misjudgment rate of the leaf node is:
$$e_t = \frac{E_t + \alpha}{N}$$
wherein $E_t$ is the number of samples misjudged when the root node of the subtree is used as a leaf node;
the pruning conditions are as follows:
and when the pruning condition is met, executing pruning operation, taking the root node of the subtree as a leaf node, deleting the other nodes of the subtree, and performing pruning calculation to obtain a complete single-family resident water use prediction model.
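The pruning check can be sketched as follows. This is a sketch under stated assumptions, not the patent's own formula: it uses the classic pessimistic error pruning criterion (prune when the corrected error count of the replacing leaf does not exceed the corrected error count of the subtree plus one standard deviation), and the default penalty factor of 0.5 is the value commonly used in the literature. The inequality actually used in the patent may differ. The function names and sample counts are hypothetical.

```python
from math import sqrt

def subtree_stats(leaves, alpha=0.5):
    """leaves: list of (misjudged, samples) pairs, one pair per leaf of the subtree.

    Returns the corrected misjudgment rate, and the mean and standard
    deviation of the misjudgment count under a binomial model.
    """
    errors = sum(e for e, _ in leaves)
    n = sum(s for _, s in leaves)
    rate = (errors + alpha * len(leaves)) / n
    mean = n * rate
    std = sqrt(n * rate * (1 - rate))
    return rate, mean, std

def should_prune(leaves, errors_as_leaf, alpha=0.5):
    """Assumed criterion: leaf mean <= subtree mean + one standard deviation."""
    _, mean_subtree, std_subtree = subtree_stats(leaves, alpha)
    # Mean misjudgment count of the replacing leaf: n * (E + alpha) / n.
    mean_leaf = errors_as_leaf + alpha
    return mean_leaf <= mean_subtree + std_subtree

# A subtree with three leaves (made-up counts).
leaves = [(1, 10), (2, 10), (1, 10)]
rate, mean, std = subtree_stats(leaves)
```

With these counts the subtree has a corrected mean of 5.5 misjudgments, so collapsing it into a leaf that misjudges 6 samples is accepted under the assumed criterion, while a leaf that misjudges 10 samples is not.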
Preferably, the step 200 of predicting the water consumption of the single-user residents based on the single-user residential water prediction model, and calculating the water consumption of all the residents according to the water consumption of the single-user residents to obtain the predicted water consumption value of the secondary water supply point, includes:
step 201, the time period, the average temperature, the weather, the season, the date, and the water usage slope of the same time period on the previous day are input into the single-household resident water prediction model as input parameters, and the water consumption of the single household in each of the 4 time periods of one day is predicted;
and step 202, accumulating the water consumption of all residents in all time periods to obtain the water consumption of all residents, namely the daily water consumption prediction value of the secondary water supply water consumption point.
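The accumulation in step 202 amounts to a double sum over households and time periods. A minimal sketch, assuming the per-household predictions are already available as amounts; the dictionary layout, household identifiers, and values are hypothetical.

```python
def period_totals(predictions):
    """Sum the predicted usage of all households for each time period."""
    n_periods = len(next(iter(predictions.values())))
    return [sum(p[k] for p in predictions.values()) for k in range(n_periods)]

def daily_supply_forecast(predictions):
    """Predicted daily water consumption of the secondary water supply point."""
    return sum(period_totals(predictions))

# predictions[household_id] -> predicted usage in each of the 4 periods.
predictions = {
    "household_a": [1.0, 2.0, 3.0, 4.0],
    "household_b": [0.5, 0.5, 0.5, 0.5],
}
totals = period_totals(predictions)
forecast = daily_supply_forecast(predictions)
```

Keeping the per-period totals as an intermediate result is a design choice of this sketch: it exposes the load of each 6-hour period in addition to the daily figure.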
Preferably, in step 300, if the predicted value and the actual value of the water consumption of the single household do not belong to the same water consumption grade, correcting the single-household resident water prediction model specifically comprises the following steps:
step 301, acquiring the actual water consumption of a single household over one day, arranged as a matrix with one entry for each of the 4 time periods of the day, each entry being the actual water consumption of the single household in that time period; the predicted daily water consumption of the single household and its actual daily water consumption are each marked with water consumption grades, and the grades of corresponding time periods are compared; when the predicted value and the actual value belong to the same water consumption grade in every corresponding time period, the single-household resident water prediction model reaches the use standard;
and step 302, if the use standard is not met, re-establishing the single-household residential water use prediction model.
The invention marks the predicted daily water consumption of a single household and its actual daily water consumption with water consumption grades, and compares whether the predicted value and the actual value of each corresponding time period belong to the same water consumption grade. When the predicted values and the actual values of all corresponding time periods belong to the same water consumption grade, the resident water prediction model reaches the use standard and is fixed; the water consumption of each single household is then predicted at 0:00 every day, and the predictions are summed to obtain the total predicted water consumption of the secondary water supply point. Otherwise, data acquisition continues, the data are stored in the database, and the establishment of the prediction model is repeated until the resident water prediction model reaches the use standard.
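The acceptance check described above can be sketched in Python. The grade boundaries, function names, and usage values below are hypothetical; grade intervals are assumed to be closed on the right.

```python
from bisect import bisect_left

def grade_of(x, boundaries):
    """1-based grade of x given ascending cut points (right-closed intervals)."""
    return bisect_left(boundaries, x) + 1

def meets_use_standard(predicted, actual, boundaries):
    """True when every period's predicted and actual usage share a grade."""
    return all(grade_of(p, boundaries) == grade_of(a, boundaries)
               for p, a in zip(predicted, actual))

boundaries = [0.1, 0.2, 0.3, 0.4]       # four nodes -> five grades (made up)
predicted = [0.05, 0.15, 0.25, 0.45]    # one value per time period
actual_ok = [0.08, 0.18, 0.28, 0.50]
actual_bad = [0.08, 0.25, 0.28, 0.50]   # second period falls in another grade

accepted = meets_use_standard(predicted, actual_ok, boundaries)
rejected = not meets_use_standard(predicted, actual_bad, boundaries)
```

A single mismatching period is enough to fail the check, which matches the requirement that all corresponding time periods agree before the model is fixed.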
Example 2:
the embodiment provides a computer device, which comprises a processor and a memory for storing an executable program of the processor, wherein when the processor executes the program stored in the memory, the establishment of the single-household residential water prediction model in the embodiment 1 is realized; the method specifically comprises the following steps: acquiring water consumption data of single-family residents to generate a data set;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on all the acquired resident water consumption data; training according to a decision tree algorithm to obtain a single-family resident water consumption prediction model; predicting the water consumption of single-household residents based on a single-household residential water prediction model, and calculating according to the water consumption of the single-household residents to obtain the water consumption of all the residents, namely the water consumption prediction value of the secondary water supply water consumption point; and if the predicted value of the water consumption of the single-user residents and the actual value of the water consumption of the single-user residents do not belong to the same water consumption grade, correcting the prediction model of the water consumption of the single-user residents.
Example 3:
the present embodiment provides a storage medium, which is a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the establishment of the single-household resident water prediction model of embodiment 1 is implemented; the method specifically comprises the following steps: acquiring water consumption data of single-household residents to generate a data set; dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on all the acquired resident water consumption data; training according to a decision tree algorithm to obtain a single-household resident water prediction model; predicting the water consumption of single-household residents based on the single-household resident water prediction model, and calculating the water consumption of all the residents from the water consumption of the single-household residents, namely the predicted water consumption of the secondary water supply point; and if the predicted water consumption of a single household and its actual water consumption do not belong to the same water consumption grade, correcting the single-household resident water prediction model.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (5)
1. A secondary water supply amount prediction method based on big data is characterized by comprising the following steps:
establishing a residential water prediction model; the method specifically comprises the following steps:
acquiring resident water consumption data to generate a data set;
the acquiring the residential water data and generating the data set specifically comprises the following steps:
the data to be collected includes: the water use time period, the average temperature, the weather, the season, the date, the water usage slope of the same time period on the previous day, and the water consumption of the current time period;
the day is divided into 4 time periods of 6 hours each, namely (00:00,06:00], (06:00,12:00], (12:00,18:00] and (18:00,24:00];
the water consumption of each household in each time period is calculated by the following formulas:
$$k = \frac{q_t - q_{t-1}}{\Delta t}, \qquad q_t = R_t - R_{t-1}$$
wherein $k$ is the water usage slope value of the same time period on the previous day, $q_t$ is the water consumption of time period $t$, $q_{t-1}$ is the water consumption of the time period preceding $t$, $\Delta t$ is the time difference of the time period, $R_t$ is the resident water meter reading at the last time node of time period $t$, $t-1$ is the time period preceding $t$, and $R_{t-1}$ is the resident water meter reading at the last time node of the preceding time period;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on the obtained resident water consumption data;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on the existing data specifically comprises the following steps:
the range of the water consumption of residents is divided into 5 intervals, corresponding to the 5 grades of the water consumption of residents, the probability of each grade is 0.2, and the probability density function of the normal distribution is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\mu)^2$$
wherein $\mu$ is the average water consumption of residents, $n$ is the total number of samples in the sample set D, $\sigma^2$ is the variance of the water consumption of residents, $f(x)$ is the probability density function of the water consumption of residents, and $x_i$ is the water consumption of sample $i$;
the probability of each grade of resident water consumption is 0.2, that is, the integral of the probability density function of the resident water consumption over each grade interval is 0.2, and the integral is calculated as:
$$P(a < x \le b) = \int_{a}^{b} f(x)\,dx = 0.2$$
wherein $a$ and $b$ represent the lower and upper limits of the interval, and $P(a < x \le b)$ indicates the probability of the interval $(a, b]$;
setting the integral over the first grade interval equal to 0.2, that is, requiring the integral of the probability density function up to the first grade node to be 0.2, yields the first grade node of the water consumption of residents; the other 3 grade nodes are found in the same way, which gives the 5 grade intervals of the water consumption of residents;
Training according to a decision tree algorithm to obtain a respective water consumption prediction model of each household resident;
predicting the water consumption of residents based on a resident water prediction model;
if the predicted value of the water consumption of the residents and the actual value of the water consumption of the residents do not belong to the same water consumption grade, correcting the prediction model of the water consumption of the residents; the model for predicting the water consumption of each household resident trained according to the decision tree algorithm comprises the following steps:
the water use time period t, the average temperature temp, the weather, the season, the date d, and the water usage slope of the same time period on the previous day are each used as an attribute of the decision tree, and the splitting attribute of the current decision tree node is determined by calculating the splitting information, the information gain, and the information gain rate of each attribute; taking the average temperature as attribute $A$ for example, the calculation formulas are as follows:
$$\mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|}\log_2\frac{|D_j|}{|D|}$$
wherein attribute $A$ divides the training data set $D$ into $v$ sub-data sets, $|D_j|$ is the number of samples in the $j$-th subset, $|D|$ is the number of samples in the data set before splitting, and $\mathrm{SplitInfo}_A(D)$ represents the splitting information of attribute $A$;
$$\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i, \qquad \mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\,\mathrm{Info}(D_j), \qquad \mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D)$$
wherein the classification labels are the water consumption grades of residents, $p_i$ is the frequency with which classification label $i$ appears in the training data set $D$, $m$ is the number of classification labels, a constant 5, namely the 5 water consumption grades of residents; $\mathrm{Info}(D)$ is the information entropy of the training sample set $D$, $\mathrm{Info}_A(D)$ is the information entropy after splitting by attribute $A$, and $\mathrm{Gain}(A)$ is the information gain after attribute $A$ splits the training sample set $D$;
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}_A(D)}$$
wherein $\mathrm{GainRatio}(A)$ is the information gain rate after attribute $A$ splits the training data set $D$; the information gain rates of the remaining attributes are obtained through the same calculation;
Selecting the attribute with the maximum information gain rate as the splitting attribute of the node of the current decision tree, and gradually selecting the splitting attribute of the node of the next level of the decision tree by a recursive method until the construction of a decision tree model is completed, namely, generating an initial residential water consumption decision tree prediction model;
solving the overfitting problem of the decision tree algorithm by the PEP pruning method: first, the misjudgment rate of the subtree to be pruned is obtained, and the specific calculation formula is as follows:
$$e_T = \frac{\sum_{i=1}^{L} E_i + \alpha L}{\sum_{i=1}^{L} N_i}$$
wherein $\alpha$ is an adjustable parameter serving as a penalty factor, $L$ is the number of leaf nodes of the subtree, $E_i$ is the number of misjudged samples at leaf node $i$ of the subtree, and $N_i$ is the number of samples at leaf node $i$;
the mean value $\bar{E}_T$ and the standard deviation $\sigma_T$ of the number of misjudgments of the subtree are as follows:
$$\bar{E}_T = N e_T, \qquad \sigma_T = \sqrt{N e_T (1 - e_T)}, \qquad N = \sum_{i=1}^{L} N_i$$
when the subtree is replaced by a leaf node, that is, all the leaf nodes of the subtree are merged into the current node and the current node is taken as a leaf node, the misjudgment rate of the leaf node is as follows:
$$e_t = \frac{E_t + \alpha}{N}$$
wherein $E_t$ is the number of samples misjudged by the merged leaf node;
the mean value of the number of misjudgments of the leaf node is as follows:
$$\bar{E}_t = N e_t$$
the pruning conditions are as follows:
and when the pruning condition is met, executing pruning operation, combining all leaf nodes of the subtrees as leaf nodes to replace the subtrees, and performing pruning calculation to obtain a complete residential water consumption prediction model.
2. The big data based secondary water supply amount prediction method according to claim 1, wherein the predicting the residential water usage amount based on the residential water usage prediction model comprises:
the water use time period, the average temperature, the weather, the season, the date, and the water usage slope of the same time period on the previous day are input into the resident water prediction model as input parameters, and the water consumption of each household in each time period of one day is predicted;
accumulating the water consumption predicted values of all residents in all time periods;
and outputting the daily water consumption predicted value of the secondary water supply water point.
3. The big data-based secondary water supply prediction method according to claim 1, wherein if the predicted residential water consumption value and the actual residential water consumption value do not belong to the same water consumption class, the modifying the residential water prediction model specifically comprises:
obtaining the actual water consumption of a single household over one day, arranged as a matrix with one entry for each of the 4 time periods of the day, each entry being the actual water consumption of the residents in that time period; the predicted water consumption of the residents and the actual water consumption of the residents are each marked with water consumption grades, and the grades of corresponding time periods are compared; when the predicted value and the actual value belong to the same water consumption grade in every corresponding time period, the resident water prediction model reaches the use standard;
and if the use standard is not met, re-establishing the residential water prediction model.
4. A computer device comprising a processor and a memory for storing processor-executable programs, the computer device performing the method of any of claims 1 to 3 when the processor executes the programs stored in the memory.
5. A storage medium, characterized by storing a program which, when executed by a processor, performs the method of any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211373554.7A CN115423224B (en) | 2022-11-04 | 2022-11-04 | Secondary water supply amount prediction method and device based on big data and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211373554.7A CN115423224B (en) | 2022-11-04 | 2022-11-04 | Secondary water supply amount prediction method and device based on big data and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115423224A CN115423224A (en) | 2022-12-02 |
CN115423224B (en) | 2023-04-18 |
Family
ID=84208181
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211373554.7A Active CN115423224B (en) | 2022-11-04 | 2022-11-04 | Secondary water supply amount prediction method and device based on big data and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115423224B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116432863A (en) * | 2023-05-18 | 2023-07-14 | 安徽舜禹水务股份有限公司 | Integral peak-shifting scheduling method for secondary water supply based on mathematical programming |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112036668A (en) * | 2020-09-30 | 2020-12-04 | 北京百度网讯科技有限公司 | Water consumption prediction method, device, electronic equipment and computer readable medium |
CN114792169A (en) * | 2022-05-13 | 2022-07-26 | 遥相科技发展(北京)有限公司 | Residential water consumption prediction method based on MIC-XGboost algorithm |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110895969B (en) * | 2018-09-13 | 2023-12-15 | 大连大学 | Atrial fibrillation prediction decision tree and pruning method thereof |
CN110674985A (en) * | 2019-09-20 | 2020-01-10 | 北京建筑大学 | Urban resident domestic water consumption prediction method and application thereof |
CN114462550A (en) * | 2022-02-28 | 2022-05-10 | 天津大学 | Method for predicting water consumption of water supply network users by using random forest and probability density function |
Also Published As
Publication number | Publication date |
---|---|
CN115423224A (en) | 2022-12-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112765808B (en) | Ecological drought monitoring and evaluating method | |
Mujumdar et al. | Real‐time reservoir operation for irrigation | |
CN115423224B (en) | Secondary water supply amount prediction method and device based on big data and storage medium | |
CN105005204A (en) | Intelligent engine system capable of automatically triggering intelligent home and intelligent life scenes and method | |
CN110658725B (en) | Energy supervision and prediction system and method based on artificial intelligence | |
CN115411730B (en) | Air conditioner load multi-period adjustable potential evaluation method and related device | |
CN112470888A (en) | Automatic watering method and system for smart community | |
CN108452597A (en) | Screen replacing based reminding method and system, the air filter unit of air filter unit | |
CN116562583A (en) | Multidimensional water resource supply and demand prediction method and system | |
CN109299853B (en) | Reservoir dispatching function extraction method based on joint probability distribution | |
CN114092776A (en) | Multi-sensor data fusion method applied to intelligent agriculture | |
CN115879750B (en) | Aquatic seedling environment monitoring management system and method | |
CN117267905A (en) | Air conditioner control method and device, air conditioner and storage medium | |
Sarmas et al. | Baseline energy modeling for improved measurement and verification through the use of ensemble artificial intelligence models | |
CN116298576A (en) | Time-segmentation-considered extensible non-invasive load monitoring method | |
CN115587661A (en) | Livestock and poultry farm air quality optimization system and method based on field boundary indexes | |
CN108629362A (en) | A kind of learning behavior custom discovery quantization system and method towards mobile environment | |
CN114330136A (en) | Water meter based water living condition monitoring method, system, device and storage medium | |
Khan et al. | Irrigation water requirement prediction through various data mining techniques applied on a carefully pre-processed dataset | |
Rastog et al. | Crop and Yield Prediction Through Machine Learning Techniques to Maximize Production: 21st Century Sustainable Approach for Smart Cities 5.0 | |
CN117371591A (en) | Power consumer level cooling load identification method | |
CN109963262A (en) | Wireless sensor method for optimizing scheduling in a kind of wireless sensor network | |
CN113505913B (en) | Reservoir optimal scheduling decision method and device for stability of aquatic community system | |
CN117933946B (en) | Rural business management method based on big data | |
Liu et al. | The research of precision irrigation decision support system based on genetic algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||