CN115423224B - Secondary water supply amount prediction method and device based on big data and storage medium - Google Patents

Secondary water supply amount prediction method and device based on big data and storage medium

Info

Publication number
CN115423224B
Authority
CN
China
Prior art keywords
water consumption
water
residents
resident
time period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211373554.7A
Other languages
Chinese (zh)
Other versions
CN115423224A (en
Inventor
潘力群
朱少华
姜春涛
李有毅
邹振东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan Electronic Government Science And Technology Co ltd
Original Assignee
Foshan Electronic Government Science And Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan Electronic Government Science And Technology Co ltd
Priority to CN202211373554.7A
Publication of CN115423224A
Application granted
Publication of CN115423224B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a big data-based secondary water supply amount prediction method, equipment and storage medium, belonging to the technical field of big data prediction algorithms. The method comprises: establishing a single-household resident water consumption prediction model, which specifically comprises acquiring single-household water consumption data to generate a data set, dividing the water consumption of all residents into 5 grades according to a normal distribution and grade-marking all the acquired water consumption data, and training with a decision tree algorithm to obtain the single-household resident water consumption prediction model; predicting the water consumption of each single household based on the model, and calculating the predicted water consumption of the secondary water supply water consumption point from the single-household water consumption; and, if the predicted and actual water consumption of a single household do not belong to the same water consumption grade, correcting the single-household resident water consumption prediction model. The method addresses two problems of existing secondary water supply amount prediction models: overly simple model input parameters and the lack of a feedback optimization mechanism.

Description

Secondary water supply amount prediction method and device based on big data and storage medium
Technical Field
The invention relates to the technical field of big data prediction algorithms, in particular to a big data-based secondary water supply amount prediction method, equipment and a storage medium.
Background
Secondary water supply is a form of water supply in which a unit or an individual stores and pressurizes water from the urban public supply or from self-built facilities and delivers it through pipelines to users or for its own use. It is established mainly to compensate for insufficient pressure in the municipal water supply network and to better guarantee water for residents living on higher floors. The water quality, water pressure and supply safety of secondary water supply are closely tied to the normal, stable life of the people.
There are 2 main forms of secondary water supply. The first is secondary water supply without an underground water tank and booster pump, for example a roof water tank, which is suitable for low-rise buildings. The second is secondary water supply with an underground water tank and booster pump, in which the pressurized water passes through a roof water tank, an air pressure tank and a variable-frequency speed-regulating water pump, that is, a combined "water tank, water pump, water tank" supply mode; this form is currently the more widely used.
Compared with the raw water supply, secondary water supply is more prone to water quality contamination. During storage in a pool or water tank, poor management, overly long storage time and similar causes raise the turbidity of the stored water and the content of substances harmful to the human body, such as bacteria, Escherichia coli, iron, manganese, carbon tetrachloride and nitrite. The amount of secondary water supplied therefore needs to be strictly controlled: it must not be so low that water pressure becomes insufficient, nor so high that long storage time increases the water age and pollutes the water. At the same time, hygiene management and supervision must be strengthened, with regular cleaning and disinfection, to ensure the safety of residents' drinking water.
Disclosure of Invention
The invention aims to provide a big data-based secondary water supply amount prediction method, equipment and storage medium that solve two problems of existing techniques for predicting the secondary water supply demand of a water consumption point, namely overly simple model input parameters and the lack of a feedback optimization mechanism, so that residents' water consumption is predicted more accurately and at a finer granularity.
The invention provides a secondary water supply amount prediction method based on big data, which comprises the following steps:
establishing a single-family resident water consumption prediction model; the method specifically comprises the following steps:
acquiring water consumption data of single-family residents to generate a data set;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on all the acquired resident water consumption data;
training according to a decision tree algorithm to obtain a single-family resident water consumption prediction model;
predicting the water consumption of single-user residents based on the single-user resident water consumption prediction model, and calculating according to the water consumption of the single-user residents to obtain the water consumption of all residents, namely the water consumption prediction value of the secondary water supply water consumption point;
and if the predicted water consumption value of the single-user residents and the actual water consumption value of the single-user residents do not belong to the same water consumption grade, correcting the single-user resident water consumption prediction model.
Preferably, acquiring the water consumption data of single-household residents to generate the data set specifically includes:
the data to be collected include: the time period $t$, the average temperature $temp$, the weather, the season, the date $d$, the water-use slope $k$ of the same time period on the previous day, and the water consumption $w_t$ in the current time period. A day is divided into 4 time periods of 6 hours each, namely (00:00, 06:00], (06:00, 12:00], (12:00, 18:00] and (18:00, 24:00];
the water consumption of each single household in each time period is calculated by the following formulas:
$$k = \frac{w_t - w_{t-1}}{\Delta t}$$
$$w_t = m_t - m_{t-1}$$
wherein $k$ is the water-use slope of time period $t$ on the previous day, $w_t$ is the water consumption in time period $t$, $w_{t-1}$ is the water consumption in the time period preceding $t$, $\Delta t$ is the time difference of a time period, $m_t$ is the water meter reading of the single household at the last time node of time period $t$, $t-1$ is the time period preceding $t$, and $m_{t-1}$ is the water meter reading of the single household at the last time node of that preceding time period.
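As a minimal sketch of the two calculations above, the following Python computes per-period consumption as the difference between consecutive end-of-period meter readings, and the slope as the change in consumption relative to the preceding period divided by the period length. The function names, the sample readings and the use of hours as the time unit are illustrative assumptions, not part of the described method.

```python
from typing import List

PERIOD_HOURS = 6.0  # assumption: slope is expressed per hour of a 6-hour period


def period_consumption(readings: List[float]) -> List[float]:
    """Consumption per period: difference of consecutive end-of-period meter readings."""
    return [curr - prev for prev, curr in zip(readings, readings[1:])]


def usage_slopes(consumption: List[float], dt: float = PERIOD_HOURS) -> List[float]:
    """Slope per period: change in consumption versus the preceding period, divided by dt."""
    return [(curr - prev) / dt for prev, curr in zip(consumption, consumption[1:])]


# Hypothetical cumulative meter readings (m^3) at 00:00, 06:00, 12:00, 18:00 and 24:00.
readings = [100.0, 100.5, 102.0, 103.0, 105.5]
consumption = period_consumption(readings)  # one value per 6-hour period
slopes = usage_slopes(consumption)          # one value per period after the first
```

The first period of a day has no slope in this sketch unless the last reading of the preceding day is prepended to `readings`.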
Preferably, dividing the water consumption of all residents into 5 grades according to a normal distribution and grade-marking all the acquired water consumption data comprises:
dividing the water consumption of all residents into 5 intervals, which correspond to the 5 water consumption grades, the probability of each grade being 0.2; the mean, the variance and the probability density function of the normal distribution are:
$$\mu = \frac{1}{N}\sum_{i=1}^{N} w_i$$
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(w_i - \mu\right)^2$$
$$f(w) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(w-\mu)^2}{2\sigma^2}}$$
wherein $\mu$ is the mean water consumption of all residents, $N$ is the total number of samples in the data set D, $\sigma^2$ is the variance of the water consumption of all residents, $w_i$ is the water consumption of sample $i$, $e$ is the base of the natural logarithm, and $f(w)$ is the probability density function of the water consumption of all residents, wherein $w$ is the water consumption of a sample.
The probability of each water consumption grade is 0.2, that is, the integral of the probability density function over each grade interval is 0.2; the integral is calculated as:
$$P(a < w \le b) = \int_{a}^{b} f(w)\,dw$$
wherein $a$ and $b$ represent the lower and upper limits of the interval, and $P(a < w \le b)$ indicates the probability of the interval $(a, b]$;
setting the integral over the first grade interval equal to 0.2, that is, the integral of the probability density function over $(-\infty, x_1]$ equal to 0.2, gives the first grade node $x_1$ of the water consumption of all residents; proceeding in the same way gives the grade nodes $x_2$, $x_3$ and $x_4$, and hence the 5 grade intervals of the water consumption of all residents, which are respectively: $(-\infty, x_1]$, $(x_1, x_2]$, $(x_2, x_3]$, $(x_3, x_4]$ and $(x_4, +\infty)$.
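The equal-probability grading can be sketched in Python as follows. Rather than solving the integral equations numerically, the sketch uses the inverse cumulative distribution function of a fitted normal distribution at probabilities 0.2, 0.4, 0.6 and 0.8, which yields the same grade nodes. The helper names, the sample values and the use of the population standard deviation are assumptions made for illustration.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev
from typing import List, Sequence


def grade_nodes(samples: Sequence[float]) -> List[float]:
    """Return the 4 nodes that split a fitted normal distribution into 5 equal-probability intervals."""
    mu = fmean(samples)
    sigma = pstdev(samples, mu)  # population standard deviation (assumed)
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf(p) for p in (0.2, 0.4, 0.6, 0.8)]


def grade(w: float, nodes: Sequence[float]) -> int:
    """Map a water consumption value to a grade 1..5; each interval is closed on the right."""
    return bisect_left(nodes, w) + 1


# Hypothetical per-period water consumption samples (m^3).
samples = [1.0, 2.0, 3.0, 4.0, 5.0]
nodes = grade_nodes(samples)
grades = [grade(w, nodes) for w in samples]
```

Because the grading is based on the fitted distribution rather than on empirical quantiles, the 5 grades hold equal numbers of samples only when the data are close to normal.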
Preferably, training according to the decision tree algorithm to obtain the single-household resident water consumption prediction model comprises:
taking the time period $t$, the average temperature $temp$, the weather, the season, the date $d$ and the water-use slope of the same time period on the previous day as the attributes of the decision tree, and determining the splitting attribute of the current decision tree node according to the splitting information, the information gain and the information gain rate of each attribute, which are calculated as follows:
$$\mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v}\frac{|D_j|}{|D|}\log_2\frac{|D_j|}{|D|}$$
wherein $v$ is the number of sub-data sets into which attribute $A$ divides the training data set $D$, $|D_j|$ is the number of samples in the $j$-th sub-data set, $|D|$ is the number of samples in the training data set $D$ before division, and $\mathrm{SplitInfo}_A(D)$ represents the splitting information of attribute $A$;
$$\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i$$
$$\mathrm{Info}_A(D) = \sum_{j=1}^{v}\frac{|D_j|}{|D|}\,\mathrm{Info}(D_j)$$
$$\mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D)$$
wherein the classification label $c_i$ is a water consumption grade of all residents, $p_i$ is the probability that classification label $c_i$ occurs in the training data set $D$, and $m$ is the number of classification labels, a constant equal to 5, namely the 5 water consumption grades; $\mathrm{Info}(D)$ is the information entropy of the training data set $D$, $\mathrm{Info}_A(D)$ is the information entropy after splitting by attribute $A$, and $\mathrm{Gain}(A)$ is the information gain after attribute $A$ splits the training data set $D$;
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}_A(D)}$$
wherein $\mathrm{GainRatio}(A)$ is the information gain rate after attribute $A$ splits the training data set $D$; attribute $A$ comprises the attribute $temp$, the attribute weather, the attribute season, the attribute $d$ and the previous-day water-use slope attribute, and the information gain rate of each of these attributes is obtained in turn through the above calculation;
selecting the attribute with the maximum information gain rate as the splitting attribute of the node of the current decision tree, and gradually selecting the node splitting attribute of the next level of the decision tree by a recursive method until the construction of a decision tree model is completed, namely generating an initial single-family resident water prediction model;
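The attribute selection step can be sketched in Python as below. The sketch handles categorical attributes only; continuous attributes such as the average temperature or the water-use slope would first have to be discretized, which is an assumption not spelled out in the text. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def entropy(labels: Sequence[int]) -> float:
    """Information entropy of a list of grade labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(rows: List[Row], labels: List[int], attr: str) -> float:
    """Information gain rate obtained by splitting on one categorical attribute."""
    n = len(rows)
    subsets: Dict[Hashable, List[int]] = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    info_after = sum(len(s) / n * entropy(s) for s in subsets.values())
    split_info = -sum(len(s) / n * log2(len(s) / n) for s in subsets.values())
    gain = entropy(labels) - info_after
    # An attribute with a single value cannot split the data; treat its rate as 0.
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows: List[Row], labels: List[int], attrs: List[str]) -> str:
    """Pick the attribute with the largest information gain rate."""
    return max(attrs, key=lambda a: gain_ratio(rows, labels, a))


# Toy data: the grade follows the season and is independent of the weather.
rows = [
    {"season": "summer", "weather": "sunny"},
    {"season": "summer", "weather": "rain"},
    {"season": "winter", "weather": "sunny"},
    {"season": "winter", "weather": "rain"},
]
labels = [5, 5, 1, 1]
chosen = best_attribute(rows, labels, ["season", "weather"])
```

Building the full tree would apply `best_attribute` recursively to each subset until a subset is pure or no attributes remain.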
the overfitting problem of the decision tree algorithm is solved by the PEP (pessimistic error pruning) method. First, the misjudgment rate $e_T$ of the subtree to be pruned is obtained; the specific calculation formula is:
$$e_T = \frac{\sum_{i=1}^{L} E_i + \alpha L}{\sum_{i=1}^{L} N_i}$$
wherein $\alpha$ is a penalty factor and an adjustable parameter, $i$ is a leaf node of the subtree, $L$ is the number of leaf nodes of the subtree, $E_i$ is the number of misjudged samples of leaf node $i$, and $N_i$ is the number of samples of leaf node $i$;
the mean $\bar{E}_T$ and the standard deviation $\sigma_T$ of the number of misjudgments of the subtree are:
$$\bar{E}_T = N e_T$$
$$\sigma_T = \sqrt{N e_T \left(1 - e_T\right)}$$
where $N = \sum_{i=1}^{L} N_i$ is the number of samples covered by the subtree;
assuming that the subtree is replaced by a leaf node, that is, the root node of the subtree is used as a leaf node and the remaining nodes of the subtree are deleted, the misjudgment rate $e_t$ of the leaf node is:
$$e_t = \frac{E_t + \alpha}{N}$$
where $E_t$ is the number of samples misjudged by that leaf node; the mean $\bar{E}_t$ of the number of misjudgments of the leaf node is:
$$\bar{E}_t = N e_t$$
the pruning condition is as follows:
[formula image not recovered: the condition compares the mean $\bar{E}_T$ and standard deviation $\sigma_T$ of the subtree with the mean $\bar{E}_t$ of the leaf node]
and when the pruning condition is met, the pruning operation is executed: the root node of the subtree is taken as a leaf node and the other nodes of the subtree are deleted; after the pruning calculation, the complete single-household resident water consumption prediction model is obtained.
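A minimal Python sketch of the pruning check follows. The exact inequality used as the pruning condition is not legible in this text, so the sketch assumes the textbook pessimistic error pruning rule: prune when the corrected error count of the collapsed leaf does not exceed the corrected error count of the subtree plus one standard deviation. The default penalty factor of 0.5, the function names and the sample counts are likewise assumptions.

```python
from math import sqrt
from typing import List, Tuple

Leaf = Tuple[int, int]  # (misjudged samples, total samples) for one leaf of the subtree


def subtree_error_stats(leaves: List[Leaf], alpha: float = 0.5) -> Tuple[float, float, float]:
    """Return (misjudgment rate, mean misjudgment count, standard deviation) of a subtree."""
    total = sum(n for _, n in leaves)
    rate = (sum(e for e, _ in leaves) + alpha * len(leaves)) / total
    mean = total * rate
    std = sqrt(total * rate * (1.0 - rate))  # binomial standard deviation
    return rate, mean, std


def should_prune(leaves: List[Leaf], root_misjudged: int, alpha: float = 0.5) -> bool:
    """Assumed textbook rule: prune if the leaf's mean error <= subtree mean error + one std."""
    total = sum(n for _, n in leaves)
    _, mean_subtree, std_subtree = subtree_error_stats(leaves, alpha)
    leaf_rate = (root_misjudged + alpha) / total
    mean_leaf = total * leaf_rate
    return mean_leaf <= mean_subtree + std_subtree


# Hypothetical subtree with 3 leaves covering 30 samples and 4 misjudged samples.
leaves = [(1, 10), (2, 10), (1, 10)]
prune_small_loss = should_prune(leaves, root_misjudged=7)   # collapsing costs little
prune_large_loss = should_prune(leaves, root_misjudged=9)   # collapsing costs too much
```

If the intended condition uses a different inequality, only the final comparison in `should_prune` would change.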
Preferably, predicting the water consumption of each single household based on the single-household resident water consumption prediction model, and calculating from it the water consumption of all residents, that is, the predicted water consumption of the secondary water supply water consumption point, comprises:
inputting the time period $t$, the average temperature $temp$, the weather, the season, the date $d$ and the water-use slope of the same time period on the previous day into the single-household resident water consumption prediction model as input parameters, and predicting the water consumption of a single household in each of the 4 time periods of one day;
and accumulating the water consumption of all residents in all time periods to obtain the water consumption of all residents, that is, the predicted daily water consumption of the secondary water supply water consumption point.
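The accumulation step can be illustrated with a short Python sketch. The predictor is passed in as a callable so that the example stays independent of how the trained model is stored; `stub_predict`, its return values and the feature dictionaries are hypothetical stand-ins for the trained model and its inputs.

```python
from typing import Callable, Dict, List

Features = Dict[str, object]


def predict_daily_supply(
    households: List[List[Features]],
    predict: Callable[[Features], float],
) -> float:
    """Sum the predicted consumption of every household over its 4 daily time periods."""
    return sum(predict(features) for periods in households for features in periods)


def stub_predict(features: Features) -> float:
    """Hypothetical stand-in for the trained model: fixed consumption (m^3) per time period."""
    return {1: 0.25, 2: 0.5, 3: 0.75, 4: 1.0}[features["t"]]


# Two hypothetical households, each with one feature record per time period.
one_day = [{"t": t, "temp": 28.0, "weather": "sunny", "season": "summer"} for t in (1, 2, 3, 4)]
households = [one_day, one_day]
daily_total = predict_daily_supply(households, stub_predict)
```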
Preferably, if the predicted and actual water consumption of a single household do not belong to the same water consumption grade, correcting the single-household resident water consumption prediction model comprises:
obtaining the actual water consumption $W_{real}$ of a single household in one day, the matrix of actual single-household water consumption values in one day being:
$$W_{real} = \left[\, w_{t_1},\; w_{t_2},\; w_{t_3},\; w_{t_4} \,\right]$$
wherein $w_{t_1}$, $w_{t_2}$, $w_{t_3}$ and $w_{t_4}$ are the actual water consumption values of the single household in time periods $t_1$, $t_2$, $t_3$ and $t_4$, respectively; the predicted water consumption $W_{pred}$ of the single household in one day and the actual water consumption $W_{real}$ are each marked with water consumption grades, and the grades of the predicted and actual values are compared for each corresponding time period; when the predicted and actual values of all corresponding time periods belong to the same water consumption grade, the single-household resident water consumption prediction model meets the use standard;
and if the use standard is not met, the step of establishing the single-household resident water consumption prediction model is executed again.
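The grade comparison that triggers retraining can be sketched as follows. The grade nodes, the consumption values and the function names are hypothetical; the only logic taken from the text is that the model fails the use standard as soon as the predicted and actual values of any time period fall into different grades.

```python
from bisect import bisect_left
from typing import Sequence


def grade(w: float, nodes: Sequence[float]) -> int:
    """Map a water consumption value to a grade 1..5 using 4 ascending grade nodes."""
    return bisect_left(nodes, w) + 1


def needs_retraining(predicted: Sequence[float], actual: Sequence[float], nodes: Sequence[float]) -> bool:
    """True if any time period has predicted and actual values in different grades."""
    return any(grade(p, nodes) != grade(a, nodes) for p, a in zip(predicted, actual))


# Hypothetical grade nodes and one day of values for the 4 time periods (m^3).
nodes = [0.2, 0.4, 0.6, 0.8]
predicted = [0.10, 0.50, 0.50, 0.90]
actual_ok = [0.15, 0.45, 0.55, 0.85]   # same grade in every period
actual_bad = [0.15, 0.65, 0.55, 0.85]  # second period falls into a different grade
meets_standard = not needs_retraining(predicted, actual_ok, nodes)
retrain = needs_retraining(predicted, actual_bad, nodes)
```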
The second purpose of the invention can be achieved by adopting the following technical scheme:
a computer device comprising a processor and a memory for storing a processor executable program, the processor implementing the above-described big data based secondary water supply amount prediction method when executing the program stored in the memory.
The third purpose of the invention can be achieved by adopting the following technical scheme:
a storage medium storing a program which, when executed by a processor, implements the above-described secondary water supply amount prediction method based on big data.
Compared with the prior art, which predicts the daily water consumption of a secondary water supply water consumption point as a whole, the method predicts the water consumption of each household from the lowest level, so the stored data are more detailed and more complex scenarios can be handled. Compared with the prior art, which trains the water-use prediction model only once, the method adopts a feedback optimization mechanism based on random spot checks to periodically verify the accuracy and robustness of the model, and feeds the check results back to the secondary water supply control center so that the existing water-use prediction model is iteratively optimized and updated.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
FIG. 1 is a flow chart of a big data based secondary water supply prediction method according to the present invention;
FIG. 2 is a schematic diagram of a secondary water supply system according to the present invention;
FIG. 3 is a diagram illustrating a four-dimensional structure of a data set D according to a big data-based method for predicting secondary water supply according to the present invention;
FIG. 4 is a flow chart of establishing the single-household resident water consumption prediction model in the big data-based secondary water supply prediction method according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that all directional indicators (such as up, down, left, right, front, back ...) in the embodiments of the present invention are only used to explain the relative positional relationship between the components, the motion situation, etc. in a specific posture (as shown in the attached drawings), and if the specific posture is changed, the directional indicator is changed accordingly.
In addition, the descriptions related to "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
The invention discloses a secondary water supply amount prediction method based on big data, which comprises the following steps with reference to figures 1-4:
step 100, establishing a single-household resident water consumption prediction model; the method specifically comprises the following steps:
step 101, acquiring water consumption data of single household to generate a data set;
step 102, dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on all the acquired resident water consumption data;
step 103, training according to a decision tree algorithm to obtain a single-family resident water consumption prediction model;
step 200, predicting the water consumption of single-user residents based on a single-user resident water prediction model, and calculating according to the water consumption of the single-user residents to obtain the total water consumption of the residents, namely the water consumption prediction value of a secondary water supply water consumption point;
and step 300, if the predicted water consumption value of the single-user residents and the actual water consumption value of the single-user residents do not belong to the same water consumption grade, correcting the single-user resident water consumption prediction model.
Compared with the prior art, which predicts the daily water consumption of a secondary water supply water consumption point as a whole, the method predicts the water consumption of each household from the lowest level, so the stored data are more detailed and more complex scenarios can be handled. Compared with the prior art, which trains the water-use prediction model only once, the method adopts a feedback optimization mechanism based on random spot checks to periodically verify the accuracy and robustness of the model, and feeds the check results back to the secondary water supply control center so that the existing water-use prediction model is iteratively optimized and updated. When the predicted water consumption is small, the water supply amount is reduced, which further reduces users' water consumption; if that reduced consumption is then used as an index, the prediction model can fall into a cycle of continuously decreasing water supply. The method avoids this vicious cycle by using the local minimum average water consumption as the minimum water level threshold.
The secondary water supply system structure of the invention is shown in figure 2, which comprises 4 nodes of a secondary water supply point, a secondary water supply water consumption control center, a secondary water supply water consumption point and a water using resident, wherein the secondary water supply point and the secondary water supply water consumption control center are positioned at the same physical position; the secondary water supply point is mainly responsible for water supply tasks and can adjust the water supply amount through an automatic control valve; the secondary water supply control center is used as an intelligent brain of the water supply system, is mainly responsible for the functions of information interaction, data storage and algorithm calculation, and intelligently controls the automatic control valve to adjust the water supply amount; the secondary water supply water consumption point provides secondary water supply for residents using water, such as communities, gardens and the like, and water level data can be sent to a secondary water supply water consumption control center through a water level sensor; the water consumption residents automatically recognize the reading of the water meter through the water meter reading robot and send the reading data to the secondary water supply water control center.
Step 101, acquiring single-household resident water consumption data to generate a data set, specifically comprising:
the data to be collected includes: the time period t, the average temperature temp, the weather, the season, the date d, the water-use slope of the same time period on the previous day, and the water consumption in the current time period. Each day is divided into 4 time periods of 6 hours each, namely (00:00,06:00], (06:00,12:00], (12:00,18:00] and (18:00,24:00];
the water-use slope and the water consumption of a single household in each time period are calculated by formula. [The formula image for the water-use slope is not recoverable from the source text.] The water consumption of a time period follows from consecutive water meter readings:

$$Q_t = R_t - R_{t-1}$$

wherein $k_t$ is the water-use slope value of time period $t$ on the previous day, $Q_t$ is the water consumption of time period $t$, $Q'_t$ is the water consumption of time period $t$ on the previous day, $\Delta t$ is the time difference of a time period, $R_t$ is the water meter reading of the single household at the last time node of time period $t$, $t-1$ is the time period preceding time period $t$, and $R_{t-1}$ is the water meter reading of the single household at the last time node of that preceding time period.
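As an illustration only, the per-period quantities described above can be sketched in Python. The function names are hypothetical. The consumption is taken as the difference between consecutive meter readings, as the definitions state; the slope is assumed here to be the previous day's consumption in the same period divided by the period length, which is one plausible reading and is not confirmed by the source text.

```python
from typing import List


def period_consumption(readings: List[float]) -> List[float]:
    """Water used in each period.

    readings[0] is the meter reading at the start of the first period and
    readings[i] is the reading at the last time node of period i, so each
    period's consumption is the difference of two consecutive readings.
    """
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


def usage_slope(prev_day_consumption: float, period_hours: float = 6.0) -> float:
    """ASSUMPTION: slope = previous-day consumption / period length (6 h)."""
    return prev_day_consumption / period_hours


# Readings at 00:00, 06:00, 12:00, 18:00 and 24:00 (illustrative values).
daily = period_consumption([100.0, 100.5, 101.5, 102.0, 103.0])
```

With five readings per day the function returns the four per-period consumptions that form one row of the data set.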
After one week of raw data acquisition and accumulation, a data set D is obtained. FIG. 3 is an example diagram of the four-dimensional structure of data set D, wherein $t_1$, $t_2$, $t_3$, $t_4$ respectively represent the time periods (00:00,06:00], (06:00,12:00], (12:00,18:00], (18:00,24:00]; for time period $t_1$, $temp_{t_1}$ is the average temperature, $weather_{t_1}$ is the weather condition, $season_{t_1}$ is the season, $d_{t_1}$ is the date, $k_{t_1}$ is the water-use slope of the same time period on the previous day, and $Q_{t_1}$ is the water consumption of the single household. The parameters of a single household for a single time period form a one-dimensional matrix; the parameters of a single household for the 4 time periods of a day form a two-dimensional matrix; the parameters of all users in an area for the 4 time periods of a day form a three-dimensional matrix; and the parameters of all users in the area for the 4 time periods of every day in one collection cycle form a four-dimensional matrix. The time period, average temperature, weather, season, date and water-use slope of the same time period on the previous day are used as inputs, and the residents' water consumption in the current time period is used as the prediction target, to train the secondary water supply water consumption prediction model.
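The nesting described above (period, household, day) can be sketched with plain Python containers. This is a minimal sketch under assumptions: the `Sample` class, its field names and the `training_rows` helper are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """One household, one time period: the innermost (1-D) parameter vector."""
    period: int        # 1..4, i.e. t1..t4
    temp: float        # average temperature
    weather: str
    season: str
    date: str
    prev_slope: float  # water-use slope, same period on the previous day
    usage: float       # water consumption in this period (prediction target)


# dataset[day][household][period]; each Sample is the parameter axis,
# which gives the four-dimensional structure.
Dataset = List[List[List[Sample]]]


def training_rows(dataset: Dataset) -> List[Tuple[tuple, float]]:
    """Flatten the nested data set into (input features, target) pairs."""
    rows = []
    for day in dataset:
        for household in day:
            for s in household:
                features = (s.period, s.temp, s.weather, s.season, s.date, s.prev_slope)
                rows.append((features, s.usage))
    return rows
```

Flattening keeps the six inputs separate from the target so that the rows can be passed to any classifier after the target has been converted to a grade.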
Step 102, dividing all the resident water consumption into 5 grades according to normal distribution, and performing grade marking on all the acquired resident water consumption data, which specifically comprises the following steps:
the normal distribution is a very common continuous probability distribution. When the object under study has a reference value and fluctuates around it within a certain amplitude, so that values are dense in the middle and sparse on both sides, the object follows a normal distribution. Residents' domestic water use stays essentially at a stable reference level, and differences arise from factors such as time spent at home and dietary habits, so it clearly exhibits the characteristics of a normal distribution.
All the resident water consumption values are divided into 5 intervals, corresponding to 5 grades of resident water consumption, with the probability of each grade being 0.2. The mean, the variance and the probability density function of the normal distribution are:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

wherein $\mu$ is the mean water consumption of all residents, $N$ is the total number of samples in data set D, $\sigma^2$ is the variance of the water consumption of all residents, $x_i$ is the water consumption of sample $i$, $e$ is the base of the natural logarithm, $f(x)$ is the probability density function of the water consumption of all residents, and $x$ is the water consumption of a sample.
The probability of each grade of resident water consumption is 0.2, that is, the integral of the probability density function of resident water consumption over each grade interval is 0.2. The integral is calculated as:

$$P(a < x \le b) = \int_a^b f(x)\,dx$$

wherein $a$ and $b$ are the lower and upper limits of the interval, and $P(a < x \le b)$ is the probability of the interval $(a, b]$;
setting the integral over the first grade interval equal to 0.2, i.e. $\int_{-\infty}^{x_1} f(x)\,dx = 0.2$, gives the first grade node $x_1$ of resident water consumption; the remaining grade nodes $x_2$, $x_3$, $x_4$ are found in the same way, which yields the 5 grade intervals of resident water consumption: $(-\infty, x_1]$, $(x_1, x_2]$, $(x_2, x_3]$, $(x_3, x_4]$, $(x_4, +\infty)$.
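The equal-probability grading described above can be sketched with the Python standard library. This is an illustration under assumptions: the function names are hypothetical, the population standard deviation (division by N) is used, and the outermost intervals are taken to extend to minus and plus infinity so that the five probabilities sum to 1.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev
from typing import List, Sequence


def grade_nodes(usages: Sequence[float], levels: int = 5) -> List[float]:
    """Return the levels-1 grade nodes that cut a fitted normal
    distribution into intervals of equal probability (0.2 each for 5)."""
    dist = NormalDist(mu=fmean(usages), sigma=pstdev(usages))
    return [dist.inv_cdf(k / levels) for k in range(1, levels)]


def grade_of(usage: float, nodes: Sequence[float]) -> int:
    """Grade 1..5 for the intervals (-inf, x1], (x1, x2], ..., (x4, +inf)."""
    return bisect_left(nodes, usage) + 1
```

`NormalDist.inv_cdf` solves the integral condition directly, so no numerical integration is needed; the samples must not all be identical, because the fitted standard deviation has to be positive.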
Preferably, in step 103, training according to the decision tree algorithm to obtain the single-household residential water prediction model comprises: the time period t, the average temperature temp, the weather, the season, the date d and the water-use slope of the same time period on the previous day are each used as attributes of the decision tree; the splitting attribute of the current decision tree node is determined from the splitting information, the information gain and the information gain rate of each attribute, which are calculated as follows:

$$SplitInfo_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|}\log_2\frac{|D_j|}{|D|}$$

wherein $v$ is the number of sub-data sets into which attribute $A$ divides the training data set $D$, $|D_j|$ is the number of samples in the $j$-th sub-data set, $|D|$ is the number of samples in the training data set $D$ before division, and $SplitInfo_A(D)$ represents the splitting information of attribute $A$;
$$Info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i$$

$$Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, Info(D_j)$$

$$Gain(A) = Info(D) - Info_A(D)$$

wherein $i$ is the classification label, namely the water consumption grade of all residents, $p_i$ is the frequency with which classification label $i$ appears in the training data set $D$, $m$ is the number of classification labels and is the constant 5 (the 5 water consumption grades of all residents), $D_j$ is the $j$-th of the $v$ sub-data sets produced by attribute $A$, $Info(D)$ is the information entropy of the training data set $D$, $Info_A(D)$ is the information entropy after splitting on attribute $A$, and $Gain(A)$ is the information gain after attribute $A$ splits the training data set $D$;
$$GainRatio(A) = \frac{Gain(A)}{SplitInfo_A(D)}$$

wherein $GainRatio(A)$ is the information gain rate after attribute $A$ splits the training data set $D$, that is, the information gain of $A$ divided by its splitting information. Attribute $A$ includes the average temperature temp, the weather, the season, the date d and the water-use slope of the same time period on the previous day; the above calculation is carried out for each of these attributes in turn to obtain its information gain rate;
selecting the attribute with the maximum information gain rate as the splitting attribute of the node of the current decision tree, and gradually selecting the node splitting attribute of the next level of the decision tree by a recursive method until the construction of a decision tree model is completed, namely generating an initial single-family resident water prediction model;
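The attribute selection described above can be sketched in Python as follows. This is a simplified sketch, not the patent's implementation: all function names are hypothetical, every attribute is treated as categorical (continuous attributes such as the average temperature or the slope would first need to be binned), and only the choice of one splitting attribute is shown, not the full recursion.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Information entropy of a list of grade labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Information gain rate of one attribute: gain / splitting information."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    info_after = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - info_after
    split_info = -sum(len(g) / total * log2(len(g) / total) for g in groups.values())
    # An attribute with a single value cannot split the data.
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows: Sequence[dict], labels: Sequence[Hashable]) -> str:
    """Name of the attribute with the largest information gain rate."""
    return max(rows[0].keys(), key=lambda a: gain_ratio([r[a] for r in rows], labels))
```

Dividing by the splitting information penalizes attributes with many distinct values, such as the date, which would otherwise dominate a plain information gain criterion.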
the overfitting problem of the decision tree algorithm is solved by the PEP (pessimistic error pruning) method. First, the misjudgment rate $e_T$ of the subtree to be pruned is obtained, calculated as:

$$e_T = \frac{\sum_{i=1}^{L} E_i + \alpha L}{\sum_{i=1}^{L} N_i}$$

wherein $\alpha$ is an adjustable parameter serving as a penalty factor, $i$ indexes the leaf nodes of the subtree, $L$ is the number of leaf nodes of the subtree, $E_i$ is the number of misjudged samples at leaf node $i$ of the subtree, and $N_i$ is the number of samples at leaf node $i$;
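the mean $\bar{E}_T$ and the standard deviation $\sigma_T$ of the number of misjudgments of the subtree (with $e_T$ the misjudgment rate of the subtree and $N_i$ the number of samples at leaf node $i$ of its $L$ leaf nodes) are:

$$\bar{E}_T = e_T \sum_{i=1}^{L} N_i$$

$$\sigma_T = \sqrt{e_T\,(1 - e_T) \sum_{i=1}^{L} N_i}$$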
assuming that the subtree is replaced by a leaf node, that is, the root node of the subtree is kept as a leaf node and the remaining nodes of the subtree are deleted, the misjudgment rate $e_t$ of that leaf node is:

$$e_t = \frac{E + \alpha}{N}$$

and the mean number of misjudgments $\bar{E}_t$ of the leaf node is:

$$\bar{E}_t = e_t\,N$$

wherein $E$ is the number of samples misjudged by the leaf node, $N$ is the number of samples at the leaf node, and $\alpha$ is the penalty factor;
the pruning condition is given by a formula [formula image not recoverable from the source text];
and when the pruning condition is met, executing pruning operation, taking the root node of the subtree as a leaf node, deleting the other nodes of the subtree, and performing pruning calculation to obtain a complete single-family resident water use prediction model.
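The pruning test can be sketched in Python under stated assumptions. The function names are hypothetical, the penalty factor defaults to 0.5 as in the classic pessimistic error pruning method, the misjudgment count is modelled as binomial, and the pruning condition used here is the classic one (prune when the leaf's expected misjudgments do not exceed the subtree's expected misjudgments plus one standard deviation). The patent's own condition is given only as an image and may differ.

```python
from math import sqrt
from typing import Sequence, Tuple


def subtree_stats(leaf_errors: Sequence[int], leaf_sizes: Sequence[int],
                  penalty: float = 0.5) -> Tuple[float, float, float]:
    """Misjudgment rate, mean and standard deviation of a subtree."""
    n = sum(leaf_sizes)
    rate = (sum(leaf_errors) + penalty * len(leaf_errors)) / n
    mean = n * rate
    std = sqrt(n * rate * (1 - rate))  # binomial standard deviation
    return rate, mean, std


def leaf_mean_errors(errors_as_leaf: int, n: int, penalty: float = 0.5) -> float:
    """Expected misjudgments if the subtree is collapsed into one leaf."""
    rate = (errors_as_leaf + penalty) / n
    return n * rate


def should_prune(leaf_errors: Sequence[int], leaf_sizes: Sequence[int],
                 errors_as_leaf: int, penalty: float = 0.5) -> bool:
    """ASSUMED condition: leaf mean <= subtree mean + subtree std."""
    _, mean, std = subtree_stats(leaf_errors, leaf_sizes, penalty)
    return leaf_mean_errors(errors_as_leaf, sum(leaf_sizes), penalty) <= mean + std
```

For a subtree with three leaves of 10 samples and one misjudgment each, collapsing to a leaf with 5 misjudgments passes this test, while a leaf with 10 misjudgments does not.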
Preferably, in step 200, predicting the water consumption of single households based on the single-household residential water prediction model, and calculating the water consumption of all residents from the single-household water consumption to obtain the predicted water consumption of the secondary water supply consumption point, comprises:
step 201, the time period t, the average temperature temp, the weather, the season, the date d and the water-use slope of the same time period on the previous day are input into the single-household residential water prediction model as input parameters, and the water consumption of the single household in each of the 4 time periods of the day is predicted;
and step 202, accumulating the water consumption of all residents in all time periods to obtain the water consumption of all residents, namely the daily water consumption prediction value of the secondary water supply water consumption point.
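Step 202 amounts to a double summation, which can be sketched in Python as follows. The function name and the input layout (a mapping from a household identifier to its four per-period predictions) are assumptions made for illustration; the predictions are taken to be numeric amounts.

```python
from typing import Dict, Sequence


def daily_supply_forecast(predictions: Dict[str, Sequence[float]]) -> float:
    """Sum the predicted usage of every household over all time periods.

    predictions maps a household id to its predicted water consumption
    in each of the day's time periods.
    """
    return sum(sum(periods) for periods in predictions.values())


forecast = daily_supply_forecast({
    "household-A": [0.5, 1.0, 0.5, 1.0],
    "household-B": [0.25, 0.25, 0.25, 0.25],
})
```

The result is the predicted daily water consumption of the whole consumption point, which is the quantity the control center uses to set the water supply amount.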
Preferably, in step 300, if the predicted water consumption of a single household and its real water consumption do not belong to the same water consumption grade, correcting the single-household residential water prediction model specifically comprises:
step 301, acquiring the real water consumption of a single household over one day, the real water consumption matrix of the single household for one day being:

$$Q^{real} = [\,Q^{real}_{t_1},\ Q^{real}_{t_2},\ Q^{real}_{t_3},\ Q^{real}_{t_4}\,]$$

wherein $Q^{real}_{t_1}$, $Q^{real}_{t_2}$, $Q^{real}_{t_3}$ and $Q^{real}_{t_4}$ are the real water consumption of the single household in time periods $t_1$, $t_2$, $t_3$ and $t_4$ respectively; the predicted one-day water consumption $Q^{pred}$ of the single household and the real one-day water consumption $Q^{real}$ are marked with water consumption grades, and the grades of the predicted and real values in corresponding time periods are compared; when the predicted and real values of all corresponding time periods belong to the same water consumption grade, the single-household residential water prediction model reaches the use standard;
and step 302, if the use standard is not met, re-establishing the single-household residential water use prediction model.
The invention marks the predicted one-day water consumption of a single household and the real one-day water consumption of that household with water consumption grades, and compares whether the predicted and real values of each corresponding time period belong to the same grade. When the predicted and real values of all corresponding time periods belong to the same water consumption grade, the residential water prediction model reaches the use standard and is fixed; the water consumption of each household is then predicted at 00:00 every day, and the results are summed to obtain the total predicted water consumption of the secondary water supply point. Otherwise, data acquisition continues, the data are stored in the database, and the model-building steps are repeated until the residential water prediction model reaches the use standard.
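The grade comparison that decides whether the model reaches the use standard can be sketched in Python. The function name is hypothetical, and the grade nodes are assumed to be the four boundaries obtained from the normal-distribution grading, with grade intervals closed on the right.

```python
from bisect import bisect_left
from typing import Sequence


def meets_use_standard(predicted: Sequence[float], actual: Sequence[float],
                       nodes: Sequence[float]) -> bool:
    """True when every period's predicted and real usage share a grade.

    nodes are the ascending grade boundaries x1..x4; a value u falls in
    grade bisect_left(nodes, u) + 1.
    """
    def grade(usage: float) -> int:
        return bisect_left(nodes, usage) + 1

    return all(grade(p) == grade(a) for p, a in zip(predicted, actual))
```

If the function returns False for a sampled household, data collection continues and the model is rebuilt, which corresponds to step 302.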
Example 2:
the embodiment provides a computer device, which comprises a processor and a memory for storing an executable program of the processor, wherein when the processor executes the program stored in the memory, the establishment of the single-household residential water prediction model in the embodiment 1 is realized; the method specifically comprises the following steps: acquiring water consumption data of single-family residents to generate a data set;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on all the acquired resident water consumption data; training according to a decision tree algorithm to obtain a single-family resident water consumption prediction model; predicting the water consumption of single-household residents based on a single-household residential water prediction model, and calculating according to the water consumption of the single-household residents to obtain the water consumption of all the residents, namely the water consumption prediction value of the secondary water supply water consumption point; and if the predicted value of the water consumption of the single-user residents and the actual value of the water consumption of the single-user residents do not belong to the same water consumption grade, correcting the prediction model of the water consumption of the single-user residents.
Example 3:
the present embodiment provides a storage medium, which is a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the method of establishing the single-household residential water prediction model according to embodiment 1 is implemented; the method specifically comprises the following steps: acquiring water consumption data of single-household residents to generate a data set; dividing all the resident water consumption into 5 grades according to a normal distribution, and marking all the acquired resident water consumption data with grades; training according to a decision tree algorithm to obtain a single-household residential water prediction model; predicting the water consumption of single households based on the single-household residential water prediction model, and calculating the water consumption of all residents from the single-household water consumption, namely the predicted water consumption of the secondary water supply consumption point; and if the predicted water consumption of a single household and its real water consumption do not belong to the same water consumption grade, correcting the single-household residential water prediction model.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. A secondary water supply amount prediction method based on big data is characterized by comprising the following steps:
establishing a residential water prediction model; the method specifically comprises the following steps:
acquiring resident water consumption data to generate a data set;
the acquiring the residential water data and generating the data set specifically comprises the following steps:
the data to be collected includes: the water-use time period t, the average temperature temp, the weather, the season, the date d, the water-use slope of the same time period on the previous day, and the water consumption in the current time period; each day is divided into 4 time periods of 6 hours each, namely (00:00,06:00], (06:00,12:00], (12:00,18:00] and (18:00,24:00];
the water-use slope and the water consumption of each household in each time period are calculated by formula [the formula image for the water-use slope is not recoverable from the source text], the water consumption of a time period following from consecutive water meter readings:

$$Q_t = R_t - R_{t-1}$$

wherein $k_t$ is the water-use slope value of the same time period $t$ on the previous day, $Q_t$ is the water consumption of time period $t$, $Q'_t$ is the water consumption of time period $t$ on the previous day, $\Delta t$ is the time difference of a time period, $R_t$ is the resident water meter reading at the last time node of time period $t$, $t-1$ is the time period preceding time period $t$, and $R_{t-1}$ is the resident water meter reading at the last time node of that preceding time period;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on the obtained resident water consumption data;
dividing all the resident water consumption into 5 grades according to normal distribution, and carrying out grade marking on the existing data specifically comprises the following steps:
the range of resident water consumption is divided into 5 intervals, corresponding to 5 grades of resident water consumption, with the probability of each grade being 0.2, and the mean, the variance and the probability density function of the normal distribution are:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

wherein $\mu$ is the average water consumption of residents, $N$ is the total number of samples in the sample set D, $\sigma^2$ is the variance of the water consumption of residents, $f(x)$ is the probability density function of the water consumption of residents, and $x_i$ is the water consumption of sample $i$;
the probability of each grade of resident water consumption is 0.2, that is, the integral of the probability density function of resident water consumption over each grade interval is 0.2, the integral being calculated as:

$$P(a < x \le b) = \int_a^b f(x)\,dx$$

wherein $a$ and $b$ are the lower and upper limits of the interval, and $P(a < x \le b)$ is the probability of the interval $(a, b]$;
setting the integral over the first grade interval equal to 0.2, i.e. $\int_{-\infty}^{x_1} f(x)\,dx = 0.2$, gives the first grade node $x_1$ of resident water consumption, and the other 3 grade nodes $x_2$, $x_3$, $x_4$ are found in the same way, so that the 5 grade intervals of resident water consumption are respectively: $(-\infty, x_1]$, $(x_1, x_2]$, $(x_2, x_3]$, $(x_3, x_4]$, $(x_4, +\infty)$;
Training according to a decision tree algorithm to obtain a respective water consumption prediction model of each household resident;
predicting the water consumption of residents based on a resident water prediction model;
if the predicted value of the water consumption of the residents and the actual value of the water consumption of the residents do not belong to the same water consumption grade, correcting the prediction model of the water consumption of the residents; the model for predicting the water consumption of each household resident trained according to the decision tree algorithm comprises the following steps:
the water-use time period t, the average temperature temp, the weather, the season, the date d and the water-use slope of the same time period on the previous day are each used as attributes of the decision tree, and the splitting attribute of the current decision tree node is determined by calculating the splitting information, the information gain and the information gain rate of each attribute; taking the average temperature temp as the attribute $A$ by way of example, the calculation formula is as follows:

$$SplitInfo_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|}\log_2\frac{|D_j|}{|D|}$$

wherein $v$ is the number of sub-data sets into which attribute $A$ divides the training data set $D$, $|D_j|$ is the number of samples in the $j$-th sub-data set, $|D|$ is the number of samples in the data set $D$ before splitting, and $SplitInfo_A(D)$ represents the splitting information of attribute $A$;
$$Info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i$$

$$Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, Info(D_j)$$

$$Gain(A) = Info(D) - Info_A(D)$$

wherein $i$ is the classification label, i.e. the resident water consumption grade, $p_i$ is the frequency with which classification label $i$ appears in the training data set $D$, $m$ is the number of classification labels and is the constant 5, namely the 5 water consumption grades of residents; $D_j$ is the $j$-th of the $v$ sub-data sets produced by attribute $A$, $Info(D)$ is the information entropy of the training sample set $D$, $Info_A(D)$ is the information entropy after splitting on attribute $A$, and $Gain(A)$ is the information gain after attribute $A$ splits the training sample set $D$;
$$GainRatio(A) = \frac{Gain(A)}{SplitInfo_A(D)}$$

wherein $GainRatio(A)$ is the information gain rate after attribute $A$ splits the training data set $D$; the same calculation is carried out for each of the remaining attributes to obtain its information gain rate;
Selecting the attribute with the maximum information gain rate as the splitting attribute of the node of the current decision tree, and gradually selecting the splitting attribute of the node of the next level of the decision tree by a recursive method until the construction of a decision tree model is completed, namely, generating an initial residential water consumption decision tree prediction model;
the overfitting problem of the decision tree algorithm is solved by the PEP (pessimistic error pruning) method; first, the misjudgment rate $e_T$ of the subtree to be pruned is obtained, the specific calculation formula being:

$$e_T = \frac{\sum_{i=1}^{L} E_i + \alpha L}{\sum_{i=1}^{L} N_i}$$

wherein $\alpha$ is an adjustable parameter serving as a penalty factor, $L$ is the number of leaf nodes of the subtree, $E_i$ is the number of misjudged samples at leaf node $i$ of the subtree, and $N_i$ is the number of samples at leaf node $i$;
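the mean $\bar{E}_T$ and the standard deviation $\sigma_T$ of the number of misjudgments of the subtree (with $e_T$ the misjudgment rate of the subtree and $N_i$ the number of samples at leaf node $i$ of its $L$ leaf nodes) are:

$$\bar{E}_T = e_T \sum_{i=1}^{L} N_i$$

$$\sigma_T = \sqrt{e_T\,(1 - e_T) \sum_{i=1}^{L} N_i}$$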
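when the subtree is replaced by a leaf node, that is, all the leaf nodes of the subtree are merged into the current node and the current node is taken as a leaf node, the misjudgment rate $e_t$ of the leaf node is:

$$e_t = \frac{E + \alpha}{N}$$

and the mean number of misjudgments $\bar{E}_t$ of the leaf node is:

$$\bar{E}_t = e_t\,N$$

wherein $E$ is the number of samples misjudged by the leaf node, $N$ is the number of samples at the leaf node, and $\alpha$ is the penalty factor;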
the pruning condition is given by a formula [formula image not recoverable from the source text];
and when the pruning condition is met, executing pruning operation, combining all leaf nodes of the subtrees as leaf nodes to replace the subtrees, and performing pruning calculation to obtain a complete residential water consumption prediction model.
2. The big data based secondary water supply amount prediction method according to claim 1, wherein the predicting the residential water usage amount based on the residential water usage prediction model comprises:
the water-use time period t, the average temperature temp, the weather, the season, the date d and the water-use slope of the same time period on the previous day are input into the residential water prediction model as input parameters, and the water consumption of each household in each time period of the day is predicted;
accumulating the water consumption predicted values of all residents in all time periods;
and outputting the daily water consumption predicted value of the secondary water supply water point.
3. The big data-based secondary water supply prediction method according to claim 1, wherein if the predicted residential water consumption value and the actual residential water consumption value do not belong to the same water consumption class, the modifying the residential water prediction model specifically comprises:
obtaining the real water consumption of a single household over one day, the real water consumption matrix of the single household for one day being:

$$Q^{real} = [\,Q^{real}_{t_1},\ Q^{real}_{t_2},\ Q^{real}_{t_3},\ Q^{real}_{t_4}\,]$$

wherein $Q^{real}_{t_1}$, $Q^{real}_{t_2}$, $Q^{real}_{t_3}$ and $Q^{real}_{t_4}$ are the real water consumption of the single household in time periods $t_1$, $t_2$, $t_3$ and $t_4$ respectively; the predicted resident water consumption $Q^{pred}$ and the real resident water consumption $Q^{real}$ are marked with water consumption grades, and the grades of the predicted and real values in corresponding time periods are compared; when the predicted and real values of all corresponding time periods belong to the same water consumption grade, the residential water prediction model has reached the use standard;
and if the use standard is not met, re-establishing the residential water prediction model.
4. A computer device comprising a processor and a memory for storing processor-executable programs, the computer device performing the method of any of claims 1 to 3 when the processor executes the programs stored in the memory.
5. A storage medium, characterized by storing a program which, when executed by a processor, performs the method of any one of claims 1 to 3.
CN202211373554.7A 2022-11-04 2022-11-04 Secondary water supply amount prediction method and device based on big data and storage medium Active CN115423224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211373554.7A CN115423224B (en) 2022-11-04 2022-11-04 Secondary water supply amount prediction method and device based on big data and storage medium

Publications (2)

Publication Number Publication Date
CN115423224A (en) 2022-12-02
CN115423224B (en) 2023-04-18

Family

ID=84208181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211373554.7A Active CN115423224B (en) 2022-11-04 2022-11-04 Secondary water supply amount prediction method and device based on big data and storage medium

Country Status (1)

Country Link
CN (1) CN115423224B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116432863A (en) * 2023-05-18 2023-07-14 安徽舜禹水务股份有限公司 Integral peak-shifting scheduling method for secondary water supply based on mathematical programming

Citations (2)

Publication number Priority date Publication date Assignee Title
CN112036668A (en) * 2020-09-30 2020-12-04 北京百度网讯科技有限公司 Water consumption prediction method, device, electronic equipment and computer readable medium
CN114792169A (en) * 2022-05-13 2022-07-26 遥相科技发展(北京)有限公司 Residential water consumption prediction method based on MIC-XGboost algorithm

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN110895969B (en) * 2018-09-13 2023-12-15 大连大学 Atrial fibrillation prediction decision tree and pruning method thereof
CN110674985A (en) * 2019-09-20 2020-01-10 北京建筑大学 Urban resident domestic water consumption prediction method and application thereof
CN114462550A (en) * 2022-02-28 2022-05-10 天津大学 Method for predicting water consumption of water supply network users by using random forest and probability density function

Also Published As

Publication number Publication date
CN115423224A (en) 2022-12-02

Similar Documents

Publication Publication Date Title
CN112765808B (en) Ecological drought monitoring and evaluating method
Mujumdar et al. Real‐time reservoir operation for irrigation
CN115423224B (en) Secondary water supply amount prediction method and device based on big data and storage medium
CN105005204A (en) Intelligent engine system capable of automatically triggering intelligent home and intelligent life scenes and method
CN110658725B (en) Energy supervision and prediction system and method based on artificial intelligence
CN115411730B (en) Air conditioner load multi-period adjustable potential evaluation method and related device
CN112470888A (en) Automatic watering method and system for smart community
CN108452597A (en) Screen replacing based reminding method and system, the air filter unit of air filter unit
CN116562583A (en) Multidimensional water resource supply and demand prediction method and system
CN109299853B (en) Reservoir dispatching function extraction method based on joint probability distribution
CN114092776A (en) Multi-sensor data fusion method applied to intelligent agriculture
CN115879750B (en) Aquatic seedling environment monitoring management system and method
CN117267905A (en) Air conditioner control method and device, air conditioner and storage medium
Sarmas et al. Baseline energy modeling for improved measurement and verification through the use of ensemble artificial intelligence models
CN116298576A (en) Time-segmentation-considered extensible non-invasive load monitoring method
CN115587661A (en) Livestock and poultry farm air quality optimization system and method based on field boundary indexes
CN108629362A (en) A kind of learning behavior custom discovery quantization system and method towards mobile environment
CN114330136A (en) Water meter based water living condition monitoring method, system, device and storage medium
Khan et al. Irrigation water requirement prediction through various data mining techniques applied on a carefully pre-processed dataset
Rastog et al. Crop and Yield Prediction Through Machine Learning Techniques to Maximize Production: 21st Century Sustainable Approach for Smart Cities 5.0
CN117371591A (en) Power consumer level cooling load identification method
CN109963262A (en) Wireless sensor method for optimizing scheduling in a kind of wireless sensor network
CN113505913B (en) Reservoir optimal scheduling decision method and device for stability of aquatic community system
CN117933946B (en) Rural business management method based on big data
Liu et al. The research of precision irrigation decision support system based on genetic algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant