CN114862618A - Artificial intelligence-based urban water consumption prediction method, device, equipment and medium

Info

Publication number
CN114862618A
Authority
CN
China
Prior art keywords
water consumption
data
urban water
urban
decomposition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210425688.2A
Other languages
Chinese (zh)
Inventor
雷田子
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN202210425688.2A
Publication of CN114862618A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/148 Wavelet transforms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Economics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Algebra (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)

Abstract

The application provides an artificial intelligence-based urban water consumption prediction method and device, an electronic device and a storage medium. The method comprises the following steps: acquiring urban water consumption basic data; screening abnormal data in the urban water consumption basic data, and smoothing the abnormal data to obtain urban water consumption optimization data; arranging the urban water consumption optimization data to obtain urban water consumption time series data, and decomposing the urban water consumption time series data to obtain a plurality of urban water consumption subsequences; obtaining a plurality of significant urban water characteristic factors through a principal component analysis algorithm; and inputting the urban water consumption subsequences and the significant urban water characteristic factors into a neural network prediction model to obtain a prediction result. With this method and device, an optimized wavelet transform can be used to optimize the urban water consumption data, and the urban water consumption can be predicted accurately in combination with a neural network.

Description

Artificial intelligence-based urban water consumption prediction method, device, equipment and medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to an artificial intelligence-based urban water consumption prediction method and device, electronic equipment and a storage medium.
Background
Accurate prediction of urban water consumption in a water supply system helps achieve optimal scheduling of urban water, and reliable, accurate prediction of urban water demand is important for building intelligent water supply systems and smart cities. However, because water demand series contain high-frequency noise and complex relationships, urban water demand is not easy to predict.
The traditional statistical modeling approach predicts urban water consumption by analyzing the relationships among different features and determining the functional relationships among features with different indicators; it includes machine learning methods such as support vector machines, decision trees and multiple linear regression. However, such prediction models are structurally simple, whereas urban water consumption changes with time, weather and other conditions and is complex, nonlinear and time-varying. Differences in the influencing factors fed into the model can strongly affect the prediction, so the prediction results are poor.
Disclosure of Invention
In view of the above, there is a need for an artificial intelligence-based urban water consumption prediction method, device, electronic device and storage medium, so as to solve the technical problem of how to improve the accuracy of urban water consumption prediction.
The application provides an artificial intelligence-based urban water consumption prediction method, which comprises the following steps:
acquiring historical data of daily urban water consumption as urban water consumption basic data;
screening abnormal data in the basic data of the urban water consumption, and smoothing the abnormal data to obtain optimized data of the urban water consumption;
arranging the urban water consumption optimization data in chronological order to obtain urban water consumption time series data, and decomposing the urban water consumption time series data to obtain a plurality of urban water consumption subsequences;
carrying out principal component analysis on a plurality of preset urban water characteristic factors influencing the urban water consumption to obtain a plurality of significant urban water characteristic factors;
and feeding the urban water consumption subsequences and the significant urban water characteristic factors into a neural network prediction model for training, and predicting the urban water consumption with the trained neural network prediction model to obtain a prediction result.
In some embodiments, the acquiring historical data of daily urban water consumption as urban water consumption basic data comprises:
building a one-stop data analysis platform;
and searching for historical data of daily urban water consumption based on the one-stop data analysis platform, and taking the obtained historical data of daily urban water consumption as the urban water consumption basic data.
In some embodiments, the screening abnormal data in the urban water consumption basic data and smoothing the abnormal data to obtain urban water consumption optimization data comprises:
screening the urban water consumption basic data according to the 3σ criterion to obtain abnormal data;
and smoothing the abnormal data according to a custom smoothing model, and taking the smoothed urban water consumption basic data, which no longer contains abnormal data, as the urban water consumption optimization data.
In some embodiments, the arranging the urban water consumption optimization data in chronological order to obtain urban water consumption time series data, and decomposing the urban water consumption time series data to obtain a plurality of urban water consumption subsequences comprises:
arranging the urban water consumption optimization data in chronological order to obtain the urban water consumption time series data;
and decomposing the urban water consumption time series data layer by layer according to a maximum overlap discrete wavelet transform algorithm to obtain an optimal number of decomposition layers, and decomposing the urban water consumption time series data into a plurality of urban water consumption subsequences based on the optimal number of decomposition layers.
In some embodiments, the decomposing the urban water consumption time series data layer by layer according to a maximum overlap discrete wavelet transform algorithm to obtain an optimal number of decomposition layers comprises:
calculating the root mean square error of the subsequence corresponding to each layer of wavelet decomposition of the urban water consumption time series data;
and comparing, layer by layer, the root mean square error of the subsequence corresponding to the new layer of wavelet decomposition with that of the previous layer; if the root mean square error of the new layer is smaller than that of the previous layer, continuing the layer-by-layer decomposition; once the root mean square error of the new layer is not smaller than that of the previous layer, ending the decomposition and taking the number of decomposition layers at that point as the optimal number of decomposition layers.
In some embodiments, the carrying out principal component analysis on a plurality of preset urban water characteristic factors influencing the urban water consumption to obtain a plurality of significant urban water characteristic factors comprises:
performing principal component analysis on the urban water characteristic factors to obtain the feature contribution rates of a plurality of different combinations of urban water characteristic factors;
and comparing a preset threshold with the feature contribution rates, and taking all urban water characteristic factors in the combinations whose feature contribution rate is greater than the preset threshold as the significant urban water characteristic factors.
In some embodiments, the feeding the urban water consumption subsequences and the significant urban water characteristic factors into a neural network prediction model for training, and predicting the urban water consumption with the trained neural network prediction model to obtain a prediction result comprises:
feeding the urban water consumption subsequences and the significant urban water characteristic factors into the neural network prediction model for training;
and predicting the urban water consumption with the trained neural network prediction model, and taking the average of all obtained prediction results as the urban water consumption prediction data.
An embodiment of the present application further provides an artificial intelligence-based urban water consumption prediction device, the device comprising:
an acquisition unit, used for acquiring historical data of daily urban water consumption as urban water consumption basic data;
a screening unit, used for screening abnormal data in the urban water consumption basic data and smoothing the abnormal data to obtain urban water consumption optimization data;
a decomposition unit, used for arranging the urban water consumption optimization data in chronological order to obtain urban water consumption time series data, and decomposing the urban water consumption time series data to obtain a plurality of urban water consumption subsequences;
an analysis unit, used for carrying out principal component analysis on a plurality of preset urban water characteristic factors influencing the urban water consumption to obtain a plurality of significant urban water characteristic factors;
and a prediction unit, used for feeding the urban water consumption subsequences and the significant urban water characteristic factors into a neural network prediction model for training, and predicting the urban water consumption with the trained neural network prediction model to obtain a prediction result.
An embodiment of the present application further provides an electronic device, where the electronic device includes:
a memory storing at least one instruction;
and a processor that executes the instructions stored in the memory to implement the artificial intelligence-based urban water consumption prediction method.
The embodiment of the present application further provides a computer-readable storage medium, in which at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the artificial intelligence based urban water consumption prediction method.
The present application decomposes the information originally mixed together in the urban water consumption time series data into a plurality of separate urban water consumption subsequences, so that the temporal information within the time series data can be expressed and captured more effectively and accurately. It then uses a neural network to fuse all the feature data, thereby eliminating redundant information between modes, reducing interference among different features, and predicting urban water consumption accurately.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the artificial intelligence-based urban water consumption prediction method of the present application.
FIG. 2 is a functional block diagram of a preferred embodiment of the artificial intelligence-based urban water consumption prediction device of the present application.
FIG. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the artificial intelligence-based urban water consumption prediction method of the present application.
Detailed Description
For a clearer understanding of the objects, features and advantages of the present application, the application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present application and the features of the embodiments may be combined with each other where no conflict arises. Numerous specific details are set forth in the following description to provide a thorough understanding of the present application; the described embodiments are only some, not all, of the embodiments of the present application.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
An embodiment of the present application provides an artificial intelligence-based urban water consumption prediction method, which can be applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device and the like.
The electronic device may be any electronic product capable of human-computer interaction with a client, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game console, an Internet Protocol Television (IPTV), a smart wearable device, and the like.
The electronic device may also include a network device and/or a client device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud based on cloud computing and consisting of a large number of hosts or network servers.
The network in which the electronic device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
FIG. 1 is a flow chart of a preferred embodiment of the artificial intelligence-based urban water consumption prediction method of the present application. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
S10, acquiring historical data of daily urban water consumption as urban water consumption basic data.
In an optional embodiment, the acquiring historical data of daily urban water consumption as urban water consumption basic data comprises:
S101, building a one-stop data analysis platform.
S102, searching for historical data of daily urban water consumption based on the one-stop data analysis platform, and taking the obtained historical data of daily urban water consumption as the urban water consumption basic data.
The one-stop data analysis platform covers processes such as data acquisition, data integration, data processing and data visualization, and can implement functions such as information reading, analysis and interaction. In this optional embodiment, the Yixin ABI one-stop data analysis platform can be used; it supports access to common databases such as MySQL and Oracle.
In this optional embodiment, an intelligent search engine can be used to match and extract the historical data of daily urban water consumption from the database of the data analysis platform. Intelligent search is a new generation of search engine combined with artificial intelligence technology: besides traditional functions such as fast retrieval and relevance ranking, it can also provide functions such as semantic understanding of content, intelligent information filtering and pushing. The intelligent analysis of query conditions by the intelligent search mainly comprises the following steps:
extracting the effective components of the query conditions, including vocabulary and logical relations;
obtaining synonyms, near-synonyms and related words of the keywords from the database.
In this optional embodiment, the acquired historical data of daily urban water consumption is used as the urban water consumption basic data.
In this way, historical data of daily urban water consumption can be acquired quickly through the one-stop data analysis platform, which greatly improves data acquisition efficiency compared with manual statistics.
S11, screening abnormal data in the urban water consumption basic data, and smoothing the abnormal data to obtain urban water consumption optimization data.
In an optional embodiment, the screening abnormal data in the urban water consumption basic data and smoothing the abnormal data to obtain urban water consumption optimization data comprises:
S111, screening the urban water consumption basic data according to the 3σ criterion to obtain abnormal data.
S112, smoothing the abnormal data according to a custom smoothing model, and taking the smoothed urban water consumption basic data, which no longer contains abnormal data, as the urban water consumption optimization data.
In this optional embodiment, the 3σ criterion, also known as the Pauta criterion, assumes that a set of measured data contains only random errors. The standard deviation of the data is calculated and an interval is determined according to a certain probability; an error falling outside this interval is considered a gross error rather than a random error, and data containing such an error should be removed.
In this optional embodiment, abnormal data may be caused by various errors, and its presence can strongly affect data analysis, so this scheme screens the urban water consumption basic data with the 3σ criterion. The specific process is as follows: calculate the mean μ and the standard deviation σ of all the urban water consumption basic data, retain the data within the interval [μ-3σ, μ+3σ], and treat all data outside the interval [μ-3σ, μ+3σ] as abnormal data.
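The screening process described above can be sketched in a few lines of Python. This is only an illustration: the function name `screen_three_sigma` and the sample values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say whether the population or sample form is intended.

```python
from statistics import fmean, pstdev


def screen_three_sigma(values):
    """Return (mu, sigma, abnormal_indices) under the 3-sigma criterion."""
    mu = fmean(values)
    # Assumption: population standard deviation; the source does not specify.
    sigma = pstdev(values)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    # Values outside [mu - 3*sigma, mu + 3*sigma] are treated as abnormal.
    abnormal = [i for i, v in enumerate(values) if not (lower <= v <= upper)]
    return mu, sigma, abnormal


# Hypothetical daily consumption readings with one obvious spike at index 18.
daily = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101,
         99, 100, 102, 98, 100, 101, 99, 100, 500, 100]
mu, sigma, abnormal = screen_three_sigma(daily)
print(abnormal)
```

Note that with very short series a single spike inflates σ so much that it may never fall outside the interval, so the criterion is only meaningful when enough data points are available.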
In this optional embodiment, the abnormal data may be smoothed according to a custom smoothing model, and the smoothed urban water consumption basic data, which no longer contains abnormal data, is used as the urban water consumption optimization data. The custom smoothing model satisfies the relation:
$E_t = \theta_{t-k} X_{t-k} + \dots + \theta_{t-1} X_{t-1} + \theta_{t+1} X_{t+1} + \dots + \theta_{t+k} X_{t+k}$
where $E_t$ denotes the smoothed value of the $t$-th abnormal data point $X_t$; $X_{t-k}$ and $\theta_{t-k}$ denote, respectively, the $k$-th historical data point near the $t$-th abnormal data point and its corresponding weight; the closer a data point is to the abnormal data point $X_t$, the larger its weight; $k$ is a positive integer, and in this scheme $k = 5$,
Figure BDA0003608415820000071
Illustratively, for an abnormal data point $X_t$, the 5 preceding and 5 following adjacent urban water consumption basic data points are taken to form a sequence; the closer a data point is to $X_t$, the larger its corresponding weight, so that
Figure BDA0003608415820000081
By analogy, all the weights can be calculated.
In this optional embodiment, the smoothed urban water consumption basic data, which no longer contains abnormal data, is used as the urban water consumption optimization data.
In this way, the influence of the urban water consumption basic data near an abnormal data point is taken into account, and the abnormal data is smoothed through a weighted sum, so that the smoothed value is more realistic and accurate.
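A minimal sketch of this weighted-sum smoothing follows. The weight formula of the scheme is not reproduced in the text, so the weights here are an assumption: each neighbor at distance d receives a weight proportional to 1/d, normalized so that the weights sum to 1, which satisfies the stated property that nearer data points weigh more. The function name `smooth_abnormal` and the boundary handling (neighbors outside the series are skipped) are likewise hypothetical.

```python
def smooth_abnormal(values, abnormal_indices, k=5):
    """Replace each abnormal value by a weighted sum of up to k neighbors per side."""
    smoothed = list(values)
    for t in abnormal_indices:
        weighted_sum = 0.0
        weight_total = 0.0
        for d in range(1, k + 1):
            for j in (t - d, t + d):
                if 0 <= j < len(values):  # skip neighbors beyond the series ends
                    w = 1.0 / d  # hypothetical weight: larger for nearer neighbors
                    weighted_sum += w * values[j]
                    weight_total += w
        if weight_total > 0:
            # Dividing by the total weight normalizes the weights to sum to 1.
            smoothed[t] = weighted_sum / weight_total
    return smoothed


# Hypothetical series: a linear trend with one corrupted point at index 5.
series = [0, 1, 2, 3, 4, 100, 6, 7, 8, 9, 10]
print(smooth_abnormal(series, [5])[5])
```

The abnormal value itself never enters the sum, matching the relation for $E_t$, in which the term for $X_t$ is absent. Neighbors are read from the original series, so if two abnormal points lie within k positions of each other, each still influences the other; a fuller implementation might exclude them.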
And S12, sequentially arranging the urban water consumption optimization data according to the time sequence to obtain urban water consumption time sequence data, and decomposing the urban water consumption time sequence data to obtain a plurality of urban water consumption quantum sequences.
In an optional embodiment, the sequentially arranging the municipal water consumption optimization data according to a time sequence to obtain municipal water consumption time-series data, and decomposing the municipal water consumption time-series data to obtain a plurality of municipal water quantum sequences includes:
and S121, sequentially arranging the urban water consumption optimization data according to the time sequence to obtain urban water consumption time sequence data.
And S122, decomposing the urban water consumption time sequence data layer by layer according to a maximum overlapping discrete wavelet transform algorithm to obtain an optimal decomposition layer number, and decomposing the urban water consumption time sequence data into a plurality of urban water quantum sequences based on the optimal decomposition layer number.
In this optional embodiment, the urban water consumption optimization data is first arranged in sequence according to the time sequence, and the sorted urban water consumption optimization data is used as urban water consumption time series data.
In this alternative embodiment, the urban water consumption time-series data may be decomposed layer by layer according to a maximum overlap discrete wavelet transform algorithm to obtain an optimal number of decomposition layers. The Maximum Overlap Discrete Wavelet Transform (MODWT) is a modified Discrete Wavelet Transform (DWT), and the MODWT overcomes the problem of data point reduction caused by down-sampling, thereby improving the accuracy of the wavelet transform.
In this alternative embodiment, the wavelet transform is a signal decomposition model, and the result of each layer of decomposition is that the low-frequency signal obtained from the last decomposition is decomposed into two parts, i.e. low-frequency and high-frequency. After N layers of decomposition, the source signal X is decomposed into: x ═ D1+ D2+ ·+ DN + AN, where D1, D2,. DN are high-frequency signals decomposed by the first layer, the second layer, and the nth layer, respectively, and AN is a low-frequency signal decomposed by the nth layer. However, the more the sub-sequences are decomposed, the longer the prediction time for each sequence is, and therefore, it is necessary to select an appropriate number of decomposition levels as the optimal number of decomposition levels.
In this alternative embodiment, when wavelet decomposition is performed on the urban water consumption time series data, the frequency of the data can be characterized by the difference between adjacent data points: the larger the difference between adjacent data points, the more pronounced the fluctuation and the higher the frequency of that part of the data; the smaller the difference, the less pronounced the fluctuation and the lower the frequency. After N layers of decomposition, the urban water consumption time series data X is decomposed into $X = D_1 + D_2 + \cdots + D_N + A_N$, where $D_1, D_2, \ldots, D_N$ are the urban water consumption sub-sequences corresponding to the high-frequency parts obtained by the first, second, ..., and N-th layers of decomposition, respectively, and $A_N$ is the urban water consumption sub-sequence corresponding to the low-frequency part obtained by the N-th layer of decomposition.
Illustratively, for the urban water consumption time series data 20, 21, 25, 28, 50, 80, the first-layer decomposition yields the low-frequency part 20, 21, 25, 28 and the high-frequency part 50, 80; continuing with a second-layer decomposition of the low-frequency part 20, 21, 25, 28 yields the low-frequency part 20, 21 and the high-frequency part 25, 28.
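The additive decomposition described above can be illustrated with a minimal sketch. It assumes the Haar filter and circular boundary handling, neither of which is specified in this embodiment, and the function name `haar_modwt` is hypothetical. With the Haar filter, the detail coefficients D1, ..., DN and the final approximation AN sum back to the original series point by point, and every sub-sequence keeps the full length of the input because MODWT does not down-sample.

```python
def haar_modwt(series, levels):
    """Haar MODWT sketch (assumed filter): returns ([D1, ..., DN], AN)."""
    n = len(series)
    approx = [float(v) for v in series]
    details = []
    for level in range(1, levels + 1):
        shift = 2 ** (level - 1)  # filter taps are spread apart at each level
        prev = approx
        # High-frequency part: half the difference with the shifted sample.
        detail = [0.5 * (prev[t] - prev[(t - shift) % n]) for t in range(n)]
        # Low-frequency part: half the sum with the shifted sample.
        approx = [0.5 * (prev[t] + prev[(t - shift) % n]) for t in range(n)]
        details.append(detail)
    return details, approx


series = [20, 21, 25, 28, 50, 80]
details, approx = haar_modwt(series, levels=2)

# Point-by-point reconstruction: X = D1 + D2 + A2.
rebuilt = [sum(d[t] for d in details) + approx[t] for t in range(len(series))]
```

Because no samples are discarded, each of the three sub-sequences here has six values, which is what allows every sub-sequence to be aligned day by day with the characteristic factor data.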
In this alternative embodiment, the optimal number of decomposition layers may be obtained by calculating the root mean square error of the sub-sequence corresponding to each layer of wavelet decomposition of the urban water consumption time series data. For the urban water consumption time series data $X(t) = [X_0, X_1, \ldots, X_t]$, the corresponding root mean square error is:
$$RMSE_j = \sqrt{\frac{1}{N}\sum_{i=0}^{N-1}\left(X_i - D_{i,j}\right)^2}$$
where $RMSE_j$ denotes the root mean square error when the number of decomposition layers is j, $X_i$ denotes the i-th datum of the urban water consumption time series data, $D_{i,j}$ denotes the detail coefficient, which characterizes the high-frequency information of the urban water consumption time series data when the number of decomposition layers is j and is obtained after the urban water consumption time series data is transformed into the frequency domain, and N is the total number of data points in the urban water consumption time series data.
In this alternative embodiment, the root mean square error of each new layer of wavelet decomposition is compared in turn with that of the previous layer. If the root mean square error of the new layer is smaller than that of the previous layer, the layer-by-layer decomposition continues; once the root mean square error of the new layer is not smaller than that of the previous layer, the decomposition ends, and the number of decomposition layers reached when the decomposition ends is taken as the optimal number of decomposition layers.
For example, for the urban water consumption time series data 20, 21, 25, 28, 50, 80, the first-layer decomposition yields the low-frequency part 20, 21, 25, 28 and the high-frequency part 50, 80, and the calculated detail coefficient is 32, so the root mean square error of the first decomposition can be calculated as
$$RMSE_1 = \sqrt{\frac{(20-32)^2 + (21-32)^2 + (25-32)^2 + (28-32)^2 + (50-32)^2 + (80-32)^2}{6}} \approx 22.2$$
Continuing with a second-layer decomposition of the low-frequency part 20, 21, 25, 28 yields the low-frequency part 20, 21 and the high-frequency part 25, 28, and the calculated detail coefficient is 16, so the root mean square error of the second decomposition can be calculated as
$$RMSE_2 = \sqrt{\frac{(20-16)^2 + (21-16)^2 + (25-16)^2 + (28-16)^2}{4}} \approx 8.1$$
Continuing to decompose the low-frequency part 20, 21 yields the two values 20 and 21, and the calculated detail coefficient is 8, so the root mean square error of the third decomposition can be calculated as
$$RMSE_3 = \sqrt{\frac{(20-8)^2 + (21-8)^2}{2}} \approx 12.5$$
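Since 12.5 > 8.1, that is, the root mean square error of the third decomposition is not smaller than that of the second decomposition, the decomposition ends and the optimal number of decomposition layers is 2.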
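The stopping rule in the worked example can be sketched as follows. The RMSE form used here (the root of the mean squared difference between each datum of the layer and that layer's single detail coefficient) is an assumption inferred from the values 8.1 and 12.5 quoted above, and the helper names `layer_rmse` and `optimal_layers` are hypothetical.

```python
import math


def layer_rmse(values, detail):
    """Assumed RMSE: root of the mean squared gap between data and detail coefficient."""
    return math.sqrt(sum((x - detail) ** 2 for x in values) / len(values))


def optimal_layers(layers):
    """layers: ordered list of (data decomposed at this layer, detail coefficient).

    Returns the last layer whose RMSE was still smaller than the previous one.
    """
    previous = None
    for level, (values, detail) in enumerate(layers, start=1):
        current = layer_rmse(values, detail)
        if previous is not None and current >= previous:
            return level - 1  # the new layer did not improve, so stop
        previous = current
    return len(layers)


# Data and detail coefficients taken from the example in the text.
layers = [
    ([20, 21, 25, 28, 50, 80], 32),
    ([20, 21, 25, 28], 16),
    ([20, 21], 8),
]
errors = [layer_rmse(values, detail) for values, detail in layers]
best = optimal_layers(layers)
```

Under this assumed formula the second and third errors agree with the quoted 8.1 and 12.5 to within rounding, and the rule returns 2 as in the text.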
In this alternative embodiment, the urban water consumption time series data is decomposed into a plurality of urban water consumption sub-sequences according to the obtained optimal number of decomposition layers.
In this way, the information originally mixed together in the urban water consumption time series data is separated into a plurality of urban water consumption sub-sequences, so that the temporal information within the data can be expressed and captured more effectively and accurately, which improves prediction accuracy in the subsequent process.
S13, performing principal component analysis on a plurality of preset urban water characteristic factors affecting the urban water consumption to obtain a plurality of urban water significant characteristic factors.
In an optional embodiment, the performing principal component analysis on a plurality of preset city water characteristic factors affecting the city water consumption to obtain a plurality of city water significant characteristic factors includes:
S131, performing principal component analysis on the urban water characteristic factors to obtain the characteristic contribution rates of a plurality of different combinations of urban water characteristic factors.
S132, comparing a preset threshold value with the characteristic contribution rate, and taking all urban water characteristic factors corresponding to the combination of the urban water characteristic factors of which the characteristic contribution rate is greater than the preset threshold value as the urban water significant characteristic factors.
In this alternative embodiment, the preset multiple city water characteristic factors affecting the city water consumption may be temperature, reservoir water level, rainfall, peak water consumption, city population number, factory number, and the like.
In this alternative embodiment, Principal Component Analysis (PCA) is a commonly used data Analysis method. The PCA transforms raw data into a set of linearly independent representations of each dimension through linear transformation, can be used for extracting main characteristic components of the data, and is commonly used for dimensionality reduction and multi-feature selection of high-dimensional data.
In this alternative embodiment, an intelligent search engine is used to obtain, from the data analysis platform database, daily historical data for each urban water characteristic factor, in the same quantity as the urban water consumption basic data. The daily historical data of each urban water characteristic factor is sorted chronologically and used as an urban water characteristic factor vector. Each such vector has 1 row and M columns, and there are N characteristic factors in total, so all the urban water characteristic factor vectors together form a data matrix of N rows and M columns. All principal component vectors of the data matrix are obtained using the PCA algorithm, where a principal component direction is a direction vector along which the main features of the data are distributed. The process of obtaining the urban water significant characteristic factors through PCA is as follows:
The first step: calculating the mean of the M values in each row of the data matrix of N rows and M columns, and subtracting the row mean from each of the M values in that row to obtain a centered data matrix X, which also has N rows and M columns;
The second step: calculating the covariance matrix C, whose calculation formula is:
Figure BDA0003608415820000111
where X is the centered data matrix, $X^T$ is the transpose of X, and the covariance matrix C is a matrix of N rows and N columns;
The third step: calculating the eigenvalues of the covariance matrix C and the corresponding eigenvectors, where the eigenvectors are the principal component vectors; N principal component vectors are obtained in total, each principal component vector is a vector of 1 row and M columns, and the eigenvalues correspond one-to-one with the principal component vectors.
The fourth step: N principal component vectors have been obtained, each corresponding to one eigenvalue. According to the idea of the PCA algorithm, the larger an eigenvalue is, the more data features are retained along the corresponding principal component vector. Therefore, the sum of all the eigenvalues is calculated as the total value, the eigenvalues of the N principal component vectors are sorted from largest to smallest, the sorted eigenvalues are then summed successively from largest to smallest to obtain a cumulative value, and the ratio of the cumulative value to the total value is taken as the characteristic contribution rate.
The fifth step: comparing a preset threshold with the characteristic contribution rate, and taking all the urban water characteristic factors in the combination whose characteristic contribution rate first exceeds the preset threshold as the urban water significant characteristic factors. The preset threshold may be set to 0.85.
Illustratively, suppose there are 4 principal component vectors whose eigenvalues are 2, 1, 1 and 0.2, respectively. The total value obtained by summing all the eigenvalues is 4.2. The sorted eigenvalues are summed successively from largest to smallest to obtain the cumulative value: summing 2 and 1 gives 3, for which the characteristic contribution rate is 0.71, smaller than the preset threshold; continuing to sum 2, 1 and 1 gives 4, for which the characteristic contribution rate is 0.95, larger than the preset threshold. Therefore, the 3 urban water characteristic factors corresponding to the eigenvalues 2, 1 and 1 are taken as the urban water significant characteristic factors.
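The first, second, fourth and fifth steps above can be sketched in plain Python as follows. The 1/M scaling of the covariance matrix is an assumption (the scaling does not change the contribution rates), the function names are hypothetical, and the eigen-decomposition of the third step is left to a numerical library and is not shown.

```python
def center_rows(matrix):
    """First step: subtract each row's mean from the values in that row."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([value - mean for value in row])
    return centered


def covariance(centered):
    """Second step: N x N matrix from an N x M centered matrix (1/M scaling assumed)."""
    m = len(centered[0])
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) / m for row_j in centered]
        for row_i in centered
    ]


def count_significant(eigenvalues, threshold=0.85):
    """Fourth and fifth steps: smallest number of leading eigenvalues whose
    cumulative share of the total first exceeds the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return count, cumulative / total
    return len(ordered), 1.0


# Two hypothetical factors observed over three days (N = 2, M = 3).
factors = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
cov = covariance(center_rows(factors))

# Eigenvalues from the example in the text.
count, rate = count_significant([2, 1, 1, 0.2])
```

With the eigenvalues from the example, three components are kept at a cumulative contribution rate of about 0.95, matching the text.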
In this way, principal component analysis of the urban water characteristic factors reveals which factors have a larger influence on the urban water consumption, and those factors are taken as the urban water significant characteristic factors, so that the urban water consumption can be evaluated accurately in the subsequent process.
S14, feeding the urban water consumption sub-sequences and the urban water significant characteristic factors into a neural network prediction model for training, and predicting the urban water consumption with the trained neural network prediction model to obtain a prediction result.
In an optional embodiment, feeding the urban water consumption sub-sequences and the urban water significant characteristic factors into a neural network prediction model for training, and predicting the urban water consumption with the trained neural network prediction model to obtain a prediction result includes:
S141, feeding the urban water consumption sub-sequences and the urban water significant characteristic factors into the neural network prediction model for training.
S142, predicting the urban water consumption with the trained neural network prediction model, and taking the average of all the obtained prediction results as the urban water consumption prediction data.
In this alternative embodiment, the neural network prediction model may be a long short-term memory (LSTM) recurrent neural network model. The LSTM comprises an input gate, an output gate and a forget gate, and processes information selectively by controlling these gates. The LSTM is developed from the recurrent neural network (RNN), has strong generalization capability, and handles the high frequency and high volatility of time series data well, thereby enabling high-precision short-term prediction.
In this alternative embodiment, the urban water consumption sub-sequences obtained by decomposition describe the fluctuation of urban water consumption in each stage more accurately and in more detail, so the high frequency and high volatility of the time series data can be handled well and high-precision short-term prediction can be achieved. The process of training the LSTM deep learning network model comprises the following steps:
first, taking the data of all the urban water consumption sub-sequences as outputs and the corresponding urban water significant characteristic factor data as inputs to form input-output pairs, with 80% of the input-output pairs used as training data and the remaining 20% used as a validation set;
meanwhile, taking the data values of all the urban water consumption sub-sequences as output label values, and taking the corresponding data values of the urban water significant characteristic factors as the label values of the urban water significant characteristic factor data;
then, for each urban water consumption sub-sequence, repeatedly selecting a number of input-output pairs matching the amount of data contained in that sub-sequence and training on them with the LSTM model, where the loss function is the mean square error function. Through continued iterative optimization of the loss function, the ideal trained LSTM network is obtained when the value of the loss function reaches 0.
In the alternative embodiment, the trained LSTM model is used to predict the municipal water consumption, and the average of all the prediction results is used as the municipal water consumption prediction data.
Illustratively, suppose the urban water consumption sub-sequences contain urban water consumption data for 7, 5 and 3 consecutive days. When the LSTM model is trained, the urban water significant characteristic factor data of 7, 5 and 3 days are selected each time to predict the next day's urban water consumption. Using the trained LSTM model and the urban water significant characteristic factor data of the most recent 7, 5 and 3 days, the next day's urban water consumption is predicted as 10000 tons, 20000 tons and 30000 tons, respectively, and the average of all the prediction results, 20000 tons, is taken as the final urban water consumption prediction data.
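The data handling around the LSTM model (forming input-output pairs, the 80%/20% split, and averaging the predictions) can be sketched without any deep learning library. The windowing scheme, the chronological split and all function names below are assumptions made for illustration; the LSTM itself is not shown, and the feature and target values are placeholders.

```python
def make_pairs(features, targets, window):
    """Pair each block of `window` consecutive days of factor data
    with the water consumption of the following day."""
    return [
        (features[end - window:end], targets[end])
        for end in range(window, len(targets))
    ]


def split_pairs(pairs, train_fraction=0.8):
    """Chronological split into training and validation sets (split order assumed)."""
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]


def average_forecast(predictions):
    """Final forecast: plain average of the individual predictions."""
    return sum(predictions) / len(predictions)


# Placeholder data: 20 days, two significant characteristic factors per day.
features = [[float(day), 0.5 * day] for day in range(20)]
targets = [100 + day for day in range(20)]

pairs = make_pairs(features, targets, window=7)
train, validation = split_pairs(pairs)

# Predictions quoted in the example for the 7-, 5- and 3-day windows.
forecast = average_forecast([10000, 20000, 30000])
```

Here the 7-day window yields 13 pairs, of which 10 are used for training and 3 for validation, and the averaged forecast is 20000 tons as in the example.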
Therefore, high-precision prediction of urban water consumption in a short time can be realized by integrating a plurality of predicted values.
Referring to fig. 2, fig. 2 is a functional block diagram of a preferred embodiment of the present invention for an artificial intelligence-based urban water consumption prediction device. The artificial intelligence-based urban water consumption prediction apparatus 11 includes an acquisition unit 110, a screening unit 111, a decomposition unit 112, an analysis unit 113, and a prediction unit 114. A module/unit as referred to herein is a series of computer readable instruction segments capable of being executed by the processor 13 and performing a fixed function, and is stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
In an alternative embodiment, the obtaining unit 110 is used for obtaining historical data of daily municipal water consumption as the basic data of the municipal water consumption.
In an alternative embodiment, the obtaining historical daily municipal water consumption data as the basic municipal water consumption data comprises:
building a one-stop data analysis platform;
and searching historical data of daily urban water consumption based on the one-stop data analysis platform, and taking the obtained historical data of the daily urban water consumption as urban water basic data.
The one-stop data analysis platform covers data acquisition, data integration, data processing, data visualization and similar processes, and can provide functions such as information reading, analysis and interaction. In this alternative embodiment, the Yixin ABI one-stop data analysis platform may be used, which supports access to common databases such as MySQL and Oracle.
In this alternative embodiment, the historical data of daily urban water consumption can be matched and extracted from the data analysis platform database using an intelligent search engine. Intelligent search is a new generation of search engine combined with artificial intelligence technology; in addition to traditional functions such as fast retrieval and relevance ranking, it provides functions such as semantic understanding of content, intelligent information filtering and pushing. The intelligent analysis of query conditions by the intelligent search mainly comprises:
extracting the effective components of the query conditions, including vocabulary and logical relations;
obtaining synonyms, near-synonyms and related words of the keywords from the database.
In this alternative embodiment, the acquired historical data of daily municipal water consumption is used as the basic municipal water data.
In an optional embodiment, the screening unit 111 is configured to screen abnormal data in the basic data of the municipal water consumption, and perform smoothing processing on the abnormal data to obtain the optimized data of the municipal water consumption.
In an optional embodiment, the screening abnormal data in the basic data of the urban water consumption, and smoothing the abnormal data to obtain the optimized data of the urban water consumption comprises:
screening the basic data of the urban water consumption according to a 3 sigma criterion to obtain abnormal data;
and smoothing the abnormal data according to a user-defined smoothing model, and taking the smoothed basic data of the urban water consumption without the abnormal data as the optimized data of the urban water consumption.
In this alternative embodiment, the 3σ criterion, also known as the Pauta criterion, assumes that a group of measured data contains only random errors. The standard deviation of the data is calculated and an interval is determined according to a given probability; an error falling outside this interval is regarded as a gross error rather than a random error, and the data containing it should be removed.
In this alternative embodiment, abnormal data may be caused by various errors, and its presence can strongly affect data analysis, so this scheme uses the 3σ criterion to screen the urban water consumption basic data. The specific process is as follows: calculating the mean $\mu$ and the standard deviation $\sigma$ of all the urban water consumption basic data, retaining the urban water consumption basic data within the interval $[\mu - 3\sigma, \mu + 3\sigma]$, and taking all the urban water consumption basic data outside the interval $[\mu - 3\sigma, \mu + 3\sigma]$ as abnormal data.
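The screening step can be sketched with the Python standard library. The use of the population standard deviation is an assumption, since the text does not say whether the population or sample form is intended, and the function name and the sample values are hypothetical.

```python
import statistics


def screen_three_sigma(values):
    """Split (index, value) pairs into retained and abnormal data by the 3-sigma rule."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    kept, abnormal = [], []
    for index, value in enumerate(values):
        if lower <= value <= upper:
            kept.append((index, value))
        else:
            abnormal.append((index, value))
    return kept, abnormal


# Hypothetical daily consumption with one spike at the end.
daily = [98, 102, 100, 101, 99] * 4 + [500]
kept, abnormal = screen_three_sigma(daily)
```

Indices are returned together with the values so that each abnormal datum can later be located in the series and replaced by its smoothed value.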
In this optional embodiment, the abnormal data may be smoothed according to a custom smoothing model, and the smoothed basic data of the municipal water consumption that does not contain the abnormal data may be used as the optimized data of the municipal water consumption. Wherein the custom smooth model satisfies the relation:
$$E_t = \theta_{t-k} X_{t-k} + \cdots + \theta_{t-1} X_{t-1} + \theta_{t+1} X_{t+1} + \cdots + \theta_{t+k} X_{t+k}$$
where $E_t$ denotes the smoothed value of the t-th abnormal datum $X_t$; $X_{t-k}$ and $\theta_{t-k}$ denote the k-th historical datum adjacent to the abnormal datum and its corresponding weight, respectively; the closer a datum is to the abnormal datum $X_t$, the larger its weight; k is a positive integer, and in this scheme k = 5,
Figure BDA0003608415820000161
Illustratively, for the abnormal datum $X_t$, the 5 adjacent urban water consumption basic data before it and the 5 adjacent urban water consumption basic data after it are taken to form a sequence; the closer a datum is to the abnormal datum $X_t$, the larger its weight, so
Figure BDA0003608415820000162
By analogy, all the weighted values can be calculated.
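The smoothing relation can be sketched as a weighted average over the k neighbours on each side of the abnormal datum. The weights used here (proportional to the reciprocal of the distance from the abnormal datum and normalized to sum to 1) are purely an assumption chosen to satisfy the stated property that nearer data receive larger weights; they are not the weights defined in this scheme. The function name and the handling of neighbours that fall outside the series are likewise hypothetical.

```python
def smooth_abnormal(series, t, k=5):
    """Replace series[t] by a weighted average of up to k neighbours on each side.

    Assumed weights: 1 / distance, normalized so that they sum to 1.
    Neighbours beyond either end of the series are skipped.
    """
    weighted = []
    for offset in range(1, k + 1):
        for index in (t - offset, t + offset):
            if 0 <= index < len(series):
                weighted.append((1.0 / offset, series[index]))
    total_weight = sum(weight for weight, _ in weighted)
    return sum(weight * value for weight, value in weighted) / total_weight


# Hypothetical series with an abnormal value at position 5.
series = [100, 102, 98, 101, 99, 500, 100, 103, 97, 100, 100]
smoothed = smooth_abnormal(series, t=5)
```

The abnormal value itself is excluded from the average, consistent with the relation above, which has no term for $X_t$.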
In this alternative embodiment, the basic data of the municipal water consumption, which is smoothed so as not to contain the abnormal data, is used as the municipal water consumption optimization data.
In an alternative embodiment, the decomposition unit 112 is configured to arrange the urban water consumption optimization data in chronological order to obtain urban water consumption time series data, and to decompose the urban water consumption time series data to obtain a plurality of urban water consumption sub-sequences.
In an optional embodiment, arranging the urban water consumption optimization data in chronological order to obtain urban water consumption time series data, and decomposing the urban water consumption time series data to obtain a plurality of urban water consumption sub-sequences includes:
arranging the urban water consumption optimization data in chronological order to obtain urban water consumption time series data;
decomposing the urban water consumption time series data layer by layer according to the maximum overlap discrete wavelet transform algorithm to obtain an optimal number of decomposition layers, and decomposing the urban water consumption time series data into a plurality of urban water consumption sub-sequences based on the optimal number of decomposition layers.
In this optional embodiment, the urban water consumption optimization data is first arranged in sequence according to the time sequence, and the sorted urban water consumption optimization data is used as urban water consumption time series data.
In this alternative embodiment, the urban water consumption time-series data may be decomposed layer by layer according to a maximum overlap discrete wavelet transform algorithm to obtain an optimal number of decomposition layers. The Maximum Overlap Discrete Wavelet Transform (MODWT) is a modified Discrete Wavelet Transform (DWT), and the MODWT overcomes the problem of data point reduction caused by down-sampling, thereby improving the accuracy of the wavelet transform.
In this alternative embodiment, the wavelet transform is a signal decomposition model: each layer of decomposition splits the low-frequency signal obtained from the previous layer into a low-frequency part and a high-frequency part. After N layers of decomposition, the source signal X is decomposed into $X = D_1 + D_2 + \cdots + D_N + A_N$, where $D_1, D_2, \ldots, D_N$ are the high-frequency signals obtained by the first, second, ..., and N-th layers of decomposition, respectively, and $A_N$ is the low-frequency signal obtained by the N-th layer of decomposition. However, the more sub-sequences are produced, the longer the prediction over all sequences takes, so an appropriate number of decomposition layers must be selected as the optimal number of decomposition layers.
In this alternative embodiment, when wavelet decomposition is performed on the urban water consumption time series data, the frequency of the data can be characterized by the difference between adjacent data points: the larger the difference between adjacent data points, the more pronounced the fluctuation and the higher the frequency of that part of the data; the smaller the difference, the less pronounced the fluctuation and the lower the frequency. After N layers of decomposition, the urban water consumption time series data X is decomposed into $X = D_1 + D_2 + \cdots + D_N + A_N$, where $D_1, D_2, \ldots, D_N$ are the urban water consumption sub-sequences corresponding to the high-frequency parts obtained by the first, second, ..., and N-th layers of decomposition, respectively, and $A_N$ is the urban water consumption sub-sequence corresponding to the low-frequency part obtained by the N-th layer of decomposition.
Illustratively, for the urban water consumption time series data 20, 21, 25, 28, 50, 80, the first-layer decomposition yields the low-frequency part 20, 21, 25, 28 and the high-frequency part 50, 80; continuing with a second-layer decomposition of the low-frequency part 20, 21, 25, 28 yields the low-frequency part 20, 21 and the high-frequency part 25, 28.
In this alternative embodiment, the root mean square error of the sub-sequence corresponding to each layer of wavelet decomposition of the urban water consumption time series data can be calculated. For the urban water consumption time series data $X(t) = [X_0, X_1, \ldots, X_t]$, the corresponding root mean square error is:
$$RMSE_j = \sqrt{\frac{1}{N}\sum_{i=0}^{N-1}\left(X_i - D_{i,j}\right)^2}$$
where $RMSE_j$ denotes the root mean square error when the number of decomposition layers is j, $X_i$ denotes the i-th datum of the urban water consumption time series data, $D_{i,j}$ denotes the detail coefficient, which characterizes the high-frequency information of the urban water consumption time series data when the number of decomposition layers is j and is obtained after the urban water consumption time series data is transformed into the frequency domain, and N is the total number of data points in the urban water consumption time series data.
In this alternative embodiment, the root mean square error of each new layer of wavelet decomposition is compared in turn with that of the previous layer. If the root mean square error of the new layer is smaller than that of the previous layer, the layer-by-layer decomposition continues; once the root mean square error of the new layer is not smaller than that of the previous layer, the decomposition ends, and the number of decomposition layers reached when the decomposition ends is taken as the optimal number of decomposition layers.
For example, for the urban water consumption time series data 20, 21, 25, 28, 50, 80, the first-layer decomposition yields the low-frequency part 20, 21, 25, 28 and the high-frequency part 50, 80, and the calculated detail coefficient is 32, so the root mean square error of the first decomposition can be calculated as
$$RMSE_1 = \sqrt{\frac{(20-32)^2 + (21-32)^2 + (25-32)^2 + (28-32)^2 + (50-32)^2 + (80-32)^2}{6}} \approx 22.2$$
Continuing with a second-layer decomposition of the low-frequency part 20, 21, 25, 28 yields the low-frequency part 20, 21 and the high-frequency part 25, 28, and the calculated detail coefficient is 16, so the root mean square error of the second decomposition can be calculated as
$$RMSE_2 = \sqrt{\frac{(20-16)^2 + (21-16)^2 + (25-16)^2 + (28-16)^2}{4}} \approx 8.1$$
Continuing to decompose the low-frequency part 20, 21 yields the two values 20 and 21, and the calculated detail coefficient is 8, so the root mean square error of the third decomposition can be calculated as
$$RMSE_3 = \sqrt{\frac{(20-8)^2 + (21-8)^2}{2}} \approx 12.5$$
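Since 12.5 > 8.1, that is, the root mean square error of the third decomposition is not smaller than that of the second decomposition, the decomposition ends and the optimal number of decomposition layers is 2.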
In this alternative embodiment, the urban water consumption time series data is decomposed into a plurality of urban water consumption sub-sequences according to the obtained optimal number of decomposition layers.
In an alternative embodiment, the analysis unit 113 is configured to perform principal component analysis on a plurality of preset urban water characteristic factors affecting the urban water consumption to obtain a plurality of urban water significant characteristic factors.
In an optional embodiment, the performing principal component analysis on a plurality of preset city water characteristic factors affecting the city water consumption to obtain a plurality of city water significant characteristic factors includes:
performing principal component analysis on the urban water characteristic factors to obtain characteristic contribution rates of combinations of a plurality of different urban water characteristic factors;
and comparing a preset threshold with the characteristic contribution rate, and taking all urban water characteristic factors corresponding to the combination of the urban water characteristic factors of which the characteristic contribution rate is greater than the preset threshold as urban water significant characteristic factors.
In this alternative embodiment, the preset multiple city water characteristic factors affecting the city water consumption may be temperature, reservoir water level, rainfall, peak water consumption, city population number, factory number, and the like.
In this alternative embodiment, Principal Component Analysis (PCA) is a commonly used data Analysis method. The PCA transforms raw data into a set of linearly independent representations of each dimension through linear transformation, can be used for extracting main characteristic components of the data, and is commonly used for dimensionality reduction and multi-feature selection of high-dimensional data.
In this alternative embodiment, an intelligent search engine is used to obtain, from the data analysis platform database, daily historical data for each urban water characteristic factor, in the same quantity as the urban water consumption basic data. The daily historical data of each urban water characteristic factor is sorted chronologically and used as an urban water characteristic factor vector. Each such vector has 1 row and M columns, and there are N characteristic factors in total, so all the urban water characteristic factor vectors together form a data matrix of N rows and M columns. All principal component vectors of the data matrix are obtained using the PCA algorithm, where a principal component direction is a direction vector along which the main features of the data are distributed. The process of obtaining the urban water significant characteristic factors through PCA is as follows:
The first step: calculating the mean value of the M numerical values in each row of the data matrix with N rows and M columns, and subtracting the row mean from each of the M numerical values in that row to obtain a centralized data matrix X, which also has N rows and M columns;
The second step: calculating a covariance matrix C, wherein the calculation formula of the covariance matrix C is as follows:
$$C = \frac{1}{M} X X^{T}$$
wherein X is the centralized data matrix, $X^{T}$ is the transpose of X, and the covariance matrix C is a matrix of N rows and N columns;
The third step: calculating the eigenvalues of the covariance matrix C and the corresponding eigenvectors, wherein the eigenvectors are the principal component vectors. N principal component vectors are obtained in total; each is an eigenvector of the N-row, N-column matrix C and therefore has N components, and the eigenvalues correspond one-to-one to the principal component vectors.
The fourth step: N principal component vectors have been obtained, each corresponding to one eigenvalue. According to the idea of the PCA algorithm, the larger the eigenvalue, the more data characteristics are retained along the corresponding principal component vector. Therefore, the sum of all the eigenvalues is calculated as the total value; the eigenvalues of the N principal component vectors are sorted from large to small and then accumulated in that order to obtain a cumulative sum; and the ratio of the cumulative sum to the total value is used as the feature contribution rate.
The fifth step: and comparing a preset threshold with the characteristic contribution rate, and taking all urban water characteristic factors corresponding to the combination of the urban water characteristic factors of which the characteristic contribution rate is greater than the preset threshold for the first time as the urban water significant characteristic factors. Wherein the preset threshold may be set to 0.85.
Illustratively, suppose there are 4 principal component vectors whose eigenvalues are 2, 1, 1 and 0.2, respectively. The total value obtained by summing all the eigenvalues is 4.2. The sorted eigenvalues are accumulated from large to small: summing 2 and 1 gives a cumulative sum of 3, for which the feature contribution rate is 3/4.2 ≈ 0.71, smaller than the preset threshold; continuing, summing 2, 1 and 1 gives a cumulative sum of 4, for which the feature contribution rate is 4/4.2 ≈ 0.95, larger than the preset threshold. Therefore, the 3 urban water characteristic factors corresponding to the eigenvalues 2, 1 and 1 are used as the urban water significant characteristic factors.
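The five steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation of the embodiment: the function names `pca_eigen` and `select_components` are hypothetical, the covariance matrix is assumed to be normalized by 1/M, and the threshold is assumed to be smaller than 1 so that some cumulative contribution rate always exceeds it.

```python
import numpy as np

def pca_eigen(data):
    """Steps one to three: center each row, form C = X X^T / M, eigendecompose."""
    data = np.asarray(data, dtype=float)         # N factors (rows) x M days (columns)
    X = data - data.mean(axis=1, keepdims=True)  # centralized data matrix
    C = X @ X.T / X.shape[1]                     # N x N covariance (1/M is an assumption)
    eigvals, eigvecs = np.linalg.eigh(C)         # eigh: C is symmetric; ascending order
    order = np.argsort(eigvals)[::-1]            # reorder from large to small
    return eigvals[order], eigvecs[:, order]

def select_components(eigenvalues, threshold=0.85):
    """Steps four and five: cumulative contribution rates and the number of
    leading eigenvalues needed to exceed `threshold` for the first time.
    Assumes threshold < 1."""
    vals = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    rates = np.cumsum(vals) / vals.sum()          # cumulative sum / total value
    count = int(np.argmax(rates > threshold)) + 1 # first position above the threshold
    return count, rates

# The worked example from the text: eigenvalues 2, 1, 1 and 0.2.
count, rates = select_components([2, 1, 1, 0.2])
print(count, np.round(rates, 2))
```

With the eigenvalues of the worked example, the cumulative rates pass through roughly 0.71 and 0.95, and the first three components are selected, in line with the text.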
In an optional embodiment, the prediction unit 114 is configured to send the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training, and to predict the urban water consumption by using the trained neural network prediction model to obtain a prediction result.
In an optional embodiment, the sending the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training, and predicting the urban water consumption by using the trained neural network prediction model to obtain a prediction result includes:
sending the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training;
and predicting the urban water consumption by using the trained neural network prediction model, and taking the average value of all obtained prediction results as urban water consumption prediction data.
In this alternative embodiment, the neural network prediction model may be a long short-term memory (LSTM) recurrent neural network. An LSTM comprises an input gate, an output gate and a forget gate, and processes information selectively by controlling these gates. The LSTM is developed from the recurrent neural network (RNN), has strong generalization capability, and copes well with the high frequency and high volatility of time series data, enabling high-precision short-term prediction.
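To make the role of the three gates concrete, the following is a minimal sketch of a single LSTM time step in NumPy, using the commonly taught formulation. It is not the model of the embodiment: the function `lstm_step`, the stacked weight layout, the random untrained weights, and the sizes `D` and `H` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.
    x: input vector (D,); h_prev, c_prev: previous hidden and cell state (H,);
    W: stacked weights (4H, D+H); b: stacked bias (4H,).
    Assumed gate order in W and b: input, forget, output, candidate."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])          # input gate: how much new information to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much of the old cell state to keep
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much of the cell state to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate values for the cell state
    c = f * c_prev + i * g       # updated cell state
    h = o * np.tanh(c)           # updated hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 4                      # hypothetical: 3 significant factors, 4 hidden units
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(7, D)):  # a 7-day window of (synthetic) factor values
    h, c = lstm_step(x, h, c, W, b)
```

In practice a deep learning framework would supply the LSTM layer and the training loop; the sketch only shows how the gates combine the current input with the previous state.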
In this optional embodiment, the urban water consumption subsequences obtained by decomposition describe the fluctuation of the urban water consumption at each stage more accurately and in greater detail, which helps to address the high frequency and high volatility of the time series data and to achieve high-precision short-term prediction. The process of training the LSTM deep learning network model comprises the following steps:
firstly, taking the data corresponding to all the urban water consumption subsequences as output, and taking the urban water significant characteristic factor data corresponding to those subsequences as input, to form input-output pairs, wherein 80% of the input-output pairs are used as training data and the other 20% are used as a verification set;
at the same time, taking the data values corresponding to all the urban water consumption subsequences as output label values, and taking the data values of the corresponding urban water significant characteristic factors as the label values of the urban water significant characteristic factor data;
and then, for each urban water consumption subsequence, repeatedly selecting a number of input-output pairs corresponding to the amount of data contained in that subsequence and training the LSTM model on them, wherein the loss function is the mean square error function. Through continued iterative optimization of the loss function, an ideally trained LSTM network is obtained when the final loss value is 0.
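The data preparation described in these steps can be sketched as follows. The helper names `make_pairs`, `split_80_20` and `mse` are hypothetical, the pairing of a window of factor data with the following day's value is one possible reading of the text, and the chronological (unshuffled) split is an assumption, since the text does not say how the 80% and 20% portions are chosen.

```python
import numpy as np

def make_pairs(factors, target, window):
    """Build input-output pairs: `window` consecutive days of significant
    factor data (input) paired with the next day's subsequence value (output)."""
    inputs, outputs = [], []
    for t in range(len(target) - window):
        inputs.append(factors[t:t + window])
        outputs.append(target[t + window])
    return np.array(inputs), np.array(outputs)

def split_80_20(inputs, outputs):
    """Chronological split: first 80% for training, last 20% for verification."""
    cut = int(len(inputs) * 0.8)
    return (inputs[:cut], outputs[:cut]), (inputs[cut:], outputs[cut:])

def mse(pred, true):
    """Mean square error, the loss function named in the text."""
    return float(np.mean((np.asarray(pred) - np.asarray(true)) ** 2))

# Synthetic data: 20 days, 3 significant factors, 7-day window.
factors = np.arange(60, dtype=float).reshape(20, 3)
target = np.arange(20, dtype=float)
X, y = make_pairs(factors, target, window=7)
train, val = split_80_20(X, y)
```

A loss of exactly 0 is an idealized stopping condition; a practical training loop would more likely stop when the verification loss stops improving.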
In the alternative embodiment, the trained LSTM model is used to predict the municipal water consumption, and the average of all the prediction results is used as the municipal water consumption prediction data.
Illustratively, suppose the urban water consumption subsequences are urban water consumption data for 7, 5 and 3 consecutive days. When the LSTM model is trained, the urban water significant characteristic factor data for 7, 5 and 3 days, respectively, is selected each time to predict the next urban water consumption data. Using the trained LSTM model with the urban water significant characteristic factor data of the last 7, 5 and 3 days, the urban water consumption of the next day is predicted, giving prediction results of 10000 tons, 20000 tons and 30000 tons, respectively; the average of all the prediction results, 20000 tons, is taken as the final urban water consumption prediction data.
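The final averaging step of this example can be written out directly. The function name `combine_predictions` is hypothetical, and the three input values are the illustrative figures from the text rather than model output.

```python
def combine_predictions(predictions):
    """Average the per-subsequence predictions into a single forecast."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    return sum(predictions) / len(predictions)

# Predictions (in tons) from the 7-day, 5-day and 3-day windows in the example.
forecast = combine_predictions([10000, 20000, 30000])
```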
According to the above technical scheme, the information originally mixed together in the urban water consumption time series data is decomposed into a plurality of separate urban water consumption subsequences, so that the temporal information within the time series data can be expressed and captured more effectively and accurately; all characteristic data are fused by means of a neural network, which eliminates information redundancy among modes, reduces interference between different characteristics, and allows the urban water consumption to be predicted accurately.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 1 comprises a memory 12 and a processor 13. The memory 12 is used for storing computer readable instructions, and the processor 13 is used for executing the computer readable instructions stored in the memory to realize the artificial intelligence based urban water consumption prediction method of any one of the above embodiments.
In an alternative embodiment, the electronic device 1 further comprises a bus, a computer program stored in said memory 12 and executable on said processor 13, such as an artificial intelligence based municipal water usage prediction program.
Fig. 3 shows only the electronic device 1 with the memory 12 and the processor 13, and it will be understood by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
In conjunction with fig. 1, the memory 12 of the electronic device 1 stores a plurality of computer-readable instructions to implement an artificial intelligence based municipal water usage prediction method, and the processor 13 executes the plurality of instructions to implement:
acquiring historical data of daily urban water consumption as urban water consumption basic data;
screening abnormal data in the basic data of the urban water consumption, and smoothing the abnormal data to obtain optimized data of the urban water consumption;
sequentially arranging the urban water consumption optimization data according to a time sequence to obtain urban water consumption time sequence data, and decomposing the urban water consumption time sequence data to obtain a plurality of urban water consumption subsequences;
carrying out principal component analysis on a plurality of preset urban water characteristic factors influencing the urban water consumption to obtain a plurality of urban water significant characteristic factors;
and sending the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training, and predicting the urban water consumption by using the trained neural network prediction model to obtain a prediction result.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
It will be understood by those skilled in the art that the schematic diagram is only an example of the electronic device 1 and does not constitute a limitation of the electronic device 1. The electronic device 1 may have a bus-type structure or a star-shaped structure, and may further include more or fewer hardware or software components than those shown in the figure, or a different arrangement of components; for example, the electronic device 1 may further include an input and output device, a network access device, etc.
It should be noted that the electronic device 1 is only an example; other existing or future electronic products, if applicable to the present application, should also be included in the scope of protection of the present application and are incorporated herein by reference.
Memory 12 includes at least one type of readable storage medium, which may be non-volatile or volatile. The readable storage medium includes flash memory, removable hard disks, multimedia cards, card type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 12 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various kinds of data, such as codes of an artificial intelligence-based urban water consumption prediction program, etc., but also to temporarily store data that has been output or will be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the electronic device 1, connects various components of the electronic device 1 by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules stored in the memory 12 (for example, executing an artificial intelligence-based urban water consumption prediction program, etc.), and calling data stored in the memory 12.
The processor 13 executes an operating system of the electronic device 1 and various installed application programs. The processor 13 executes the application to implement the steps of the various artificial intelligence based municipal water usage prediction method embodiments described above, such as the steps shown in FIG. 1.
Illustratively, the computer program may be partitioned into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the electronic device 1. For example, the computer program may be divided into an acquisition unit 110, a screening unit 111, a decomposition unit 112, an analysis unit 113, a prediction unit 114.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute parts of the artificial intelligence based urban water consumption prediction method according to the embodiments of the present application.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the processes in the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and executed by a processor, to implement the steps of the embodiments of the methods described above.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), random-access Memory and other Memory, etc.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The block chain referred by the application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one arrow is shown in FIG. 3, but this does not indicate only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
The present application further provides a computer-readable storage medium (not shown), in which computer-readable instructions are stored, and the computer-readable instructions are executed by a processor in an electronic device to implement the method for predicting urban water consumption based on artificial intelligence according to any of the above embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the specification may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present application and not for limiting, and although the present application is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present application without departing from the spirit and scope of the technical solutions of the present application.

Claims (10)

1. An artificial intelligence-based urban water consumption prediction method is characterized by comprising the following steps:
acquiring historical data of daily urban water consumption as urban water consumption basic data;
screening abnormal data in the basic data of the urban water consumption, and smoothing the abnormal data to obtain optimized data of the urban water consumption;
sequentially arranging the urban water consumption optimization data according to a time sequence to obtain urban water consumption time sequence data, and decomposing the urban water consumption time sequence data to obtain a plurality of urban water consumption subsequences;
carrying out principal component analysis on a plurality of preset urban water characteristic factors influencing the urban water consumption to obtain a plurality of urban water significant characteristic factors;
and sending the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training, and predicting the urban water consumption by using the trained neural network prediction model to obtain a prediction result.
2. The artificial intelligence based municipal water consumption prediction method according to claim 1, wherein said obtaining historical data of daily municipal water consumption as the basic municipal water consumption data comprises:
building a one-stop data analysis platform;
and searching historical data of daily urban water consumption based on the one-stop data analysis platform, and taking the obtained historical data of the daily urban water consumption as urban water basic data.
3. The artificial intelligence based municipal water consumption prediction method according to claim 1, wherein said screening abnormal data from said basic municipal water consumption data and smoothing said abnormal data to obtain optimized municipal water consumption data comprises:
screening the basic data of the urban water consumption according to a 3 sigma criterion to obtain abnormal data;
and smoothing the abnormal data according to a user-defined smoothing model, and taking the smoothed basic data of the urban water consumption without the abnormal data as the optimized data of the urban water consumption.
4. The artificial intelligence based municipal water consumption prediction method according to claim 1, wherein said arranging the municipal water consumption optimization data in sequence according to time sequence to obtain municipal water consumption time-series data, and said decomposing the municipal water consumption time-series data to obtain a plurality of urban water consumption subsequences comprises:
sequentially arranging the urban water consumption optimization data according to time sequence to obtain urban water consumption time sequence data;
and decomposing the urban water consumption time series data layer by layer according to a maximal overlap discrete wavelet transform algorithm to obtain an optimal decomposition layer number, and decomposing the urban water consumption time series data into a plurality of urban water consumption subsequences based on the optimal decomposition layer number.
5. The artificial intelligence based municipal water consumption prediction method according to claim 4, wherein said decomposing the municipal water consumption time-series data layer by layer according to the maximal overlap discrete wavelet transform algorithm to obtain the optimal number of decomposition layers comprises:
calculating the root mean square error of a subsequence corresponding to each layer of wavelet decomposition of the urban water consumption time series data;
and respectively and sequentially comparing the root mean square error of the subsequence corresponding to the wavelet decomposition of the new layer with the root mean square error of the subsequence corresponding to the wavelet decomposition of the previous layer, if the root mean square error corresponding to the wavelet decomposition of the new layer is smaller than the root mean square error corresponding to the wavelet decomposition of the previous layer, continuing to perform layer-by-layer decomposition until the root mean square error corresponding to the wavelet decomposition of the new layer is not smaller than the root mean square error corresponding to the wavelet decomposition of the previous layer, ending the decomposition, and taking the corresponding decomposition layer number as the optimal decomposition layer number when the decomposition is finished.
6. The artificial intelligence based urban water consumption prediction method according to claim 1, wherein the step of performing principal component analysis on a plurality of preset urban water characteristic factors affecting the urban water consumption to obtain a plurality of urban water significant characteristic factors comprises:
performing principal component analysis on the urban water characteristic factors to obtain characteristic contribution rates of combinations of a plurality of different urban water characteristic factors;
and comparing a preset threshold with the characteristic contribution rate, and taking all urban water characteristic factors corresponding to the combination of the urban water characteristic factors of which the characteristic contribution rate is greater than the preset threshold as urban water significant characteristic factors.
7. The artificial intelligence based urban water consumption prediction method according to claim 1, wherein the step of inputting the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training, and predicting the urban water consumption by using the trained neural network prediction model to obtain the urban water consumption prediction data, comprises:
sending the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training;
and predicting the urban water consumption by using the trained neural network prediction model, and taking the average value of all obtained prediction results as urban water consumption prediction data.
8. An artificial intelligence based municipal water usage prediction device, comprising:
the acquisition unit is used for acquiring historical data of daily urban water consumption as urban water consumption basic data;
the screening unit is used for screening abnormal data in the basic data of the urban water consumption and smoothing the abnormal data to obtain optimized data of the urban water consumption;
the decomposition unit is used for sequentially arranging the urban water consumption optimization data according to a time sequence to obtain urban water consumption time sequence data, and decomposing the urban water consumption time sequence data to obtain a plurality of urban water consumption subsequences;
the analysis unit is used for carrying out principal component analysis on a plurality of preset urban water characteristic factors influencing the urban water consumption to obtain a plurality of urban water significant characteristic factors;
and the prediction unit is used for sending the urban water consumption subsequences and the urban water significant characteristic factors into a neural network prediction model for training, and predicting the urban water consumption by using the trained neural network prediction model to obtain a prediction result.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing computer readable instructions; and
a processor executing computer readable instructions stored in the memory to implement the artificial intelligence based municipal water usage prediction method of any of claims 1 to 7.
10. A computer readable storage medium having computer readable instructions stored thereon which, when executed by a processor, implement the artificial intelligence based municipal water usage prediction method of any one of claims 1 to 7.
CN202210425688.2A 2022-04-21 2022-04-21 Artificial intelligence-based urban water consumption prediction method, device, equipment and medium Pending CN114862618A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210425688.2A CN114862618A (en) 2022-04-21 2022-04-21 Artificial intelligence-based urban water consumption prediction method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210425688.2A CN114862618A (en) 2022-04-21 2022-04-21 Artificial intelligence-based urban water consumption prediction method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN114862618A true CN114862618A (en) 2022-08-05

Family

ID=82634012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210425688.2A Pending CN114862618A (en) 2022-04-21 2022-04-21 Artificial intelligence-based urban water consumption prediction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114862618A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117609812A (en) * 2023-12-07 2024-02-27 中科报业智慧研究中心(深圳)有限公司 Smart city information interaction method and system based on frequency division control

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107977735A (en) * 2017-11-15 2018-05-01 河海大学 A kind of municipal daily water consumption Forecasting Methodology based on deep learning
CN111597971A (en) * 2020-05-14 2020-08-28 北京交通大学 Method for predicting short-term arrival passenger flow of urban rail transit
CN111626518A (en) * 2020-05-29 2020-09-04 上海交通大学 Urban daily water demand online prediction method based on deep learning neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
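刘志壮: "Short-term urban water consumption prediction based on a wavelet combination model", 《计算机技术》, vol. 46, no. 10, 10 October 2020 (2020-10-10), pages 110 - 114 *
鲁凤 et al.: "Research on a genetic wavelet neural network model for water demand prediction in Hefei", 《测绘科学》, vol. 38, no. 5, 22 February 2013 (2013-02-22), pages 28 - 31 *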

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination