CN117370759A - Gas consumption prediction method based on artificial intelligence - Google Patents
Gas consumption prediction method based on artificial intelligence
- Publication number
- CN117370759A (application number CN202311431632.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- prediction
- seasonal
- gas consumption
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06315—Needs-based resource requirements planning or analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention relates to the technical field of gas management, in particular to a gas consumption prediction method based on artificial intelligence. The method comprises five steps: data collection, data preprocessing, prediction model construction and training, model evaluation, and result prediction. First, historical gas consumption data is collected through a data acquisition system. The collected data is then preprocessed with a data processing tool, which includes removing redundant data and handling incomplete data, to generate a data file for use in subsequent steps. Next, based on the ARIMA algorithm, separate prediction models are built and trained for heating users and non-heating users. In the model evaluation step, the models are evaluated against the evaluation indexes to verify their prediction performance. Finally, the models are integrated and the user's gas consumption is predicted. The invention aims to provide a method capable of accurately predicting the future gas consumption of a user, so as to meet the requirements of modern society for fine-grained energy management.
Description
Technical Field
The invention relates to the technical field of gas management, in particular to a gas consumption prediction method based on artificial intelligence.
Background
In modern society, fuel gas plays an important role in daily life and industrial production. With urban development and the progress of industrialization, the demand for fuel gas is gradually increasing. However, gas supply and demand are often affected by many factors, such as seasonal variations, climatic conditions, price fluctuations, and consumer behavior. Accurate prediction of gas consumption is therefore of great importance to gas suppliers: it can help them formulate more effective supply strategies, reduce operating costs, and improve service quality.
Currently, gas consumption prediction mainly depends on statistical methods and empirical judgment, such as the historical average method, the moving average method, and exponential smoothing. However, these traditional methods often fail to reflect the complexity and dynamics of gas consumption. For example, gas consumption often shows obvious seasonal and trend changes, and conventional methods often cannot fully capture these patterns. In addition, gas consumption may be affected by a number of unpredictable factors, such as sudden climate change, price fluctuations, and policy adjustments.
In general, existing fuel gas consumption prediction techniques have certain limitations, and often cannot meet increasing prediction requirements. In particular, current prediction techniques often fail to accurately predict future fuel gas usage, which may lead to oversupply or undersupply for fuel gas suppliers, thereby increasing operating costs and reducing quality of service. Therefore, it is necessary to develop a new technique capable of accurately predicting the amount of fuel gas.
Disclosure of Invention
The invention aims to provide an artificial intelligence-based fuel gas consumption prediction method, which can accurately predict the future fuel gas consumption of a user.
The basic scheme provided by the invention is as follows: the artificial intelligence-based gas consumption prediction method comprises a data collection step, a data preprocessing step, a prediction model construction and training step, a model evaluation step, and a result prediction step. The data collection step collects historical gas consumption data through a data collection system. The data preprocessing step uses a data processing tool to preprocess the collected data, where the preprocessing includes removing redundant data and handling incomplete data so as to generate a data file for the subsequent steps. In the prediction model construction and training step, prediction models are built separately for heating users and non-heating users based on the ARIMA algorithm, and the models are trained. The model evaluation step evaluates the models according to the evaluation indexes. The result prediction step integrates the models and predicts the user's gas consumption.
The implementation principle and beneficial effects of the invention are as follows:
The gas consumption prediction method based on artificial intelligence provided by the invention relies on five key steps. First, the data acquisition system collects historical gas consumption data, which provides the raw input for the subsequent steps. The data processing tool then preprocesses the data, including removing redundant data and handling incomplete data, to generate a data file for use in subsequent steps, ensuring the quality and validity of the data. Next, based on the ARIMA algorithm, prediction models are built and trained separately for heating users and non-heating users, so that the models can learn the gas consumption patterns of each type of user. In the model evaluation step, the models are evaluated according to the evaluation indexes to verify their prediction performance. Finally, the models are integrated and the user's gas consumption is predicted, which supports the stability and accuracy of the prediction result.
The beneficial effects of the invention are mainly as follows. First, data preprocessing ensures data quality and thereby improves prediction accuracy. Second, building different ARIMA-based prediction models for different types of users makes the prediction more fine-grained and meets the needs of each type of user. Third, model evaluation allows the model to be continuously optimized and its prediction performance improved. Finally, model integration provides a more stable and accurate prediction result, enabling refined energy management and scheduling and meeting the requirements of modern society for energy management.
Further, the data collection step collects historical data through a Hadoop platform.
The beneficial effects of this scheme are as follows. First, Hadoop is an efficient big data processing tool that can process and store massive amounts of data, so more comprehensive and richer historical gas consumption data can be collected, which improves prediction accuracy. Second, the Hadoop platform provides distributed processing capability and can process data on multiple computers in parallel, which greatly improves data processing efficiency and shortens data collection and preprocessing time. Furthermore, Hadoop is fault-tolerant and scalable, and can maintain the stability and safety of data even when processing large-scale data. Using the Hadoop platform to collect historical data can therefore improve the accuracy and efficiency of prediction while maintaining the stability and safety of the data, meeting the requirements of modern society for big data processing.
Further, in the data preprocessing step, the collected data is preprocessed using the Hive tool; the data preprocessing further comprises adjusting meter reading dates, supplementing the gas consumption of missing months, and screening data according to specific rules;
wherein adjusting the meter reading date includes: if the actual meter reading time in the meter reading record falls on the 1st to the 20th of a month, the reading is counted as data of the current month; if the actual meter reading time falls after the 20th of a month, the reading is counted as data of the next month;
supplementing the gas consumption of missing months includes: filling in the missing monthly gas consumption with an average value;
the rules of data screening include: retaining data with gas consumption of 0 in the sample data of the most recent 1 month; retaining data with non-zero gas consumption in 18 or more samples; in addition, if the user number corresponds to a non-heating user in the sample data, retaining data whose coefficient of variation of gas consumption is less than 1; if the user number corresponds to a heating user in the sample data, retaining data whose coefficient of variation of gas consumption is less than 2.5.
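The date adjustment and mean-filling rules above can be sketched in a few lines of Python. This is a minimal illustration, not the patented Hive implementation: the function names `billing_month` and `fill_missing_with_mean` are hypothetical, and the sketch assumes that "an average value" means the mean of the same user's observed months, which the text does not specify.

```python
from datetime import date
from statistics import mean


def billing_month(reading):
    """Map a meter-reading date to the (year, month) it is counted toward.

    Readings on the 1st to the 20th count toward the current month;
    readings after the 20th count toward the next month.
    """
    if reading.day <= 20:
        return (reading.year, reading.month)
    if reading.month == 12:
        # A late-December reading rolls over into January of the next year.
        return (reading.year + 1, 1)
    return (reading.year, reading.month + 1)


def fill_missing_with_mean(monthly):
    """Replace None entries with the mean of the observed months.

    Assumption: the average is taken over the same user's observed months.
    """
    observed = [v for v in monthly if v is not None]
    if not observed:
        return list(monthly)
    avg = mean(observed)
    return [avg if v is None else v for v in monthly]
```

For example, a reading taken on 21 March would be assigned to April, and a three-month series with a missing middle value would have that gap filled with the mean of the two observed months.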
The beneficial effects of this scheme are as follows: in the data preprocessing step, the Hive tool is used to preprocess the collected data, and the preprocessing further includes adjusting the meter reading date, supplementing the gas consumption of missing months, and screening the data according to specific rules. These improvements bring significant benefits.
First, preprocessing with Hive allows large-scale data to be processed efficiently in a distributed environment, improving the speed and efficiency of data processing. Second, adjusting the meter reading date standardizes the actual meter reading times, which makes the data more consistent and comparable and improves the accuracy of the prediction model. Moreover, filling in the gas consumption of missing months with an average value closes gaps in the data, preserves data integrity, and improves the prediction capability of the model.
In addition, the data screening rules, which retain data according to the gas consumption in the sample data of the most recent 1 month, retain data with non-zero gas consumption in 18 or more samples, and retain data whose coefficient of variation of gas consumption lies within a range set by user type, effectively remove abnormal and invalid data and improve data quality, thereby improving the stability and accuracy of the prediction model.
Therefore, the scheme not only improves the efficiency of data processing and the quality of data, but also improves the accuracy and stability of the prediction model, and is beneficial to realizing more accurate gas consumption prediction.
Further, the evaluation indexes in the model evaluation step include the mean absolute percentage error, the coefficient of determination, the mean absolute error, and the mean square error, and the prediction performance of the model is evaluated based on these indexes.
The beneficial effects of this scheme are as follows: in the model evaluation step, the invention adopts the mean absolute percentage error (MAPE), the coefficient of determination (R²), the mean absolute error (MAE), and the mean square error (MSE) as evaluation indexes, which together give a comprehensive evaluation of the model's prediction performance. First, MAPE measures the average relative error between predicted and actual values; it is a commonly used accuracy index and gives an intuitive view of the model's prediction accuracy. Second, R², the coefficient of determination, indicates how well the model's predicted values agree with the actual values; in general, the closer R² is to 1, the better the model's prediction performance. Furthermore, MAE and MSE are both common measures of prediction error: MAE reflects the average absolute deviation of the predicted values from the actual values, while MSE, because it squares each error, gives more weight to large deviations. With these four evaluation indexes, the prediction performance of the model, including prediction accuracy and prediction deviation, can be understood comprehensively, deficiencies of the model can be identified, optimization and improvement of the model can be guided, and the accuracy of gas consumption prediction can be improved.
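The four evaluation indexes can be written out directly from their standard definitions. The following is a minimal sketch in plain Python; the function names are illustrative, and the MAPE version assumes that no actual value is zero, since the relative error is undefined in that case.

```python
def mae(actual, predicted):
    """Mean absolute error: average size of the prediction error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error: squaring gives large errors more weight."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: every actual value is non-zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus residual over total variation."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Because gas consumption can be zero in some months, MAPE may need special handling for such months in practice; the screening rules in the preprocessing step reduce but do not necessarily eliminate this case.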
Further, the model training step includes: step one: loading the collected data and carrying out descriptive statistical analysis; step two: performing a white noise test on the data, and terminating subsequent analysis if the data is white noise; step three: selecting a preprocessing mode according to whether the data is seasonal, performing factor decomposition if it is, and proceeding directly to the next step if it is not; step four: preprocessing the data and judging whether it is stationary, and if not, applying differencing until it is stationary; step five: calculating the autocorrelation and partial autocorrelation of the data and determining the ARIMA model parameters; step six: establishing an ARIMA model according to the determined model parameters and training the model; step seven: diagnosing the model by checking whether the model residuals are white noise, and if they are not, returning to step five and re-determining the model parameters.
The beneficial effects of this scheme are as follows. First, descriptive statistical analysis and the white noise test give an initial understanding of the nature and quality of the data and identify uninformative white noise data early, which avoids unnecessary calculation and analysis and improves model training efficiency. Second, selecting the preprocessing mode according to whether the data is seasonal is particularly important for gas consumption data, which is often seasonal; seasonal data can then be processed and analyzed effectively, improving the prediction accuracy of the model. Furthermore, checking stationarity and applying differencing until the data is stationary helps ensure the quality and consistency of the data and provides a sound basis for subsequent model training. Next, determining the ARIMA model parameters from the autocorrelation and partial autocorrelation of the data is the key step of ARIMA modelling, so that the model accurately reflects the characteristics of the data and prediction accuracy improves. Finally, diagnosing the model and testing its residuals for white noise helps confirm that the model is valid and accurate; if the residuals are not white noise, the model parameters are re-determined, so that the model is optimized and its stability and prediction performance improve.
Further, the model parameters include: number of prediction periods: determines how many future time periods are predicted; autoregressive order p: defines the number of past time steps whose values are used for prediction; differencing order d: the number of times the data sequence must be differenced to become stationary; moving average order q: defines the number of past error terms considered in the model; seasonal autoregressive order P: defines the order of the seasonal component of the AR part; seasonal differencing order D: the number of seasonal differences required to make the seasonal data sequence stationary; seasonal moving average order Q: defines the order of the seasonal component of the MA part; number of time steps in a single seasonal period S: defines the length of the seasonal period.
Wherein the number of prediction cycles is 4; the autoregressive term p is 3; the difference times d is 1; the number q of the moving average terms is 2; the seasonal autoregressive order P is 0; the seasonal differential order D is 1; the seasonal moving average order Q is 0; the number of time steps during the single season S is 4.
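For reference, the orders listed above can be written out as a single model equation. This assumes the conventional backshift-operator form of a seasonal ARIMA model, which the text itself does not state; here $B$ is the backshift operator ($B y_t = y_{t-1}$), $y_t$ is the gas consumption in period $t$, $\varepsilon_t$ is the white noise error term, and $\phi_i$, $\theta_j$ are the coefficients to be estimated. With p = 3, d = 1, q = 2, P = 0, D = 1, Q = 0 and S = 4, the seasonal AR and MA polynomials reduce to 1, giving:

```latex
\left(1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3\right)(1 - B)\left(1 - B^4\right) y_t
  = \left(1 + \theta_1 B + \theta_2 B^2\right)\varepsilon_t
```

The factor $(1 - B)$ is the single ordinary difference (d = 1) and the factor $(1 - B^4)$ is the single seasonal difference at period 4 (D = 1, S = 4).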
The beneficial effect of this scheme is: firstly, clear parameter definition and setting are realized, so that the model construction is clearer and more visual, the prediction result of the model is helpful to be understood and interpreted, and the interpretability of the model is improved. Second, the setting of parameters reflects deep understanding and consideration of data characteristics, such as stabilizing the data by differential operations, capturing autocorrelation and trend changes of the data by autoregressive and moving average models, and processing seasonal influences by seasonal parameters, which all contribute to improving the prediction accuracy of the models. Furthermore, the number of prediction cycles is 4, which means that the model can predict the fuel gas consumption for four time periods (e.g., four months) in the future, which is helpful for longer-term planning and decision-making. Finally, through fine setting of model parameters, various complex and variable data conditions can be better adapted and processed, and the adaptability of the model is improved.
Drawings
FIG. 1 is a flow chart of an artificial intelligence based fuel gas usage prediction method in the present invention.
Detailed Description
The following is a further detailed description of the embodiments:
Example 1
The artificial intelligence-based gas consumption prediction method shown in FIG. 1 comprises a data collection step, a data preprocessing step, a prediction model construction and training step, a model evaluation step, and a result prediction step.
Assume that a large urban gas company wishes to predict gas usage through artificial intelligence techniques. The company serves tens of thousands of household and business users, covering heating users and non-heating users.
The data collection step collects historical gas consumption data through a data collection system. In this embodiment, historical data is collected through the Hadoop platform, and the meter reading records of each user over the past 36 months are extracted; each record comprises the user ID, the meter reading date, and the corresponding gas consumption.
The data preprocessing step uses a data processing tool to preprocess the collected data; the preprocessing includes removing redundant data and handling incomplete data to generate a data file for use in subsequent steps. In this embodiment, the data preprocessing step uses the Hive tool, a Hadoop-based data warehouse tool that can be used to store, query, and analyze data. The data preprocessing further comprises adjusting meter reading dates, supplementing the gas consumption of missing months, and screening data according to specific rules. Adjusting the meter reading date includes: if the actual meter reading time in the meter reading record falls on the 1st to the 20th of a month, the reading is counted as data of the current month; if the actual meter reading time falls after the 20th of a month, the reading is counted as data of the next month. Supplementing the gas consumption of missing months includes: filling in the missing monthly gas consumption with an average value. The rules of data screening include: retaining data with gas consumption of 0 in the sample data of the most recent 1 month; retaining data with non-zero gas consumption in 18 or more samples; in addition, if the user number corresponds to a non-heating user in the sample data, retaining data whose coefficient of variation of gas consumption is less than 1; if the user number corresponds to a heating user in the sample data, retaining data whose coefficient of variation of gas consumption is less than 2.5.
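The screening thresholds in this embodiment can be illustrated with a short sketch. It is an assumption-laden illustration rather than the embodiment's Hive logic: `keep_user` and the user type labels are hypothetical names, the coefficient of variation is taken as the population standard deviation divided by the mean (the text does not say which standard deviation is meant), and the rule concerning the most recent month is left out of the sketch.

```python
from statistics import mean, pstdev

# Thresholds stated in the text: CV < 1 for non-heating, CV < 2.5 for heating.
CV_LIMIT = {"non_heating": 1.0, "heating": 2.5}
MIN_NONZERO_MONTHS = 18


def coefficient_of_variation(values):
    """Standard deviation divided by mean (population form, by assumption)."""
    m = mean(values)
    if m == 0:
        return float("inf")
    return pstdev(values) / m


def keep_user(monthly, user_type):
    """Return True if a user's monthly series passes the screening rules."""
    nonzero_months = sum(1 for v in monthly if v != 0)
    if nonzero_months < MIN_NONZERO_MONTHS:
        return False
    return coefficient_of_variation(monthly) < CV_LIMIT[user_type]
```

The looser limit for heating users reflects the fact that their consumption swings strongly between the heating season and the rest of the year, so a high coefficient of variation is expected rather than abnormal.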
In the prediction model construction and training step, prediction models are built separately for heating users and non-heating users based on the ARIMA algorithm, and the models are trained. ARIMA models are often used to predict time series data, including seasonal and non-seasonal data.
In this embodiment, the model training step includes:
Step one: load the collected data and carry out descriptive statistical analysis, including calculating statistics such as the mean, median, quantiles, and standard deviation, to understand the distribution and variation of the data. At the same time, generate a time series plot, an autocorrelation function (ACF) plot, and a partial autocorrelation function (PACF) plot of the data to make a preliminary judgment of whether the data is autocorrelated.
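As an illustration of what the ACF plot in step one displays, the sample autocorrelation at each lag can be computed as follows. This is a minimal sketch using the usual sample estimator; the function names are illustrative, and the ±1.96/√n band is the common large-sample approximation for judging whether a coefficient differs from zero at roughly the 95% level.

```python
import math


def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    coefficients = []
    for k in range(1, max_lag + 1):
        # Covariance between the series and itself shifted by k periods.
        num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
        coefficients.append(num / denom)
    return coefficients


def significance_band(n):
    """Approximate 95% band: coefficients inside +/- this value look like noise."""
    return 1.96 / math.sqrt(n)
```

With the 36 monthly observations used in this embodiment, the band is about ±0.33, so only coefficients larger than that in absolute value would be read as evidence of autocorrelation.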
Step two: test the data for white noise, that is, check whether the individual observations are independent of each other and identically distributed. The Ljung-Box test may be used for this check. If the data is white noise, the subsequent analysis is terminated.
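The Ljung-Box statistic mentioned in step two can be computed by hand from the sample autocorrelations. The sketch below uses the standard formula Q = n(n+2) Σ r_k²/(n−k) and compares it with tabulated 95% chi-square critical values; the function names and the default of 5 lags are assumptions made for illustration, not values taken from the text.

```python
# 95% chi-square critical values for 1..5 degrees of freedom (standard table).
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def ljung_box_q(series, lags):
    """Ljung-Box Q statistic over the first `lags` autocorrelations."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    q = 0.0
    for k in range(1, lags + 1):
        r_k = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q


def looks_like_white_noise(series, lags=5):
    """True if the white noise hypothesis is not rejected at the 5% level.

    For raw data the degrees of freedom equal the number of lags tested.
    """
    return ljung_box_q(series, lags) < CHI2_95[lags]
```

A steadily rising series such as 1, 2, ..., 20 is rejected as white noise by this check, which is the case in which the analysis continues to step three.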
Step three: select a preprocessing mode according to whether the data is seasonal; if it is, perform factor decomposition, and if it is not, proceed directly to the next step. Seasonality refers to a pattern that recurs in the data during a particular period of each year. For example, gas consumption may increase in winter because people use gas for heating. Whether the data is seasonal can be judged preliminarily by inspecting the time series plot.
Step four: preprocess the data and judge whether it is stationary; if it is not, apply differencing until it is. Before establishing the ARIMA model, it is necessary to check whether the data is stationary. Stationarity means that the mean and variance of the data remain unchanged over time. Whether the data is stationary can be judged preliminarily by inspecting the time series plot. If the data is not stationary, differencing is required to make it stationary. Differencing means computing the difference between adjacent periods of the data. First-order differencing (the difference between adjacent periods) is applied first; if the data is still not stationary, second-order differencing (the difference between first-order differences) is applied, and so on until the data is stationary.
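The differencing operation described in step four is simple enough to show directly. The following sketch uses a single hypothetical helper, `difference`, whose `lag` argument also covers seasonal differencing (subtracting the value one seasonal period earlier); the sample numbers are made up for illustration.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist.

    lag=1 gives ordinary differencing; lag=S gives seasonal differencing.
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# First-order differencing removes a linear trend; applying it twice
# gives second-order differencing.
quadratic = [1, 4, 9, 16, 25]
first = difference(quadratic)   # differences between adjacent periods
second = difference(first)      # differences between first-order differences

# Seasonal differencing with a period of 4 removes a repeating pattern.
seasonal = [10, 20, 30, 40, 12, 22, 32, 42]
deseasonalized = difference(seasonal, lag=4)
```

Each round of differencing shortens the series by `lag` observations, which matters when only 36 months of history are available.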
Step five: calculate the autocorrelation and partial autocorrelation of the data, and determine the ARIMA model parameters.
Step six: establish an ARIMA model according to the determined model parameters, and train the model.
In this embodiment, the ARIMA model parameters include:
number of prediction cycles: here set to 4, which means that this model will be used to predict data for 4 time periods in the future.
Autoregressive order p: here set to 3, which means that the data values of the previous 3 time periods are used in the model as predictors. This parameter is typically chosen from the partial autocorrelation function (PACF) plot; in this example the partial autocorrelation coefficients are significantly non-zero at lag 3 and cut off after lag 3, so p=3 is chosen.
Differencing order d: here set to 1, indicating that the data is differenced once to remove the trend and make the sequence stationary. Stationarity is an important precondition for applying the SARIMA model.
Moving average order q: here set to 2, indicating that the prediction errors of the previous 2 time periods are used in the model as predictors. This parameter is typically chosen from the autocorrelation function (ACF) plot; in this example the ACF tails off, so q=2 is selected.
Seasonal autoregressive order P: here set to 0, indicating that no seasonal autoregressive term is used. In an additive seasonal model, P=0 is often set.
Seasonal differencing order D: here set to 1, indicating that the data is differenced once at the seasonal period to eliminate seasonality and make the sequence stationary.
Seasonal moving average order Q: here set to 0, indicating that no seasonal moving average term is used. In an additive seasonal model, Q=0 is often set.
Number of time steps in a single seasonal period S: here set to 4, meaning that each seasonal period contains 4 time steps. This is the length of the seasonal period; for example, if one seasonal period is one year and the data is quarterly, S=4; if the data is monthly, S=12.
The choice of parameters for ARIMA and SARIMA models is an important step, as these parameters directly affect the predictive performance of the model. The selection of parameters typically requires a combination of statistical theory and actual data.
In determining the ARIMA model parameters (p, d, q), reference is typically made to the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot. The ACF plot helps determine the MA order q, and the PACF plot helps determine the AR order p. The differencing order d must also be determined so that the differenced data is stationary. In practice, several sets of parameters are usually tried and the best combination is then selected. The best combination can be determined by comparing the AIC (Akaike information criterion) or BIC (Bayesian information criterion) of the candidate models; a smaller AIC or BIC generally indicates a better model.
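The comparison by AIC or BIC can be sketched as follows, using the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood. The candidate orders, log-likelihood values, and parameter counts below are made-up numbers for illustration only; in practice they would come from fitting each candidate model.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: penalizes each parameter by 2."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: penalty grows with sample size n."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical candidates: order label -> (log-likelihood, parameter count).
candidates = {
    "(1,1,1)": (-120.0, 3),
    "(3,1,2)": (-110.0, 6),
}


def best_order(candidates, n, criterion="aic"):
    """Return the label of the candidate with the smallest criterion value."""
    def score(label):
        log_l, k = candidates[label]
        return aic(log_l, k) if criterion == "aic" else bic(log_l, k, n)
    return min(candidates, key=score)


chosen = best_order(candidates, n=36)
```

BIC penalizes extra parameters more heavily than AIC once n exceeds about 7 observations, so the two criteria can disagree on short series; which one to prefer is a modelling choice the text leaves open.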
When using the SARIMA model, the seasonal parameters (P, D, Q, S) must be determined in addition to the parameters (p, d, q) of the ARIMA model. The selection of these parameters also combines statistical theory with the actual data. The seasonal parameters (P, D, Q) can be chosen by reference to the seasonal ACF and seasonal PACF plots, while S generally depends on the seasonal period of the data.
With these parameters, an ARIMA or SARIMA model can be built and trained using historical data. During training, the convergence of the model needs to be monitored, and hyperparameters such as the learning rate are adjusted as required. After model training is completed, the latest data can be used for verification to check the predictive capability of the model.
The ARIMA model is used to process non-seasonal data, while the SARIMA model is an extension of the ARIMA model that handles seasonal data. The energy demand patterns of heating users and non-heating users may differ greatly, particularly in their seasonal behavior. For example, the energy demand of heating users may increase significantly during winter, while the energy demand of non-heating users may show no such seasonal pattern. For heating users, it may therefore be desirable to use the SARIMA model to handle the seasonality of the data, whereas for non-heating users the ARIMA model may be used if there is no significant seasonality.
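One simple way to decide whether a series shows significant seasonality is to test the sample autocorrelation at the seasonal lag. The sketch below is an assumed heuristic, not a rule stated in the patent: the function names, the approximate 95% band of 1.96/sqrt(n), and the sample data are all illustrative, and the check is best applied after any trend has been removed.

```python
import math

def autocorr_at_lag(series, lag):
    """Sample autocorrelation of the series at a single lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return sum(dev[t] * dev[t - lag] for t in range(lag, n)) / c0

def pick_model(series, season_length=4):
    """Return "SARIMA" if the seasonal-lag autocorrelation exceeds the band."""
    threshold = 1.96 / math.sqrt(len(series))  # approximate 95% band
    if autocorr_at_lag(series, season_length) > threshold:
        return "SARIMA"
    return "ARIMA"

# Hypothetical heating user: the same quarterly pattern repeats every year.
heating = [10, 2, 3, 8] * 6
# Hypothetical series with no period-4 pattern.
non_heating = [1, 0, -1] * 8

print(pick_model(heating))      # SARIMA
print(pick_model(non_heating))  # ARIMA
```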
Step seven: diagnose the model by checking whether its residuals are white noise; if they are not, return to step five and re-determine the model parameters.
White noise is an ideal random signal that satisfies several conditions: its expected value is 0, i.e. all its values fluctuate around 0; its variance is constant, i.e. the fluctuation amplitude of all its values does not change with time; any two of its values are uncorrelated, i.e. its current value cannot be predicted with past values.
If the residuals of the model are white noise, the model can be considered to have captured all the information in the data, leaving only random noise that cannot be predicted any further. If the residuals are not white noise, the model may not have fully captured the information in the data, and the model parameters need to be re-determined. This is an iterative optimization process: the model is not considered suitable until its residuals are white noise.
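A common way to check residuals for white noise is the Ljung-Box test; the patent does not name a specific test, so its use here is an assumption. The sketch computes the statistic Q = n(n+2) Σ ρ_k² / (n−k) over lags 1..h and compares it with a hard-coded 95% chi-square critical value for 5 degrees of freedom. For residuals of a fitted ARMA(p, q) model the degrees of freedom are usually reduced to h − p − q, which this simplified example does not do.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

# 95% critical value of the chi-square distribution with 5 degrees of freedom.
CHI2_95_DF5 = 11.070

def looks_like_white_noise(residuals, max_lag=5, critical_value=CHI2_95_DF5):
    """True if the test does not reject the white-noise hypothesis."""
    return ljung_box_q(residuals, max_lag) < critical_value

# Hypothetical residuals that still contain a trend: clearly not white noise.
trending = list(range(50))
print(looks_like_white_noise(trending))  # False
```

A result of False corresponds to the case in step seven where the procedure returns to step five and the model parameters are re-determined.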
The model evaluation step evaluates the model according to the evaluation indexes. In this embodiment, the evaluation indexes include the mean absolute percentage error (MAPE), the coefficient of determination (R2 score), the mean absolute error (MAE), and the mean square error (MSE), and the prediction performance of the model is evaluated against them. MAPE is a commonly used measure of prediction accuracy: it is the average of the percentage errors between the predicted and actual values. In this embodiment, a MAPE below 50% indicates that the relative error of the model predictions is within acceptable limits. The R2 score represents the proportion of the variability in the predicted variable that the model explains. In this embodiment, a value of 0.5 or greater indicates that the model accounts for at least 50% of the variability. MAE and MSE are the averages of the absolute errors and of the squared errors between the predicted and actual values; in this embodiment, the smaller these two values are, the better.
If the model evaluation meets the above criteria, training is complete. If it does not, the extraction and preprocessing of the meter reading data are carried out again, and the model is retrained.
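The four evaluation indexes and the two thresholds of this embodiment (MAPE below 50%, R2 of at least 0.5) can be sketched as below. The function names and sample values are hypothetical. MAPE is undefined when an actual value is 0, so the sketch assumes all actual values are non-zero.

```python
def evaluate(actual, predicted):
    """Return MAPE (in percent), R2, MAE and MSE for two equal-length lists."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # Assumes no actual value is zero.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - sum(e * e for e in errors) / ss_tot
    return {"mape": mape, "r2": r2, "mae": mae, "mse": mse}

def passes(metrics, max_mape=50.0, min_r2=0.5):
    """Apply the acceptance thresholds stated in this embodiment."""
    return metrics["mape"] < max_mape and metrics["r2"] >= min_r2

# Hypothetical actual and predicted gas usage.
metrics = evaluate([100, 200, 300, 400], [110, 190, 310, 390])
print(metrics)
print(passes(metrics))  # True
```

For this sample the MAE is 10, the MSE is 100, the MAPE is about 5.2%, and R2 is 0.992, so the model would pass both thresholds.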
The result prediction step includes integrating the model and predicting the fuel gas consumption of the user. After model training and evaluation are completed, the model can be used to predict fuel gas usage: new data is fed into the model, which outputs the predicted fuel gas usage. To improve the accuracy of the prediction, model integration (ensembling) may be considered. For example, multiple ARIMA models may be trained, each using slightly different parameters or training data, and their predictions then combined to obtain the final prediction. Finally, the predicted result is fed back to the gas company. These predictions can help the company learn about likely changes in gas usage in advance and so manage and allocate resources better. For example, if the prediction shows that fuel gas usage may increase significantly over the next several months, the company may increase the fuel gas supply in advance to avoid a shortage.
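The patent does not specify how the predictions of several models are combined. One simple possibility, shown here purely as an assumption, is a (weighted) average per forecast period; the function name and all numbers are hypothetical.

```python
def combine_forecasts(forecasts, weights=None):
    """Average several forecasts period by period.

    forecasts: list of equal-length lists, one per model.
    weights: optional list of non-negative weights, one per model.
    """
    if weights is None:
        weights = [1.0] * len(forecasts)
    total = sum(weights)
    horizon = len(forecasts[0])
    return [sum(w * f[h] for w, f in zip(weights, forecasts)) / total
            for h in range(horizon)]

# Hypothetical 2-period forecasts from three ARIMA variants.
model_outputs = [[10, 20], [14, 24], [12, 22]]
print(combine_forecasts(model_outputs))  # [12.0, 22.0]
```

Weights could, for instance, favor models with a lower validation error, but the choice of weighting scheme is left open here.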
The foregoing is merely an embodiment of the present invention. Specific structures, features, and other common knowledge well known in the art are not described in detail here. A person of ordinary skill in the art knows all the common technical knowledge in the technical field of the invention before the application date or the priority date, can access all the prior art in that field, and is able to apply the routine experimental means available before that date. Guided by the teachings of this application, such a person can combine them with their own abilities to perfect and practice this solution, and some typical well-known structures or methods should not become an obstacle to practicing this application. It should be noted that those skilled in the art can make modifications and improvements without departing from the structure of the present invention; these should also be regarded as falling within the protection scope of the present invention, and they affect neither the effect of implementing the invention nor the utility of the patent. The protection scope of the present application shall be determined by the content of the claims, and the specific embodiments and other descriptions in the specification may be used to interpret the content of the claims.
Claims (7)
1. An artificial intelligence-based fuel gas consumption prediction method, characterized by comprising: a data collection step, a data preprocessing step, a prediction model construction and training step, a model evaluation step, and a result prediction step;
the data collection step is used for collecting historical gas consumption data through a data collection system;
the data preprocessing step uses a data processing tool to preprocess the acquired data, wherein the preprocessing comprises removing redundancy of the data and processing the incompleteness of the data so as to generate a data file for the subsequent step;
in the step of constructing and training the prediction model, the prediction model is respectively built for heating users and non-heating users based on an ARIMA algorithm, and the model is trained;
the model evaluation step evaluates the model according to the evaluation index;
the result prediction step includes integrating the model and predicting the fuel gas consumption of the user.
2. The artificial intelligence based fuel gas usage prediction method according to claim 1, wherein: the data collection step collects historical data through a Hadoop platform.
3. The artificial intelligence based fuel gas usage prediction method according to claim 1, wherein: in the data preprocessing step, the acquired data is preprocessed using the Hive tool; the data preprocessing further comprises adjusting meter reading dates, supplementing the gas consumption of missing months, and screening data according to specific rules;
wherein adjusting the meter reading date includes: if the actual meter reading time in the meter reading record falls between the 1st and the 20th of a month, the reading is counted as data of the current month; if the actual meter reading time falls after the 20th of a month, the reading is counted as data of the next month;
supplementing the gas consumption of missing months includes: supplementing the missing monthly gas consumption with the average value;
the rules of data screening include: retaining data with gas consumption of 0 in the sample data of the last 1 month; retaining data with non-zero gas consumption in 18 or more sample data; in addition: if the user number corresponds to a non-heating user in the sample data, retaining the data whose coefficient of variation of gas consumption is smaller than 1; if the user number corresponds to a heating user in the sample data, retaining the data whose coefficient of variation of gas consumption is smaller than 2.5.
4. The artificial intelligence based fuel gas usage prediction method according to claim 1, wherein: the evaluation indexes in the model evaluation step comprise average absolute percentage error, decision coefficient, average absolute error and mean square error, and the prediction performance of the model is evaluated according to the indexes.
5. The artificial intelligence based fuel gas usage prediction method according to claim 1, wherein: the model training step comprises the following steps:
step one: loading the collected data and carrying out descriptive statistical analysis;
step two: performing white noise test on the data, and if the data is white noise, terminating subsequent analysis;
step three: selecting different preprocessing modes according to whether the data has seasonality; if so, performing factor decomposition, and if not, proceeding directly to the next step;
step four: preprocessing the data and judging whether the data is stationary; if not, performing a differencing operation until the data is stationary;
step five: calculating the autocorrelation and partial autocorrelation of the data, and determining the ARIMA model parameters;
step six: establishing an ARIMA model according to the determined model parameters, and training the model;
step seven: diagnosing the model by checking whether its residuals are white noise; if they are not, returning to step five and re-determining the model parameters.
6. The artificial intelligence based fuel gas usage prediction method according to claim 5, wherein: the model parameters include: number of prediction cycles: for determining a predicted number of future time periods; autoregressive term p: defining a time step number for prediction based on a historical value of the data; number of differences d: for determining the number of differencing required to smooth the data sequence; moving average term number q: for defining the number of error terms considered in the model; seasonal autoregressive order P: defining an order of seasonal components in the AR model; seasonal differential order D: for determining the number of differencing required to smooth the seasonal data sequence; seasonal moving average order Q: defining an order of seasonal components in the MA model; number of time steps during a single season S: for defining the length of the seasonal period.
7. The artificial intelligence based fuel gas usage prediction method according to claim 6, wherein: the predicted cycle number is 4; the autoregressive term p is 3; the difference times d is 1; the number q of the moving average terms is 2; the seasonal autoregressive order P is 0; the seasonal differential order D is 1; the seasonal moving average order Q is 0; the number of time steps during the single season S is 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311431632.9A CN117370759A (en) | 2023-10-31 | 2023-10-31 | Gas consumption prediction method based on artificial intelligence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117370759A true CN117370759A (en) | 2024-01-09 |
Family
ID=89400166
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311431632.9A Pending CN117370759A (en) | 2023-10-31 | 2023-10-31 | Gas consumption prediction method based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117370759A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118014145A (en) * | 2024-02-21 | 2024-05-10 | 山东智慧燃气物联网技术有限公司 | Gas consumption prediction method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022135265A1 (en) | Failure warning and analysis method for reservoir dispatching rules under effects of climate change | |
CN105117810A (en) | Residential electricity consumption mid-term load prediction method under multistep electricity price mechanism | |
CN112801388B (en) | Power load prediction method and system based on nonlinear time series algorithm | |
CN108416466A (en) | Methods of electric load forecasting, the computer information processing system of complex characteristics influence | |
CN110852476A (en) | Passenger flow prediction method and device, computer equipment and storage medium | |
CN108415884B (en) | Real-time tracking method for structural modal parameters | |
CN117370759A (en) | Gas consumption prediction method based on artificial intelligence | |
CN114048436A (en) | Construction method and construction device for forecasting enterprise financial data model | |
CN114154716B (en) | Enterprise energy consumption prediction method and device based on graph neural network | |
CN115470962A (en) | LightGBM-based enterprise confidence loss risk prediction model construction method | |
CN114169254A (en) | Abnormal energy consumption diagnosis method and system based on short-term building energy consumption prediction model | |
CN108830405B (en) | Real-time power load prediction system and method based on multi-index dynamic matching | |
CN112633556A (en) | Short-term power load prediction method based on hybrid model | |
CN112417627A (en) | Power distribution network operation reliability analysis method based on four-dimensional index system | |
CN107679666A (en) | A kind of power consumption prediction method based on Shapley values and economic development | |
CN114548494A (en) | Visual cost data prediction intelligent analysis system | |
CN110414776B (en) | Quick response analysis system for power utilization characteristics of different industries | |
CN108665090B (en) | Urban power grid saturation load prediction method based on principal component analysis and Verhulst model | |
CN115511230A (en) | Electric energy substitution potential analysis and prediction method | |
CN108615091A (en) | Electric power meteorology load data prediction technique based on cluster screening and neural network | |
CN111797924B (en) | Three-dimensional garden portrait method and system based on clustering algorithm | |
CN111143774B (en) | Power load prediction method and device based on influence factor multi-state model | |
CN111144682A (en) | Method for mining main influence factors of operation efficiency of power distribution network | |
CN117236532B (en) | Load data-based electricity consumption peak load prediction method and system | |
CN118134295B (en) | Demand response user credit evaluation method, system, storage medium and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||