CN114328444A - Gas data modeling method and system based on dimension analysis and electronic equipment - Google Patents

Publication number
CN114328444A
Authority
CN
China
Prior art keywords
data
analysis
dimension
gas
acquired
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111323216.8A
Other languages
Chinese (zh)
Inventor
韦明
胡移山
刘恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jinshao Intelligent System Co ltd
Original Assignee
Guangzhou Jinshao Intelligent System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jinshao Intelligent System Co ltd
Priority to CN202111323216.8A
Publication of CN114328444A
Legal status: Pending

Landscapes

  • Measuring Volume Flow (AREA)

Abstract

The application relates to the technical field of big data, and in particular to a gas data modeling method and system based on dimension analysis, and an electronic device. The method comprises: performing first-pass data cleaning on the collected data, in which erroneous data is processed and default data is supplemented; performing dimension analysis on the collected data that has undergone the first-pass data cleaning, and storing it in an intermediate database; and modeling the gas data based on the collected data in the intermediate database. The method and system address the technical problems of slow analysis, inflexible analysis dimensions, and low analysis accuracy in existing gas data analysis.

Description

Gas data modeling method and system based on dimension analysis and electronic equipment
Technical Field
The application relates to the technical field of big data, in particular to a gas data modeling method and system based on dimension analysis and electronic equipment.
Background
In the big data era, analyzing data such as user behavior can provide an important reference for a gas company's operations, development direction, and even decision-making, and the large volume of gas consumption data in the gas industry is a naturally rich ore waiting to be mined. By analyzing a specific user's gas consumption habits by day, by time period, and so on, and displaying the results in real time on a large screen, a gas company can serve customers better and discover gas consumption problems. However, traditional data analysis methods cannot meet the requirement for accurate statistics.
The data collected from an industrial user's gas meter is instantaneous: only the flow and the gas consumption at a given time point are collected. The collection frequency may differ from one gas meter to another, and collected data may be lost because it is never uploaded to the server, for example when the meter's collection device fails. Changes in meter readings caused by replacing a gas meter are not corrected. All of this makes traditional analysis of the collected data difficult. In addition, because the volume of collected data is huge, gas data analysis is inefficient.
With the arrival of the big data era, traditional manual data analysis and simple daily gas analysis of a single gas meter can no longer meet enterprises' requirements for gas usage data. More detailed analysis of each gas meter by hour, day, quarter, year, and other dimensions is now on the agenda, as is predicting users' future gas usage behavior from the data. This requires more detailed planning and analysis of the gas usage data, and building a model of the user's gas usage on that basis.
The conventional data analysis of existing gas companies follows no uniform analysis rule: data from a certain period is simply added or subtracted according to user requirements, and the resulting statistics are shown on a large display screen.
In the process of implementing the present application, the inventor finds that at least the following problems exist in the prior art: 1. the data collection quantity is huge, and the analysis speed is slow; 2. the analysis dimension is not clear and flexible enough; 3. the analysis accuracy is low.
Disclosure of Invention
Therefore, the embodiments of the application provide a gas data modeling method and system based on dimension analysis, and an electronic device, which can solve the prior-art technical problems of slow gas data analysis, inflexible analysis dimensions, and low analysis accuracy. The specific technical solution is as follows:
in a first aspect, an embodiment of the present application provides a gas data modeling method based on dimension analysis, the method including: performing first-pass data cleaning on the collected data, processing erroneous data, and supplementing default data;
performing dimension analysis on the collected data that has undergone the first-pass data cleaning, and storing the collected data in an intermediate database;
and modeling the gas data based on the collected data in the intermediate database.
With this technical solution, an intermediate database is established and the collected data is cleaned: invalid data is reduced, erroneous data is removed, and default data is supplemented before the data is stored in the intermediate database. This reduces interference from unnecessary data during modeling, makes the modeling data more accurate, and reduces the volume of raw modeling data. Dividing the processed collected data into dimensions makes the data easier to access, view, and analyze, reduces data confusion, makes related data easier to find when data is retrieved, and improves analysis accuracy.
Preferably, performing the first-pass data cleaning on the collected data, processing erroneous data, and supplementing default data includes:
obtaining the collected data sorted in ascending order of time;
traversing the collected data to remove default data and invalid or erroneous data, supplementing the default data to complete the first-pass data cleaning, and recording the current valid data after the first-pass data cleaning;
removing the jump data from the current valid data.
With this technical solution, because supplemented default data is generated by simulation and calculation, a calculated value may be larger than the actual value and therefore unreasonable; removing the jump data increases, to a certain extent, the reliability of the collected data stored in the intermediate database. Removing the jump data also gives the data in the intermediate database a consistent relationship with time, such as increasing as time increases, which supports better data statistics and modeling analysis.
Preferably, performing the dimension analysis on the collected data that has undergone the first-pass data cleaning includes:
determining the analysis time of the collected data;
determining an analysis dimension of the collected data;
processing the collected data within the analysis time according to the analysis dimension;
and storing the processed collected data in the intermediate database.
With this technical solution, dimension analysis allows user gas data to be retrieved purposefully, which facilitates data analysis, retrieval, and viewing. The gas usage models built from the various analysis dimensions describe the user's gas usage habits to a certain extent and can provide a basis for predicting the user's gas usage behavior and identifying abnormal gas usage.
Preferably, modeling the gas data based on the collected data in the intermediate database includes:
obtaining analysis data of the minimum granularity dimension, and removing invalid data;
obtaining the time span of the model;
obtaining each attribute value of the minimum granularity dimension data of the model over the time span;
storing the model.
With this technical solution, the approximate gas consumption range of the user at specific time points of each month and each week can be conveniently determined.
Preferably, the analysis dimensions include a quarter dimension, a month dimension, a week dimension, a day dimension, and an hour dimension, and the method further includes:
establishing associations among the quarter dimension, the month dimension, the week dimension, the day dimension, and the hour dimension, so that data of the same analysis dimension can be retrieved together.
With this technical solution, data of the required dimension can be retrieved in one operation during data analysis, which facilitates model viewing and data comparison.
Preferably, the method further comprises:
setting up flow error analysis, performing error calculation on the collected data, performing second-pass data cleaning, and completing the gas usage behavior analysis.
With this technical solution, the flow error analysis may establish a user gas usage habit curve from historical models and, by matching the current data against the trend of that curve in the corresponding dimension, determine whether the current user data lies within a reasonable deviation.
Preferably, the flow error analysis includes:
presetting indication errors for different instantaneous flow ranges;
obtaining the user's total gas consumption within a preset time period;
obtaining, from the intermediate database, the flow data corresponding to each instantaneous flow range, and fitting a regression line;
obtaining an error range for the user's gas consumption from the integral of the regression line and the indication error;
and judging, according to that error range, whether the user's total gas consumption is reasonable.
With this technical solution, it is possible to predict whether the user's gas data is correct and to accommodate sudden changes in the user's gas usage. If the user's gas data falls outside the error range of the user's gas consumption, an early warning can be issued to indicate that gas theft, gas leakage, or failure to adjust the data after a meter installation may have occurred.
In a second aspect, embodiments of the present application provide a gas data modeling system based on dimension analysis, the system comprising:
a cleaning module, configured to perform first-pass data cleaning on the collected data, process erroneous data, and supplement default data;
a dimension analysis module, configured to perform dimension analysis on the collected data that has undergone the first-pass data cleaning;
an intermediate database, configured to store the collected data after the dimension analysis;
and a modeling module, configured to model the gas data based on the collected data in the intermediate database.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and running on the processor, where the processor, when executing the computer program, implements the steps of the gas data modeling method based on dimensional analysis as described in any one of the above.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program, which when executed by a processor, implements the steps of the gas data modeling method based on dimensional analysis as described in any one of the above.
In summary, compared with the prior art, the beneficial effects brought by the technical scheme provided by the embodiment of the present application at least include:
1. an intermediate database is established and the collected data is cleaned: invalid data is reduced, erroneous data is removed, and default data is supplemented before the data is stored in the intermediate database. This reduces interference from unnecessary data during modeling, makes the modeling data more accurate, and reduces the volume of raw modeling data. Dividing the processed collected data into dimensions makes the data easier to access, view, and analyze, reduces data confusion, and makes related data easier to find when data is retrieved;
2. because supplemented default data is generated by simulation and calculation, a calculated value may be larger than the actual value and therefore unreasonable; removing the jump data increases, to a certain extent, the reliability of the collected data stored in the intermediate database. Removing the jump data also gives the data in the intermediate database a consistent relationship with time, such as increasing as time increases, which supports better data statistics and modeling analysis;
3. dimension analysis allows user gas data to be retrieved purposefully, which facilitates data analysis, retrieval, and viewing. The gas usage models built from the various analysis dimensions describe the user's gas usage habits to a certain extent and can provide a basis for predicting the user's gas usage behavior and identifying abnormal gas usage.
Drawings
FIG. 1 is a schematic flow chart diagram of a gas data modeling method based on dimensional analysis according to an embodiment of the present application.
FIG. 2 is a schematic flow chart diagram of a gas data modeling method based on dimensional analysis according to another embodiment of the present application.
Fig. 3 is a second schematic flow chart of a gas data modeling method based on dimensional analysis according to another embodiment of the present application.
FIG. 4 is a third schematic flow chart of a gas data modeling method based on dimension analysis according to another embodiment of the present application.
FIG. 5 is a fourth flowchart of a gas data modeling method based on dimension analysis according to another embodiment of the present application.
FIG. 6 is a fifth flowchart of a gas data modeling method based on dimension analysis according to another embodiment of the present application.
Detailed Description
The present embodiment is only for explaining the present application, and it is not limited to the present application, and those skilled in the art can make modifications of the present embodiment without inventive contribution as needed after reading the present specification, but all of them are protected by patent law within the scope of the claims of the present application.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in this application are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency or limitation on the number or order of execution.
The term "at least one" in this application means one or more, and the meaning of "a plurality" means three or more, e.g. a plurality of first locations means three or more first locations.
The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.
Referring to fig. 1, in one embodiment of the present application, a gas data modeling method based on dimension analysis is provided. The main steps of the method are as follows:
S1: performing first-pass data cleaning on the collected data, processing erroneous data, and supplementing default data;
S2: performing dimension analysis on the collected data that has undergone the first-pass data cleaning, and storing the collected data in an intermediate database;
S3: modeling the gas data based on the collected data in the intermediate database.
Specifically, the user's gas data is collected periodically at a collection interval, which is determined by the required collection precision and can be set to any interval such as 1 s, 5 s, or 10 s; this embodiment uses 10 s as an example. The collected data is the collected user gas data. Erroneous data includes, for example, negative flow data and data that jumps relative to its upstream and downstream data, where the upstream and downstream data are the data in the collection intervals immediately before and after the data in question. Under normal conditions, user gas data in adjacent collection intervals does not change abruptly. For example, if the upstream collected data is 30 m³/h, the downstream collected data is 35 m³/h, and the current collected data is 100 m³/h, a jump has probably occurred and the collected data is erroneous.
Specifically, in this embodiment, the currently collected data is compared with its upstream and downstream data. If it differs from either of them by more than (data value × 10% × collection interval), with the collection interval expressed in seconds, the currently collected data is judged to be a jump and is cleaned.
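The jump rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `is_jump` and its parameters are hypothetical, and it assumes that "data value" in the rule refers to the value of the neighbouring (upstream or downstream) data, which the text does not state explicitly.

```python
def is_jump(current, upstream, downstream, interval_s=10, ratio=0.10):
    """Return True if `current` jumps relative to either neighbouring reading.

    Threshold per neighbour = neighbour value * ratio * interval_s.
    Assumption: "data value" in the rule means the neighbour's value.
    """
    for neighbour in (upstream, downstream):
        threshold = abs(neighbour) * ratio * interval_s
        if abs(current - neighbour) > threshold:
            return True
    return False


# Readings in m3/h, taken from the example in the text (10 s interval).
print(is_jump(100, 30, 35))  # differences 70 and 65 exceed thresholds 30 and 35
print(is_jump(33, 30, 35))   # differences 3 and 2 stay within the thresholds
```

With a 10 s interval the threshold equals 100% of the neighbouring value, so the 100 m³/h reading from the example is flagged while a reading of 33 m³/h is kept.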
In this embodiment, supplementing the default data specifically includes: deleting the data value of the default data and supplementing the collected data for the collection time of that default data. One way to do this is to fit the user gas usage curve for the collected data in time order and predict from the curve the data value that corresponds to the default data. For example, if the collected data at 00:00 on August 1 of the current year is default data with default value A, the default value A is deleted, the collected data B for that time is predicted from the user gas usage curve, and B is used as the data value of the collected data at 00:00 on August 1 of the current year.
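A minimal sketch of this supplementing step follows. It assumes the simplest possible "gas usage curve", a straight line fitted by ordinary least squares; the text does not specify the curve type. The names `fit_line` and `fill_default` and the sample readings are hypothetical, and `None` stands for a default value that has already been deleted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = k*x + b; returns (k, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    return k, mean_y - k * mean_x


def fill_default(readings):
    """Replace None entries in {time: value} with values predicted by the fitted line."""
    ordered = sorted(readings.items())
    known = [(t, v) for t, v in ordered if v is not None]
    k, b = fit_line([t for t, _ in known], [v for _, v in known])
    return {t: (v if v is not None else k * t + b) for t, v in ordered}


# Hypothetical readings: time in seconds -> flow in m3/h; None marks deleted default data.
readings = {0: 30.0, 10: 31.0, 20: None, 30: 33.0, 40: 34.0}
filled = fill_default(readings)
print(filled[20])
```

A real gas usage curve is unlikely to be linear over long periods, so a production version would probably fit over a short window around the gap or use a higher-order model.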
Dimension analysis divides the data that has undergone the first-pass cleaning into different dimensions. The analysis dimensions include a year dimension, a quarter dimension, a month dimension, a week dimension, a day dimension, an hour dimension, and so on.
The model may be constructed as follows:
based on the average gas usage for each hour from Monday to Sunday, together with the maximum and minimum gas usage for each hour, a floating ratio relative to the average is given and a regression curve of gas usage is calculated;
based on the total gas usage of each month in the year and the monthly fluctuation in gas usage, and drawing on data from previous years, a yearly gas usage curve is fitted.
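The weekly part of this construction can be illustrated with a short Python sketch that groups readings by weekday and hour and summarises each slot. The function name `weekly_hourly_profile`, the record layout, and the definition of the floating ratio (largest deviation from the mean, divided by the mean) are assumptions made for illustration; the regression curve itself is not computed here.

```python
from collections import defaultdict
from datetime import datetime


def weekly_hourly_profile(records):
    """Group (timestamp, usage) records by (weekday, hour) and summarise each slot."""
    slots = defaultdict(list)
    for ts, usage in records:
        slots[(ts.weekday(), ts.hour)].append(usage)

    profile = {}
    for slot, values in slots.items():
        mean = sum(values) / len(values)
        # Assumed definition: largest deviation from the mean, relative to the mean.
        spread = max(max(values) - mean, mean - min(values))
        profile[slot] = {
            "mean": mean,
            "max": max(values),
            "min": min(values),
            "float_ratio": spread / mean if mean else 0.0,
        }
    return profile


# Hypothetical hourly usage (m3) for three consecutive Mondays at 09:00.
records = [
    (datetime(2021, 8, 2, 9), 40.0),
    (datetime(2021, 8, 9, 9), 50.0),
    (datetime(2021, 8, 16, 9), 60.0),
]
profile = weekly_hourly_profile(records)
print(profile[(0, 9)])  # weekday 0 is Monday
```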
Cleaning and supplementing the collected data in the first-pass data cleaning screens the data, which reduces the data volume used in modeling and removes erroneous data, improving the accuracy of the model. Performing dimension analysis on the collected data after the first-pass cleaning improves the analysis speed of the model. Modeling the user's gas usage behavior determines the user's gas usage characteristics and provides a basis for predicting that behavior. The main function of the model is to describe the current user's gas usage habits over a given time span.
Referring to fig. 2, optionally, in another embodiment, step S1 includes the following steps:
S11: obtaining the collected data sorted in ascending order of time;
S12: traversing the collected data to remove default data and invalid or erroneous data, supplementing the default data to complete the first-pass data cleaning, and recording the current valid data after the first-pass data cleaning;
S13: removing the jump data from the current valid data.
Specifically, the collected data is obtained in ascending time order, so that the data is arranged in sequence. When the data is cleaned, the relationships between data items are then easier to find, which gives the cleaning a logical basis and makes the data easier to import and export.
The invalid data may be negative flow data, or data that should be increasing over a certain period but instead shows negative growth, and so on.
Traversal allows the data to be screened item by item, which prevents problematic data from being stored in the intermediate database and lowering the prediction accuracy for user gas data once the model is built.
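The sort-then-traverse screening described above might look like the following sketch. The record layout `(time, flow, meter_reading)` and the function name `remove_invalid` are hypothetical; the sketch treats a negative flow, or a cumulative meter reading lower than the last accepted one, as invalid.

```python
def remove_invalid(records):
    """Sort (time, flow, meter_reading) records by time and drop invalid ones.

    A record is dropped if its flow is negative or if its cumulative meter
    reading is lower than the last reading that was kept.
    """
    valid = []
    last_reading = None
    for time, flow, reading in sorted(records):
        if flow < 0:
            continue
        if last_reading is not None and reading < last_reading:
            continue
        valid.append((time, flow, reading))
        last_reading = reading
    return valid


# Hypothetical records, deliberately out of time order.
records = [
    (20, 31.0, 100.2),
    (0, 30.0, 100.0),
    (10, -5.0, 100.1),  # negative flow
    (30, 32.0, 99.0),   # cumulative reading went backwards
    (40, 33.0, 100.4),
]
print(remove_invalid(records))
```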
The default data can be supplemented in the following two ways: 1. fitting a curve to the current user's collected data in the time period that contains the default data, and reading from the fitted curve the collected data for the time point of the default data; 2. simulating, with the established user gas model, the collected data for the time point of the default data.
The first-pass data cleaning removes part of the useless data from the currently collected data and supplements the default data, so that the collected data becomes more complete and forms the valid data; the jump data in the valid data is then cleaned. Processing the jump data after the first-pass data cleaning has the following advantages: 1. because supplemented default data is generated by simulation and calculation, a calculated value may be larger than the actual value and therefore unreasonable, and cleaning the jump data increases, to a certain extent, the reliability of the collected data stored in the intermediate database; 2. cleaning the jump data gives the data in the intermediate database a consistent relationship with time, such as increasing as time increases, which supports better data statistics and modeling analysis.
Referring to fig. 3, optionally, in another embodiment, step S2 includes:
S21: determining the analysis time of the collected data;
S22: determining an analysis dimension of the collected data;
S23: processing the collected data within the analysis time according to the analysis dimension;
S24: storing the processed collected data in the intermediate database.
Specifically, in this embodiment, the analysis time of the collected data is specifically the time interval of the current collected data collection, for example, if the collected data collected in the current year is analyzed, the analysis time is the current year. In other embodiments of the present application, the analysis time may be any time interval, such as one month, one week, etc.
The analysis dimensions are the dimensions into which the current data can be divided. For example, if the analysis time is the whole of 2020, the analysis dimensions may include the quarter dimension, month dimension, week dimension, day dimension, hour dimension, and other time dimensions shorter than a year; if the analysis time is August 2020, the analysis dimensions may be the week dimension, day dimension, and hour dimension. Dimension analysis allows user gas data to be retrieved purposefully, which facilitates data analysis, retrieval, and viewing. The gas usage models built from the various analysis dimensions describe the user's gas usage habits to a certain extent and can provide a basis for predicting the user's gas usage behavior and identifying abnormal gas usage.
In this embodiment, once the analysis time of the collected data has been determined, that is, once the collected data has been arranged in time order, the data is processed according to the analysis dimensions as follows: the collected data is first divided by the shortest time dimension, and the data in that dimension is then merged into progressively longer time dimensions to build up multiple dimensions. Specifically, the collected data can be divided with the hour as the minimum unit, and the day data is then formed by aggregating all the hour-dimension data of that day, which gives a better data integration result.
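The roll-up from the shortest dimension to longer ones can be sketched as a single grouping function. The name `roll_up`, the dictionary layout, and the sample values are hypothetical; the sketch assumes that each hour-dimension entry holds the gas volume used in that hour, so that longer dimensions are simple sums.

```python
from collections import defaultdict
from datetime import datetime


def roll_up(hourly, key):
    """Sum hourly usage into a coarser dimension chosen by `key(timestamp)`."""
    totals = defaultdict(float)
    for ts, usage in hourly.items():
        totals[key(ts)] += usage
    return dict(totals)


# Hypothetical hour-dimension data: hour start -> gas used in that hour (m3).
hourly = {
    datetime(2020, 8, 1, 0): 5.0,
    datetime(2020, 8, 1, 1): 7.0,
    datetime(2020, 8, 2, 0): 4.0,
}
daily = roll_up(hourly, key=lambda ts: ts.date())
monthly = roll_up(hourly, key=lambda ts: (ts.year, ts.month))
print(daily)
print(monthly)
```

Because every longer dimension is derived from the same hour-dimension data, the day, month, and other totals stay consistent with each other by construction.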
Referring to fig. 4, optionally, in another embodiment, step S3 includes:
S31: obtaining analysis data of the minimum granularity dimension, and removing invalid data;
S32: obtaining the time span of the model;
S33: obtaining each attribute value of the minimum granularity dimension data of the model over the time span;
S34: storing the model.
Specifically, in this embodiment, the invalid data includes default data, jump data, erroneous data, and data judged to be erroneous by data analysis. Data judged to be erroneous by data analysis may include data whose deviation from a curve exceeds a predetermined value, where the curve is fitted to data collected multiple times in any dimension higher than the minimum granularity dimension. For example, if the minimum granularity dimension is the hour dimension, the curve is fitted to data of an analysis dimension higher than the hour dimension, such as the day dimension or the week dimension.
The time span of the model is the data collection period covered by the model, such as the month of August 2021. Within that time span, the model contains data of the analysis dimensions whose unit is smaller than or equal to the time span; for example, the model for August 2021 contains data of the week dimension, the day dimension, and the hour dimension.
The attribute values include the average temperature, maximum temperature, minimum temperature, temperature fluctuation rate, and other values, and are used to determine the approximate gas usage range of the user at specific time points of each month and each week.
Referring to fig. 5, optionally, in another embodiment, the analysis dimensions include a quarter dimension, a month dimension, a week dimension, a day dimension, and an hour dimension, and the gas data modeling method based on dimension analysis further includes the following step:
S4: establishing associations among the quarter dimension, the month dimension, the week dimension, the day dimension, and the hour dimension, so that data of the same analysis dimension can be retrieved together.
Specifically, associations are established among the quarter, month, week, day, and hour dimensions; for example, when the data model of the quarter dimension is selected, the data of all quarters is automatically associated and displayed together. In other embodiments, data of the same nature and dimension may also be associated; for example, when the data for autumn 2020 is retrieved, the autumn data of all other years may be associated with it and integrated to form a user gas model. Data of the required dimension can thus be retrieved in one operation during data analysis, which facilitates model viewing and data comparison.
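One simple way to picture this association is an index that returns the same quarter for every year in one lookup. The sketch below is hypothetical: the function name `index_by_quarter`, the `(year, quarter)` keys, and the sample totals are assumptions, and a calendar quarter is used as a stand-in for a season such as autumn.

```python
from collections import defaultdict


def index_by_quarter(quarter_totals):
    """Map quarter number -> {year: total} so one lookup returns that quarter for every year."""
    index = defaultdict(dict)
    for (year, quarter), total in quarter_totals.items():
        index[quarter][year] = total
    return dict(index)


# Hypothetical quarter-dimension totals (m3), keyed by (year, quarter).
quarter_totals = {
    (2019, 3): 900.0,
    (2020, 3): 950.0,
    (2020, 4): 1200.0,
    (2021, 3): 980.0,
}
by_quarter = index_by_quarter(quarter_totals)
print(by_quarter[3])  # third-quarter totals for every year on record
```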
Referring to fig. 6, optionally, in another embodiment, the gas data modeling method based on dimension analysis further includes the following step:
S5: setting up flow error analysis, performing error calculation on the collected data, performing second-pass data cleaning, and completing the gas usage behavior analysis.
Specifically, in this embodiment, flow error analysis is used to cope with sudden gas usage behavior.
The flow error analysis may establish a user gas usage habit curve from historical models and, by matching the current data against the trend of that curve in the corresponding dimension, determine whether the current user data lies within a reasonable deviation, thereby completing the gas usage analysis.
Further, in the present embodiment, the flow error analysis includes:
1. presetting indication errors for different instantaneous flow ranges;
2. obtaining the user's total gas consumption within a preset time period;
3. obtaining, from the intermediate database, the flow data corresponding to each instantaneous flow range, and fitting a regression line;
4. obtaining an error range for the user's gas consumption from the integral of the regression line and the indication error;
5. judging, according to that error range, whether the user's total gas consumption is reasonable.
An example is as follows:
The preset indication errors are +0.62% for 0-50 m³/h, +0.26% for 50-200 m³/h, -0.26% for 200-400 m³/h, and +0.07% for 400-1000 m³/h. The total gas consumption in the time period T1 (in seconds) is S. Within T1, the time during which the instantaneous flow is in the range 0-50 m³/h is T2, in the range 50-200 m³/h is T3, in the range 200-400 m³/h is T4, in the range 400-1000 m³/h is T5, and above 1000 m³/h is T6. The flow data in each flow range is obtained from the collection records, and a regression line is fitted for each range. Let:
the fitted line in the 0-50 m³/h range be K1X + B1;
the fitted line in the 50-200 m³/h range be K2X + B2;
the fitted line in the 200-400 m³/h range be K3X + B3;
the fitted line in the 400-1000 m³/h range be K4X + B4;
the fitted line in the range above 1000 m³/h be K5X + B5;
where K1, K2, K3, K4, and K5 are the coefficients (slopes) of the fitted lines and B1, B2, B3, B4, and B5 are their base values (intercepts). K1 to K5 and B1 to B5 change with the initial gas usage value and the gas usage rate in the current time period. The lines may be fitted by the least squares method.
Then the range of S during the time of T1 should be:
Figure BDA0003344877590000091
If S satisfies this condition, the gas usage behavior is judged to be normal; otherwise, it is judged to be abnormal.
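The check in this example can be sketched in code. The exact inequality is given in the application only as a formula image that is not reproduced in the text, so the sketch below rests on stated assumptions: X is taken to be the elapsed time in seconds within a flow range, the volume in each range is the integral of the fitted line over that range's duration, and the lower and upper bounds are obtained by scaling each volume by (1 - |e|) and (1 + |e|), where e is the preset indication error. The names `FlowSegment`, `fit_line`, `consumption_bounds` and `is_normal_usage` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = K*x + B; returns (K, B)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    return k, mean_y - k * mean_x


@dataclass
class FlowSegment:
    times_s: List[float]     # sample times in seconds from the segment start
    flows_m3h: List[float]   # instantaneous flow samples in m3/h
    duration_s: float        # time spent in this flow range (T2..T6 in the text)
    indication_error: float  # preset indication error as a fraction, e.g. 0.0062


def segment_volume(seg: FlowSegment) -> float:
    """Integrate the fitted line K*x + B over [0, duration]; result in m3."""
    k, b = fit_line(seg.times_s, seg.flows_m3h)
    t = seg.duration_s
    # Flow is in m3/h and time in seconds, hence the division by 3600.
    return (k * t * t / 2 + b * t) / 3600.0


def consumption_bounds(segments: Sequence[FlowSegment]) -> Tuple[float, float]:
    """Assumed bound form: sum of volumes scaled by (1 -/+ |error|)."""
    low = high = 0.0
    for seg in segments:
        volume = segment_volume(seg)
        err = abs(seg.indication_error)
        low += volume * (1 - err)
        high += volume * (1 + err)
    return low, high


def is_normal_usage(total_s: float, segments: Sequence[FlowSegment]) -> bool:
    """Judge the total consumption S normal if it lies within the bounds."""
    low, high = consumption_bounds(segments)
    return low <= total_s <= high
```

For instance, a segment with constant flow of 36 m³/h for 100 s integrates to 1 m³, and with the 0-50 m³/h error of 0.62% contributes between 0.9938 and 1.0062 m³ to the bounds.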
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment of the present application, a gas data modeling system based on dimension analysis is provided, which corresponds one-to-one to the gas data modeling method based on dimension analysis in the above embodiments. The gas data modeling system based on dimension analysis comprises:
a cleaning module, configured to perform a first data cleaning on the collected data, processing erroneous data and supplementing missing (default) data;
a dimension analysis module, configured to perform dimension analysis on the collected data after the first data cleaning;
an intermediate database, configured to store the collected data after the dimension analysis;
and a modeling module, configured to perform gas data modeling based on the collected data in the intermediate database.
Further, in another embodiment, the system further comprises a gas usage analysis module.
The gas usage analysis module is configured to analyze the reasonableness of the gas usage data based on the gas data modeling result.
Further, in another embodiment, the cleaning module is further configured to: obtain the collected data sorted in ascending order of time; traverse the collected data to eliminate default and invalid erroneous data and supplement the default data, thereby completing the first data cleaning, and record the current valid data after the first data cleaning; and clear the transition data in the current valid data.
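A first cleaning pass of this kind might look like the sketch below. Several choices here are assumptions made only for illustration: a reading is a `(timestamp, value)` pair, a negative value is treated as invalid, a `None` value is treated as a default (missing) reading and is filled with the last valid value, and records repeating an already kept timestamp are dropped. The handling of transition data is not specified in enough detail in the text and is left out.

```python
from typing import List, Optional, Tuple

# (unix timestamp in seconds, flow value or None if the reading is missing)
Reading = Tuple[int, Optional[float]]


def first_pass_clean(readings: List[Reading]) -> List[Tuple[int, float]]:
    """Sort by time, drop invalid and repeated records, fill missing values."""
    cleaned: List[Tuple[int, float]] = []
    last_valid: Optional[float] = None
    # Within one timestamp, real values sort before missing ones so that a
    # real reading is preferred over a placeholder.
    for ts, value in sorted(readings, key=lambda r: (r[0], r[1] is None)):
        if value is not None and value < 0:
            continue  # assumed rule: negative flow is invalid
        if cleaned and cleaned[-1][0] == ts:
            continue  # repeated record for a timestamp already kept
        if value is None:
            if last_valid is None:
                continue  # nothing to fill from yet
            value = last_valid  # assumed fill rule: carry last valid value
        cleaned.append((ts, value))
        last_valid = value
    return cleaned


raw = [(30, 5.0), (10, 2.0), (20, None), (10, 2.0), (40, -1.0), (50, None)]
print(first_pass_clean(raw))
```

Carrying the last valid value forward is only one possible way to supplement missing readings; interpolation between neighbours would fit the same structure.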
Further, in another embodiment, the dimension analysis module is further configured to: determine an analysis time of the collected data; determine an analysis dimension of the collected data; process the collected data within the analysis time according to the analysis dimension; and store the processed collected data in the intermediate database.
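As an illustration of these four steps, the sketch below aggregates readings inside an analysis window by a chosen dimension and writes the totals to an intermediate store. The in-memory SQLite database, the table name `dimension_summary`, the bucket formats, and the choice of summing volumes are all assumptions for the example; the text does not prescribe a storage engine or an aggregation function.

```python
import sqlite3
from collections import defaultdict
from datetime import datetime

# Assumed bucket key formats for a few analysis dimensions.
BUCKET_FORMATS = {"hour": "%Y-%m-%d %H", "day": "%Y-%m-%d", "month": "%Y-%m"}


def aggregate(readings, start, end, dimension):
    """Sum volumes per bucket for readings with start <= timestamp < end."""
    fmt = BUCKET_FORMATS[dimension]
    totals = defaultdict(float)
    for ts, volume in readings:
        if start <= ts < end:
            totals[ts.strftime(fmt)] += volume
    return dict(totals)


def store(conn, dimension, totals):
    """Write aggregated totals to the (hypothetical) intermediate table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dimension_summary ("
        "dimension TEXT, bucket TEXT, total_m3 REAL, "
        "PRIMARY KEY (dimension, bucket))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO dimension_summary VALUES (?, ?, ?)",
        [(dimension, bucket, total) for bucket, total in totals.items()],
    )
    conn.commit()


readings = [
    (datetime(2021, 11, 9, 8, 15), 1.5),
    (datetime(2021, 11, 9, 8, 45), 2.5),
    (datetime(2021, 11, 9, 9, 10), 1.0),
    (datetime(2021, 11, 10, 8, 0), 4.0),
]
conn = sqlite3.connect(":memory:")
window = (datetime(2021, 11, 9), datetime(2021, 11, 10))
store(conn, "hour", aggregate(readings, window[0], window[1], "hour"))
```

Keeping one row per (dimension, bucket) pair lets the later modeling step read pre-aggregated values instead of rescanning raw readings.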
Further, in another embodiment, the modeling module is further configured to: acquiring analysis data of a minimum granularity dimension, and eliminating invalid data; acquiring the time span of the model; obtaining each attribute value of the minimum granularity dimension data of the model under the time span; the model is stored.
Further, in another embodiment, the dimension analysis module is further configured to establish a connection between the quarterly dimensions, the monthly dimensions, the weekly dimensions, the daily dimensions, and the hourly dimensions, so that data of the same analysis dimension can be called out simultaneously.
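One way to picture the link between dimensions is to derive every dimension key from the same timestamp, so that a record found under one dimension can be looked up under the others. The sketch below is an assumption-based illustration: the key formats, the use of ISO week numbering, and the function names are not taken from the application.

```python
from collections import defaultdict
from datetime import datetime


def dimension_keys(ts):
    """Derive quarter, month, week, day and hour keys from one timestamp."""
    iso_year, iso_week, _ = ts.isocalendar()
    return {
        "quarter": f"{ts.year}-Q{(ts.month - 1) // 3 + 1}",
        "month": ts.strftime("%Y-%m"),
        "week": f"{iso_year}-W{iso_week:02d}",  # ISO 8601 week (assumed)
        "day": ts.strftime("%Y-%m-%d"),
        "hour": ts.strftime("%Y-%m-%d %H"),
    }


def build_index(records):
    """Index each (timestamp, value) record under all of its dimension keys."""
    index = defaultdict(list)
    for ts, value in records:
        for dimension, key in dimension_keys(ts).items():
            index[(dimension, key)].append((ts, value))
    return index


records = [
    (datetime(2021, 11, 9, 8, 15), 1.5),
    (datetime(2021, 11, 9, 9, 10), 1.0),
    (datetime(2021, 12, 1, 7, 0), 2.0),
]
index = build_index(records)
print(dimension_keys(records[0][0]))
```

With such an index, all records sharing a quarter, month, week, day or hour can be retrieved together by a single key lookup.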
Further, in another embodiment, the gas data modeling system based on dimensional analysis further comprises an error analysis module.
And the error analysis module is used for setting flow error analysis, carrying out error calculation on the acquired data, carrying out second-time data cleaning and completing gas analysis.
Further, in another embodiment, the flow error analysis includes:
1. presetting indicating value errors in different instantaneous flow ranges;
2. acquiring the total gas consumption of a user in a preset time period;
3. acquiring specific flow data corresponding to different instantaneous flow ranges in the intermediate database, and fitting a regression line;
4. obtaining the error range of the user's gas consumption from the integral of the regression line and the indication error;
5. judging whether the user's total gas consumption is reasonable according to that error range.
The modules of the gas data modeling system based on the dimension analysis can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent of a processor in the electronic device, or can be stored in a memory in the electronic device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment of the present application, an electronic device is provided, which may be a server. The electronic device includes a processor, a memory, and a network interface connected by a system bus. The processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device may be implemented by any type of volatile or non-volatile storage device, including but not limited to: magnetic disk, optical disk, EEPROM (Electrically-Erasable Programmable Read Only Memory), EPROM (Erasable Programmable Read Only Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), magnetic memory, flash memory, PROM (Programmable Read-Only Memory). The memory of the electronic device provides an environment for the operation of the operating system and computer programs stored therein. The network interface of the electronic device is used for connecting to and communicating with an external terminal through a network. The computer program, when executed by the processor, implements the steps of the gas data modeling method based on dimension analysis in the above embodiments.
In an embodiment of the present application, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor implements the gas data modeling method steps based on dimensional analysis described in the above embodiment. The computer-readable storage medium includes a ROM (Read-Only Memory), a RAM (Random-Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic disk, a floppy disk, and the like.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the system described in this application is divided into different functional units or modules to perform all or part of the above-mentioned functions.

Claims (10)

1. A gas data modeling method based on dimension analysis, characterized by comprising the following steps:
performing a first data cleaning on the collected data, processing erroneous data and supplementing missing (default) data;
performing dimension analysis on the collected data after the first data cleaning, and storing the result in an intermediate database;
and performing gas data modeling based on the collected data in the intermediate database.
2. The dimension analysis-based gas data modeling method according to claim 1, wherein performing the first data cleaning on the collected data, processing erroneous data and supplementing missing (default) data comprises:
obtaining the collected data sorted in ascending order of time;
traversing the collected data to eliminate default and invalid erroneous data and supplementing the default data, thereby completing the first data cleaning, and recording the current valid data after the first data cleaning;
clearing the transition data in the current valid data.
3. The dimension analysis-based gas data modeling method according to claim 1, wherein performing the dimension analysis on the collected data after the first data cleaning comprises:
determining the analysis time of the collected data;
determining an analysis dimension of the collected data;
processing the collected data within the analysis time according to the analysis dimension;
and storing the processed collected data in the intermediate database.
4. The dimension analysis-based gas data modeling method according to claim 3, wherein the gas data modeling based on the collected data in the intermediate database comprises:
acquiring analysis data of a minimum granularity dimension, and eliminating invalid data;
acquiring the time span of the model;
obtaining each attribute value of the minimum granularity dimension data of the model under the time span;
the model is stored.
5. The dimensional analysis-based gas data modeling method of claim 3, wherein said analysis dimensions include a quarterly dimension, a month dimension, a week dimension, a day dimension, an hour dimension, said method further comprising:
and establishing the relation among the quarterly dimensions, the month dimensions, the week dimensions, the day dimensions and the hour dimensions, so that the data of the same analysis dimension can be called out simultaneously.
6. The dimensional analysis-based gas data modeling method according to any one of claims 1-5, characterized in that the method further comprises:
setting up a flow error analysis, performing error calculation on the collected data, performing a second data cleaning, and completing the gas consumption behavior analysis.
7. The dimensional analysis-based gas data modeling method of claim 6, wherein said flow error analysis comprises:
presetting indicating value errors in different instantaneous flow ranges;
acquiring the total gas consumption of a user in a preset time period;
acquiring specific flow data corresponding to different instantaneous flow ranges in the intermediate database, and fitting a regression line;
obtaining the error range of the user's gas consumption from the integral of the regression line and the indication error;
and judging whether the user's total gas consumption is reasonable according to that error range.
8. A gas data modeling system based on dimensional analysis, the system comprising:
a cleaning module, configured to perform a first data cleaning on the collected data, processing erroneous data and supplementing missing (default) data;
a dimension analysis module, configured to perform dimension analysis on the collected data after the first data cleaning;
an intermediate database, configured to store the collected data after the dimension analysis;
and a modeling module, configured to perform gas data modeling based on the collected data in the intermediate database.
9. An electronic device, comprising a memory, a processor and a computer program stored in the memory and running on the processor, wherein the processor when executing the computer program implements the steps of the gas data modeling method based on dimensional analysis of any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the method for modeling gas data based on dimensional analysis according to any one of claims 1 to 7.
CN202111323216.8A 2021-11-09 2021-11-09 Gas data modeling method and system based on dimension analysis and electronic equipment Pending CN114328444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111323216.8A CN114328444A (en) 2021-11-09 2021-11-09 Gas data modeling method and system based on dimension analysis and electronic equipment

Publications (1)

Publication Number Publication Date
CN114328444A true CN114328444A (en) 2022-04-12

Family

ID=81044540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111323216.8A Pending CN114328444A (en) 2021-11-09 2021-11-09 Gas data modeling method and system based on dimension analysis and electronic equipment

Country Status (1)

Country Link
CN (1) CN114328444A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115131166A (en) * 2022-06-15 2022-09-30 北京市燃气集团有限责任公司 System, method and device for checking stolen gas
CN116756629A (en) * 2023-05-24 2023-09-15 深圳市爱路恩济能源技术有限公司 Gas consumption analysis method and device for gas users
CN116756629B (en) * 2023-05-24 2024-04-19 深圳市爱路恩济能源技术有限公司 Gas consumption analysis method and device for gas users
CN116451043A (en) * 2023-06-12 2023-07-18 天津新科成套仪表有限公司 Fault model building system based on user gas meter measurement data analysis
CN116451043B (en) * 2023-06-12 2023-09-05 天津新科成套仪表有限公司 Fault model building system based on user gas meter measurement data analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination