CN117034197A - Enterprise power consumption typical mode analysis method based on multidimensional Isolate-Detect multi-point detection
- Publication number
- CN117034197A, CN202311029259.4A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
- G06F18/15—Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2123/00—Data types
- G06F2123/02—Data types in the time domain, e.g. time-series data
Abstract
The application provides a method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection. First, the collected industrial electricity consumption time series data are cleaned. The cleaned data are then seasonally decomposed to remove their periodic pattern, and the residual terms are extracted. Multidimensional change-point (variable point) detection based on multidimensional Isolate-Detect is applied to the residual terms; because only a single change point is detected within each isolated time interval, change-point detection on the multidimensional electricity consumption curves is more accurate and computationally more efficient. The mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points, which are screened by multi-sequence fusion to determine the final change-point positions of the electricity consumption data. Finally, the standardized time series data are segmented at the obtained change-point positions, and the typical electricity consumption behavior of each segment is extracted.
Description
Technical Field
The application relates to the technical field of power data analysis, and in particular to a method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection.
Background
With the continuing informatization of the power grid and the rapid development of digital and intelligent technologies, the power industry has entered the era of power big data. As the energy system on which economic development and daily life depend, the power system generates data that are huge in volume, fast-growing, and rich in type. Analyzing electricity consumption behavior can reveal rules, relationships, and trends in power big data and thereby characterize electricity usage. Detecting the change points common to the industries and extracting typical electricity consumption behaviors therefore helps to analyze and understand power consumption conditions, characteristics, and behaviors, and provides strong support for power supply and management.
However, industrial electricity consumption is large in volume, the consumption patterns of different industries are complex, and the data exhibit periodicity and seasonality, all of which make the analysis of electricity consumption behavior challenging. Today's power data are high-dimensional, varied in type, and large in volume, which places higher demands on data analysis techniques.
Disclosure of Invention
Therefore, in view of the gaps and deficiencies in the prior art, the application provides a method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection. The method addresses the shortcoming that traditional electricity consumption behavior analysis models do not consider temporal order and stage changes: it establishes a multidimensional Isolate-Detect model for industrial electricity consumption time series data, detects the change points common to the industries, and extracts typical electricity consumption behaviors. Data for one weekday and one weekend day are extracted from the historical electricity data to detect differences in electricity consumption behavior between weekdays and weekends; change points are detected with the multidimensional Isolate-Detect algorithm to improve detection accuracy; finally, typical electricity consumption behaviors are extracted segment by segment from the electricity consumption time series based on the detected change-point positions.
The method mainly comprises data collection, data cleaning, residual term extraction, change-point detection, change-point screening, and typical electricity consumption behavior extraction. First, the collected industrial electricity consumption time series data are cleaned. The cleaned data are seasonally decomposed to remove their periodic pattern, and the residual terms are extracted. Multidimensional change-point detection based on multidimensional Isolate-Detect is applied to the residual terms; because only a single change point is detected within each isolated time interval, change-point detection on the multidimensional electricity consumption curves is more accurate and computationally more efficient. The mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points, which are screened by multi-sequence fusion to determine the final change-point positions of the electricity consumption data. Finally, the standardized time series data are segmented at the obtained change-point positions, and the typical electricity consumption behavior of each segment is extracted. The multidimensional Isolate-Detect method provided by the application takes the correlation among the multidimensional sequences into account, can effectively detect the change points common to the multidimensional data, and improves the accuracy of multidimensional change-point detection. A more realistic and objective set of time-series electricity consumption scenarios is thereby obtained, so that more reasonable electricity strategies can be formulated in subsequent planning.
The technical scheme is as follows:
A method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection, characterized in that: first, the collected industrial electricity consumption time series data are cleaned; the cleaned data are seasonally decomposed to remove their periodic pattern, and the residual terms are extracted; multidimensional change-point detection based on multidimensional Isolate-Detect is applied to the residual terms, and only a single change point is detected within each isolated time interval, so that change-point detection on the multidimensional electricity consumption curves is more accurate and computationally more efficient; the mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points; the candidate sets are screened by multi-sequence fusion to determine the final change-point positions of the electricity consumption data; and finally, the standardized time series data are segmented at the obtained change-point positions, and the typical electricity consumption behavior of each segment is extracted.
Further, the method comprises the following steps:
step 1: collect the electricity consumption time series data of each industry;
step 2: fill the missing values in the electricity consumption time series data of each industry, extract the electricity consumption data for one weekday or for one weekend day, remove outliers using the 3σ rule, and standardize the data;
step 3: smooth the cleaned data and apply seasonal decomposition to the smoothed data to remove periodicity and trend, obtaining the residual terms of the smoothed data;
step 4: compute the CUSUM statistic of the residual term of each industry at each time point, then compute the mean and the maximum of the CUSUM statistics across industries at each time point, recorded as the M test statistic and the T test statistic respectively;
step 5: detect the change points common to the electricity data of all industries using the Isolate-Detect algorithm; a threshold is set according to the threshold formula of the Isolate-Detect algorithm; each time point at which the M test statistic exceeds the threshold is judged to be a change point and placed in the change-point set CP_M; each time point at which the T test statistic exceeds the threshold is judged to be a change point and placed in the change-point set CP_T;
step 6: denote the final change-point set by CP; for every pair of elements, one from CP_M and one from CP_T, whose absolute difference is smaller than 3, place the smaller of the two values in CP, which yields the final change-point set;
step 7: segment the standardized electricity consumption time series data at the obtained change-point positions and average the electricity consumption of each industry within each segment to obtain the typical electricity consumption behavior of each segment.
Further, in step 2, after the missing values in the industrial electricity data are filled, the electricity data for one weekday and for one weekend day are extracted separately for subsequent analysis, so that the differences in industrial electricity consumption behavior between weekdays and weekends can be analyzed; outliers are removed using the 3σ rule so that they do not affect the subsequent analysis of electricity consumption behavior.
Further, in step 3, the cleaned data are smoothed with a Savitzky-Golay filter; the fit to the original data is optimal when the loss function attains its minimum; the fitted values of the smoothed data are obtained by sliding the window along the series, which effectively reduces the noise in the data.
Further, in step 3, an additive model is used to decompose the smoothed data: the smoothed electricity consumption data are decomposed into a trend part, a periodic part, and a residual part, so that the periodicity and the trend of the data are removed. The model is expressed as follows:
$X_{i,t} = T_{i,t} + S_{i,t} + C_{i,t} + I_{i,t}$, (7)
where $T_i$ represents the long-term trend of the i-th industry, $S_i$ the seasonal component of the i-th industry, $C_i$ the cyclical component of the i-th industry, and $I_i$ the remaining residual term of the i-th industry.
Further, in step 4, the average value of the CUSUM statistics at each time point is calculated by using the data residual terms obtained after decomposition, and the formula is as follows:
where $i$ denotes the i-th industry, $s$ and $e$ the start and end points of the detection interval, $b$ the time point being tested, $n$ the total length of the detection interval, $I_{i,t}$ the residual term of the smoothed data of the i-th industry at time $t$, and $p$ the total number of data dimensions. The formula is built from the CUSUM statistic of the i-th industry at time $b$ within the detection interval $[s, e]$; the mean of these statistics over the $p$ industries at time $b$ is recorded as the M test statistic.
Further, in step 4, the maximum value of the CUSUM statistic at each time point is calculated by using the data residual term obtained after decomposition, and the formula is as follows:
where $i$ denotes the i-th industry, $s$ and $e$ the start and end points of the detection interval, $b$ the time point being tested, $n$ the total length of the detection interval, $I_{i,t}$ the residual term of the smoothed data of the i-th industry at time $t$, and $p$ the total number of data dimensions. The formula is built from the CUSUM statistic of the i-th industry at time $b$ within the detection interval $[s, e]$; the maximum of these statistics over the $p$ industries at time $b$ is recorded as the T test statistic.
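The formula images for the CUSUM, M, and T statistics did not survive in this text. As a hedged reconstruction only, the block below writes out the standard CUSUM contrast used in the Isolate-Detect literature, applied to the residuals $I_{i,t}$, together with a mean and a maximum over the $p$ dimensions. The symbols $\tilde{I}^{\,b}_{i,s,e}$, $M^{b}_{s,e}$, $T^{b}_{s,e}$ and the use of absolute values are assumptions introduced here, not notation confirmed by the patent.

```latex
% Assumed form: standard CUSUM contrast on [s, e] split after time b, with n = e - s + 1
\tilde{I}^{\,b}_{i,s,e}
  = \sqrt{\frac{e-b}{n\,(b-s+1)}}\;\sum_{t=s}^{b} I_{i,t}
  \;-\; \sqrt{\frac{b-s+1}{n\,(e-b)}}\;\sum_{t=b+1}^{e} I_{i,t}

% Assumed aggregation across the p industries
M^{b}_{s,e} = \frac{1}{p}\sum_{i=1}^{p} \bigl|\tilde{I}^{\,b}_{i,s,e}\bigr|,
\qquad
T^{b}_{s,e} = \max_{1 \le i \le p} \bigl|\tilde{I}^{\,b}_{i,s,e}\bigr|
```

Under this form a constant residual sequence gives a contrast of exactly zero, and a level shift located at $b$ maximizes the absolute contrast, which is what makes it usable as a change-point test statistic.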
Further, in step 5, based on the calculated M test statistic and T test statistic, the change points common to the electricity consumption of all industries are detected using the Isolate-Detect algorithm:
First, the intervals to be examined are created. For a data sequence of length $T$, a positive constant $\lambda_T$ is set, and two ordered sets of $K = \lceil T/\lambda_T \rceil$ right-expanding and left-expanding intervals are created. The j-th right-expanding interval is $R_j = [1, \min\{j\lambda_T, T\}]$, and the j-th left-expanding interval is $L_j = [\max\{1, T - j\lambda_T + 1\}, T]$. These intervals are collected in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots, R_K, L_K\}$. The point in $R_1$ at which the test statistic is largest is then identified;
based on the obtained test statistic, setting a threshold value, wherein the calculation formula of the threshold value is as follows:
where $\sigma$ is the standard deviation of the input data and $C$ is a given parameter value. The test statistic is compared with the threshold to decide whether the time point is a change point: if the test statistic exceeds the threshold, the corresponding point is regarded as a change point; otherwise, detection continues with the next interval in $S_{RL}$.
Further, in step 7, based on the detected variable point position, the normalized time series data is segmented, the average value of the power consumption of each industry in each segment is calculated, and the typical power consumption scene of each segment of time series data is extracted, wherein the power consumption average value calculation formula of the ith industry is as follows:
where $\tau_k$ denotes the position of the k-th change point, and the averaged quantity is the standardized electricity consumption value of the i-th industry at the (k+1)-th change-point position.
Compared with the prior art, the application and its preferred schemes have at least the following beneficial effects:
1. The difference in electricity consumption behavior between weekdays and weekends is taken into account: data for one weekday and for one weekend day are extracted for subsequent analysis, so that the different electricity consumption characteristics of weekdays and weekends are distinguished;
2. The electricity consumption time series data are decomposed, and, because periodicity may make the detection results of the model unstable, modeling uses the residual sequence with the trend and periodic terms removed, which improves the accuracy of the subsequent modeling;
3. The CUSUM mean statistic is computed, so that the influence of all dimensions on change-point detection is considered;
4. The CUSUM maximum statistic is computed; it uses only the sequence with the largest CUSUM value and is not easily affected by outliers;
5. Compared with traditional change-point detection methods, the ID (Isolate-Detect) method detects only a single change point at a time within an isolated interval, so its computational efficiency is higher;
6. Because the M test statistic is more easily affected by outliers and the T test statistic is less stable, selecting the change points detected by both test statistics improves the accuracy of change-point detection;
7. The standardized time series data are segmented at the final change-point positions, and the mean of the industrial electricity data within each segment is computed, so that the typical electricity consumption behavior of each segment can be extracted accurately and efficiently.
Drawings
The application is described in further detail below with reference to the attached drawings and detailed description:
FIG. 1 is a flow chart of the method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection according to an embodiment of the present application;
FIG. 2 is a plot of the original data after the missing values are filled according to an embodiment of the present application;
FIG. 3 is a time series plot of the Monday electricity consumption data according to an embodiment of the present application;
FIG. 4 is a time series plot of the Saturday electricity consumption data according to an embodiment of the present application;
FIG. 5 is a time series plot of the Monday electricity consumption data of each industry after data cleaning according to an embodiment of the present application;
FIG. 6 is a time series plot of the Saturday electricity consumption data of each industry after data cleaning according to an embodiment of the present application;
FIG. 7 is a plot of the Monday electricity consumption data of each industry smoothed with a Savitzky-Golay filter according to an embodiment of the present application;
FIG. 8 is a plot of the Saturday electricity consumption data of each industry smoothed with a Savitzky-Golay filter according to an embodiment of the present application;
FIG. 9 is a plot of the residual sequences of the Monday electricity consumption data of each industry after the additive model is applied according to an embodiment of the present application;
FIG. 10 is a plot of the residual sequences of the Saturday electricity consumption data of each industry after the additive model is applied according to an embodiment of the present application;
FIG. 11 is a time series plot of the M test statistic and the T test statistic for the Monday electricity consumption data according to an embodiment of the present application;
FIG. 12 is a time series plot of the M test statistic and the T test statistic for the Saturday electricity consumption data according to an embodiment of the present application;
FIG. 13 is a plot of the final change points of the Monday electricity consumption data detected with the multidimensional Isolate-Detect algorithm according to an embodiment of the present application;
FIG. 14 is a plot of the final change points of the Saturday electricity consumption data detected with the multidimensional Isolate-Detect algorithm according to an embodiment of the present application;
FIG. 15 is a plot of the segment means of the Monday electricity consumption data according to an embodiment of the present application;
FIG. 16 is a plot of the segment means of the Saturday electricity consumption data according to an embodiment of the present application.
Detailed Description
In order to make the features and advantages of the present patent more comprehensible, embodiments accompanied with figures are described in detail below:
This embodiment uses an industrial electricity consumption data set from Fuzhou City, Fujian Province, to describe the technical solution of the present application clearly and completely. The detailed flow is shown in fig. 1, and a specific application example implementing the scheme is given below:
Step 1: Collect the electricity consumption time series data of each industry; the data comprise the Gregorian calendar date, the industry type, and the electricity consumption.
Step 2: the missing values are filled with electrical timing data. The data with the power consumption being empty and 0 is marked as a missing value. Wherein, the data of the electricity consumption of the electronic machine and the wood processing is lost in 7 th and 6 th of 2020, the data of the electricity consumption of the electronic machine and the wood processing in each monday of the year of 2020 and 6 th to 9 th are respectively averaged, and the missing values are respectively interpolated by the data. A power usage timing diagram is obtained as shown in fig. 2. The data of monday and the data of Saturday are extracted from the electricity consumption time series data, respectively, as shown in fig. 3 and 4. And then, abnormal values are removed by using a 3 sigma principle, namely, in each dimension of data, data points with the deviation exceeding 3 times of standard deviation from the average value are regarded as abnormal data, and the detected abnormal data are removed. Because the electricity consumption difference among the industries is large, in order to avoid the influence on the change point detection caused by the large order of magnitude difference of the electricity consumption time sequence data of the industries, the electricity consumption of the industries is standardized respectively, and the formula is as follows:
wherein x is i,t Represents the electricity consumption, mu, of the ith industry at the t-th time point i Represents the electricity consumption average value sigma of the ith industry i Represents standard deviation of electricity consumption of the ith industry, X i,t The power consumption is represented by the standardized power consumption, and the power consumption time sequence diagram of each industry week and the power consumption time sequence diagram of Saturday after data cleaning are shown in fig. 5 and 6;
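The cleaning operations of step 2 can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the applicant's implementation: the function names are hypothetical, the fill value is the mean of the observed readings passed in (the embodiment averages same-weekday readings over a chosen period), the population standard deviation is assumed, and outliers are simply dropped.

```python
import statistics

def fill_missing(values):
    """Replace missing readings (None or 0) with the mean of the observed ones."""
    observed = [v for v in values if v is not None and v != 0]
    fill = statistics.fmean(observed)
    return [fill if v is None or v == 0 else v for v in values]

def remove_outliers_3sigma(values):
    """Drop points lying more than 3 standard deviations from the mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population std: an assumption
    return [v for v in values if abs(v - mu) <= 3 * sigma]

def standardize(values):
    """Z-score standardization: (x - mean) / std, applied to one industry."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mu) / sigma for v in values]

# Hypothetical Monday readings for one industry, with one gap
mondays = [120.0, None, 118.0, 125.0, 122.0]
cleaned = standardize(remove_outliers_3sigma(fill_missing(mondays)))
```

Each industry is cleaned on its own series, so the standardized curves become comparable in scale regardless of how much electricity the industry consumes.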
step 3: the cleaned data is smoothed by using a Savitzky-Golay filter, and the Savitzky-Golay filter is a filtering method based on local polynomial least square fitting in a time domain. Assume that the original time series data is X i ={X i,1 ,X i,2 ,...X i,t ...,X i,T In X }, by i,t Taking X as origin i,t And constructing a window array containing 2m+1 sample points, and constructing a p-order polynomial to fit the data in the window, wherein the p-order polynomial is as follows:
wherein, m is less than or equal to n and less than or equal to m, and p is less than or equal to 2m+1; the loss function is defined as follows:
when the loss function takes the minimum value, the fitting effect of the original data reaches the optimum. The fitting value of the smoothed original data is obtained through the sliding window, so that the noise of the data can be effectively reduced, the trend of the smoothed power consumption data of the monday is shown in fig. 7, and the trend of the smoothed power consumption data of the Saturday is shown in fig. 8;
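The polynomial and the loss function are not legible in this text. The block below gives the textbook Savitzky-Golay formulation as a hedged reconstruction; the coefficient symbols $a_k$, the function name $f_t$, the loss symbol $E_t$, and the smoothed value $\hat{X}_{i,t}$ are assumptions introduced for illustration.

```latex
% Local polynomial of order p fitted on the window n = -m, ..., m around time t
f_t(n) = \sum_{k=0}^{p} a_k\, n^{k}, \qquad -m \le n \le m

% Least-squares loss over the 2m+1 window samples
E_t = \sum_{n=-m}^{m} \bigl( f_t(n) - X_{i,t+n} \bigr)^{2}

% Smoothed value at the window center once E_t is minimized over a_0, ..., a_p
\hat{X}_{i,t} = f_t(0) = a_0
```

Because the window slides one sample at a time, each smoothed point is the value at the center of its own locally fitted polynomial, which preserves peaks better than a plain moving average of the same width.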
The smoothed data are then decomposed using an additive model, expressed as follows:
$X_{i,t} = T_{i,t} + S_{i,t} + C_{i,t} + I_{i,t}$
where $T_i$ represents the long-term trend of the i-th industry, $S_i$ the seasonal component of the i-th industry, $C_i$ the cyclical component of the i-th industry, and $I_i$ the remaining residual term of the i-th industry. The smoothed electricity consumption data are decomposed into a trend part, a periodic part, and a residual part, yielding the residual sequence of the Monday electricity consumption data and the residual sequence of the Saturday electricity consumption data, as shown in fig. 9 and fig. 10.
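To make the residual extraction concrete, the following sketch performs a classical additive decomposition with a centered moving average. It is an illustration under stated assumptions rather than the decomposition used in the embodiment: the function name and the `period` parameter are hypothetical, only an odd period is supported, and the cyclical component $C_i$ is not estimated separately but absorbed into the moving-average trend.

```python
def additive_residual(x, period):
    """Residual of a classical additive decomposition x = trend + seasonal + residual.

    Edge points that lack a full moving-average window are returned as None.
    """
    if period % 2 == 0 or period < 3:
        raise ValueError("this sketch expects an odd period >= 3")
    n, half = len(x), period // 2
    # Trend: centered moving average over one full period
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = sum(x[t - half:t + half + 1]) / period
    detrended = [None if tr is None else xt - tr for xt, tr in zip(x, trend)]
    # Seasonal: mean detrended value at each position within the period
    seasonal = []
    for k in range(period):
        vals = [d for d in detrended[k::period] if d is not None]
        seasonal.append(sum(vals) / len(vals) if vals else 0.0)
    offset = sum(seasonal) / period  # center the seasonal component on zero
    seasonal = [v - offset for v in seasonal]
    # Residual: what remains after removing trend and seasonal parts
    return [None if d is None else d - seasonal[t % period]
            for t, d in enumerate(detrended)]

# A linear trend plus a period-3 pattern leaves no residual
pattern = [1.0, -1.0, 0.0]
series = [float(t) + pattern[t % 3] for t in range(12)]
residual = additive_residual(series, 3)
```

The residual sequence is the input to the change-point statistics, so any structure left in it is interpreted as a shift in level rather than as a recurring pattern.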
Step 4: the CUSUM statistic of each industrial power consumption residual error item at each time point is calculated, and the main formula is as follows:
where $i$ denotes the i-th industry, $s$ and $e$ the start and end points of the detection interval, $b$ the time point being tested, $n$ the total length of the detection interval, and $I_{i,t}$ the residual term of the smoothed data of the i-th industry at time $t$; the formula gives the CUSUM statistic of the i-th industry at time $b$ within the detection interval $[s, e]$.
Then, the mean and the maximum of the CUSUM statistics of the industry residual terms at each time point are calculated, giving the M test statistic and the T test statistic respectively. The formula is as follows:
where $p$ is the total number of data dimensions ($p = 9$ in this embodiment); the M test statistic is the mean of the CUSUM statistics at time $b$ within the detection interval $[s, e]$, and the T test statistic is the maximum of the CUSUM statistics at time $b$ within the detection interval $[s, e]$. The time series plot of the M and T test statistics for the Monday electricity consumption data is shown in fig. 11, and that for the Saturday electricity consumption data is shown in fig. 12.
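A small sketch may help show how the two statistics relate to the per-industry CUSUM values. It assumes the standard CUSUM contrast from the Isolate-Detect literature and takes absolute values before aggregating; both choices, the function names, and the 0-based inclusive indexing are assumptions, since the patent's formulas are not legible here.

```python
import math

def cusum(x, s, e, b):
    """CUSUM contrast of x[s..e] (0-based, inclusive) split after index b."""
    n = e - s + 1
    n_left, n_right = b - s + 1, e - b
    left, right = sum(x[s:b + 1]), sum(x[b + 1:e + 1])
    return (math.sqrt(n_right / (n * n_left)) * left
            - math.sqrt(n_left / (n * n_right)) * right)

def m_and_t_statistics(residuals, s, e, b):
    """Mean (M) and maximum (T) of the absolute CUSUM values across industries."""
    values = [abs(cusum(series, s, e, b)) for series in residuals]
    return sum(values) / len(values), max(values)

# Two hypothetical residual series: one with a level shift, one flat
residuals = [[0.0] * 4 + [1.0] * 4, [0.0] * 8]
m_stat, t_stat = m_and_t_statistics(residuals, 0, 7, 3)
```

In this toy case only one of the two series shifts, so the T statistic equals that series' CUSUM value while the M statistic is half of it, which illustrates why M reflects all dimensions and T reflects only the strongest one.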
Step 5: based on the M test statistic and the T test statistic, the variable points are detected by adopting an ID method, and the common variable points of the multi-dimensional power utilization data are output. The ID method mainly screens variable points through a threshold method, and the principle is as follows:
firstly, creating a section to be detected, and for a data sequence with a length of T, firstly setting a normal number lambda T Two sets of ordered k= [ T/λ were then created T ]Left and right extension sections of (a). The j-th right extension interval is R j =[1,min{jλ T ,T}]The j-th left extension section is L j =[max{1,T-jλ T +1},T]. In ordered set S RL ={R 1 ,L 1 ,R 2 ,L 2 ,...,R K ,L K These intervals are collected in }. Then ID identifies R 1 The point of greatest statistical magnitude is examined. Based on the obtained test statistic, setting a threshold value, wherein the calculation formula of the threshold value is as follows:
in zeta T For the resulting threshold value, σ is the standard deviation of the input data, C is a given parameter value, T is the length of the input data sequence, in this embodiment, the selection is madeT Saturday =111。
If the value of the test statistic exceeds the threshold, the point is considered a change point. If the threshold is not exceeded, continuing to detect S RL Is the next interval of (c). After detection, the ID algorithm starts a new round of detection from the end point (start point) of the extended section on the right (or left) where the detection occurs.
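The interval construction and the restart rule described above can be sketched as follows. This is a simplified illustration under stated assumptions: indices are 0-based, the CUSUM contrast is the standard one from the Isolate-Detect literature, absolute values are aggregated across industries, and the threshold is passed in as a number because its formula is not reproduced here. All function names are hypothetical.

```python
import math
import numpy as np

def cusum_rows(x, s, e):
    """Absolute CUSUM contrast of every row of x on [s, e] (0-based, inclusive).
    Column k corresponds to the candidate change point b = s + k."""
    seg = x[:, s:e + 1]
    n = seg.shape[1]
    left = np.arange(1, n)             # b - s + 1
    right = n - left                   # e - b
    csum = np.cumsum(seg, axis=1)
    left_sum = csum[:, :-1]
    right_sum = csum[:, -1:] - left_sum
    return np.abs(np.sqrt(right / (n * left)) * left_sum
                  - np.sqrt(left / (n * right)) * right_sum)

def expanding_intervals(s, e, lam):
    """Interleaved right- and left-expanding intervals R_1, L_1, R_2, L_2, ..."""
    k_max = math.ceil((e - s + 1) / lam)
    out = []
    for j in range(1, k_max + 1):
        out.append(("R", s, min(s + j * lam - 1, e)))
        out.append(("L", max(s, e - j * lam + 1), e))
    return out

def isolate_detect(x, lam, threshold, aggregate=np.mean):
    """Return sorted change points; aggregate=np.mean mimics the M statistic,
    aggregate=np.max mimics the T statistic."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    s, e = 0, x.shape[1] - 1
    found = []
    while e > s:
        hit = False
        for kind, a, b in expanding_intervals(s, e, lam):
            if b <= a:                 # a single point cannot be split
                continue
            stat = aggregate(cusum_rows(x, a, b), axis=0)
            k = int(np.argmax(stat))
            if stat[k] > threshold:
                found.append(a + k)
                # Restart from the end point of a right-expanding interval,
                # or from the start point of a left-expanding interval.
                if kind == "R":
                    s = b
                else:
                    e = a
                hit = True
                break
        if not hit:
            break
    return sorted(found)
```

For example, `expanding_intervals(1, 10, 4)` reproduces $R_1 = [1, 4]$, $L_1 = [7, 10]$, $R_2 = [1, 8]$, $L_2 = [3, 10]$, $R_3 = L_3 = [1, 10]$. Because the intervals grow in steps of $\lambda_T$, a change point is usually tested in an interval that contains no other change point, which is the isolation idea behind the method.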
Step 6: The two test statistics yield two change-point sets of the power consumption data, CP_MM and CP_MT. Each pair of elements, one from the set CP_MM and one from the set CP_MT, whose absolute difference is smaller than 3 is taken out, and the smaller of the two values is put into the set CP_M, which gives the final change-point set. As shown in FIG. 13, the final change-point detection result for the Monday power consumption data contains 11 change points in total. The same detection method is applied to the Saturday power consumption data to obtain its final change-point set CP_S; 10 change points are detected in total, and the final change-point detection result is shown in FIG. 14.
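The fusion rule of Step 6 can be sketched as follows. This is a minimal illustration; the function name `fuse_change_points` is hypothetical, and dropping points that have no close counterpart in the other set is one reading of the text.

```python
def fuse_change_points(cp_m, cp_t, tol=3):
    """Fuse two change-point sets.

    For every pair (a, b) with a from cp_m and b from cp_t whose absolute
    difference is smaller than tol, keep the smaller of the two values.
    Points without a close counterpart in the other set are dropped
    (an assumption about the intended behavior).
    """
    fused = set()
    for a in cp_m:
        for b in cp_t:
            if abs(a - b) < tol:
                fused.add(min(a, b))
    return sorted(fused)
```

With `cp_m = [10, 25, 40]` and `cp_t = [11, 30, 41]`, the pairs (10, 11) and (40, 41) are matched and the result is `[10, 40]`; 25 and 30 differ by 5 and are not kept.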
Step 7: Based on the detected change-point positions, the standardized time series data are segmented, the mean electricity consumption of each industry in each segment is calculated, and the typical electricity consumption scene of each time series segment is extracted. The mean electricity consumption of the i-th industry is calculated as follows:
where $\tau_k$ is the position of the k-th change point, and the consumption term is the standardized electricity consumption value of the i-th industry at the position of the (k+1)-th change point. The segment-mean curve of the Monday power consumption data is shown in FIG. 15 and that of the Saturday power consumption data is shown in FIG. 16; the mean of each industry within each time series sub-scene is marked in the figures.
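The segment-wise averaging of Step 7 can be sketched as follows. This is a minimal illustration, assuming 0-based indices, that each change point marks the last sample of a segment, and that no change point falls on the final sample; the function name `segment_means` is hypothetical.

```python
import numpy as np

def segment_means(x, change_points):
    """Mean standardized consumption of each industry within each segment.

    x: array-like of shape (p, T), one row per industry.
    change_points: 0-based indices in [0, T-2]; a change point tau closes
                   the segment that ends at tau (assumed convention).
    Returns an array of shape (p, number_of_segments).
    """
    x = np.asarray(x, dtype=float)
    # Segment boundaries as half-open ranges [start, stop).
    bounds = [0] + [cp + 1 for cp in sorted(change_points)] + [x.shape[1]]
    return np.column_stack([
        x[:, start:stop].mean(axis=1)
        for start, stop in zip(bounds[:-1], bounds[1:])
    ])
```

Each column of the result is one candidate typical electricity consumption scene: a vector holding the mean consumption level of every industry over that segment.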
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application and does not limit the application in any other form. Any person skilled in the art may use the technical content disclosed above to change or modify it into an equivalent embodiment. However, any simple modification, equivalent change, or adaptation made to the above embodiments according to the technical substance of the present application still falls within the protection scope of the technical solution of the present application.
The present application is not limited to the above-mentioned preferred embodiments, and any person can obtain other various methods for analyzing the typical mode of the enterprise electricity consumption based on multi-dimensional Isolate-Detect multi-point detection under the teaching of the present application, and all equivalent changes and modifications made according to the scope of the present application should be covered by the present application.
Claims (9)
1. An enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple point detection, characterized in that: first, the collected industry electricity consumption time series data are cleaned; seasonal decomposition is performed on the cleaned data to eliminate the periodic pattern of the data and extract the residual terms; multidimensional change-point detection based on multidimensional Isolate-Detect is performed on the residual terms obtained from the decomposition, with only a single change point detected in each divided time interval, so that change-point detection in the multidimensional electricity consumption curves is more accurate and computationally more efficient; the mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points; the detected change-point sets are fused across sequences to screen the candidate change points and determine the final change-point positions of the electricity consumption data; and finally, the standardized time series data are segmented based on the obtained change-point positions, and the typical electricity consumption behaviors are extracted.
2. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 1, characterized in that the method comprises the following steps:
step 1: collecting power consumption time sequence data of each industry;
step 2: filling missing values of the time sequence data of the electricity consumption of each industry, extracting the electricity consumption data of a certain day of a working day or the electricity consumption data of a certain day of a weekend, removing abnormal values by using a 3 sigma principle, and performing standardization processing;
step 3: smoothing the cleaned data, carrying out seasonal decomposition on the smoothed data, and eliminating periodicity and trending of the data to obtain residual items of the smoothed data;
step 4: calculating the CUSUM statistic of the residual term of each industry's electricity consumption data at each time point, and calculating the mean and the maximum of the CUSUM statistics at each time point, recorded as the M test statistic and the T test statistic, respectively;
step 5: detecting the common change points of all industry electricity consumption data by using the Isolate-Detect algorithm: presetting a threshold according to the threshold formula of the Isolate-Detect algorithm; judging each time point at which the M test statistic exceeds the threshold to be a change point and putting the obtained change points into a change-point set recorded as CP_M; judging each time point at which the T test statistic exceeds the threshold to be a change point and putting the obtained change points into a change-point set recorded as CP_T;
step 6: recording the change-point set obtained by the data analysis as CP, taking out each pair of elements from the set CP_M and the set CP_T whose absolute difference is smaller than 3, and putting the smaller value of each pair into the set CP to obtain the final change-point set;
step 7: segmenting the standardized electricity consumption time series data based on the obtained change-point positions, and averaging the electricity consumption of each industry over each time series segment to obtain the typical electricity consumption behavior of each segment.
3. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 2, characterized in that: in step 2, after the missing values in the industry electricity consumption data are filled, the electricity consumption data of one working day and of one weekend day are extracted separately for the subsequent data analysis, so that the differences between working-day and weekend industry electricity consumption behaviors can be analyzed; and abnormal values are removed by using the 3σ principle, so that they do not affect the subsequent electricity consumption behavior analysis.
4. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 3, characterized in that: in step 3, the cleaned data are smoothed by using a Savitzky-Golay filter, the fit to the original data being optimal when the loss function reaches its minimum; and the fitted values of the smoothed original data are obtained through a sliding window, so as to effectively reduce the noise of the data.
5. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 4, characterized in that:
in step 3, an additive model is used to decompose the smoothed electricity consumption data into a trend part, a periodic part and a residual part, so as to remove the periodicity and the trend of the data, expressed as follows:
$X_{i,t} = T_{i,t} + S_{i,t} + C_{i,t} + I_{i,t}$, (1)
where $T_{i,t}$ represents the long-term time trend of the i-th industry, $S_{i,t}$ represents the seasonal time trend of the i-th industry, $C_{i,t}$ represents the periodic time trend of the i-th industry, and $I_{i,t}$ represents the remaining residual term of the i-th industry.
6. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 5, characterized in that:
in step 4, the mean of the CUSUM statistics at each time point is calculated from the residual terms obtained after decomposition, according to the following formula:
where i denotes the i-th industry, s denotes the start point of the detection interval, e denotes the end point of the detection interval, b denotes the time point under test, n denotes the total length of the detection interval, $I_{i,t}$ is the residual term of the smoothed data of the i-th industry at time t, the per-industry quantity is the CUSUM statistic of the i-th industry at time b within the detection interval [s, e], p is the total number of data dimensions, and the mean of the CUSUM statistics at time b within the detection interval [s, e] is recorded as the M test statistic.
7. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 6, characterized in that:
in step 4, the maximum of the CUSUM statistics at each time point is calculated from the residual terms obtained after decomposition, according to the following formula:
where i denotes the i-th industry, s denotes the start point of the detection interval, e denotes the end point of the detection interval, b denotes the time point under test, n denotes the total length of the detection interval, $I_{i,t}$ is the residual term of the smoothed data of the i-th industry at time t, the per-industry quantity is the CUSUM statistic of the i-th industry at time b within the detection interval [s, e], p is the total number of data dimensions, and the result is the maximum of the CUSUM statistics at time b within the detection interval [s, e].
8. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 7, characterized in that:
in step 5, based on the calculated M test statistic and T test statistic, the common change points of the electricity consumption of each industry are detected by using the Isolate-Detect algorithm:
first, the intervals to be examined are created: for a data sequence of length $T$, a positive constant $\lambda_T$ is set, and two ordered sets of $K = \lceil T/\lambda_T \rceil$ right-expanding and left-expanding intervals are then created; the j-th right-expanding interval is $R_j = [1, \min\{j\lambda_T, T\}]$ and the j-th left-expanding interval is $L_j = [\max\{1, T - j\lambda_T + 1\}, T]$; these intervals are collected in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots, R_K, L_K\}$; the point in $R_1$ with the largest test statistic value is then identified and tested;
based on the obtained test statistic, a threshold is set, calculated as follows:
where $\sigma$ is the standard deviation of the input data and $C$ is a given parameter value; the threshold and the test statistic are compared to judge whether the time point is a change point: if the value of the test statistic exceeds the threshold, the corresponding point is regarded as a change point; if the threshold is not exceeded, detection continues with the next interval in $S_{RL}$.
9. The method for analyzing the typical pattern of enterprise electricity consumption based on multidimensional Isolate-Detect multiple point detection as claimed in claim 8, characterized in that:
in step 7, based on the detected change-point positions, the standardized time series data are segmented, the mean electricity consumption of each industry in each segment is calculated, and the typical electricity consumption scene of each time series segment is extracted, the mean electricity consumption of the i-th industry being calculated as follows:
where $\tau_k$ is the position of the k-th change point, and the consumption term is the standardized electricity consumption value of the i-th industry at the position of the (k+1)-th change point.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311029259.4A CN117034197A (en) | 2023-08-16 | 2023-08-16 | Enterprise power consumption typical mode analysis method based on multidimensional Isolate-detection multi-point detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311029259.4A CN117034197A (en) | 2023-08-16 | 2023-08-16 | Enterprise power consumption typical mode analysis method based on multidimensional Isolate-detection multi-point detection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117034197A (en) | 2023-11-10 |
Family
ID=88640856
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311029259.4A Pending CN117034197A (en) | 2023-08-16 | 2023-08-16 | Enterprise power consumption typical mode analysis method based on multidimensional Isolate-detection multi-point detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117034197A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117806220A (en) * | 2024-03-01 | 2024-04-02 | 连云港市新达电子技术有限公司 | Intelligent control system for power supply of petroleum logging instrument |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||