CN117034197A - Enterprise power consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection - Google Patents


Info

Publication number
CN117034197A
Authority
CN
China
Prior art keywords
data
point
detection
variable point
multidimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311029259.4A
Other languages
Chinese (zh)
Inventor
庄丹
林芊盈
张逸
马铁丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Normal University
Original Assignee
Fujian Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Normal University
Priority to CN202311029259.4A
Publication of CN117034197A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/251 Fusion techniques of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G06F18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2123/00 Data types
    • G06F2123/02 Data types in the time domain, e.g. time-series data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Probability & Statistics with Applications (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection. The collected industrial electricity consumption time series data are first cleaned. The cleaned data are then seasonally decomposed to remove their periodic pattern, and the residual terms are extracted. Multidimensional change-point detection based on multidimensional Isolate-Detect is applied to the residual terms; because only a single change point is detected in each isolated time interval, change-point detection in multidimensional electricity consumption curves is more accurate and computationally more efficient. The mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points, which are screened by multi-sequence fusion to determine the final change-point positions of the electricity consumption data. Finally, the standardized time series data are segmented at the obtained change-point positions, and the typical electricity consumption behavior of each segment is extracted.

Description

Enterprise power consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection
Technical Field
The application relates to the technical field of power data analysis, and in particular to a method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection.
Background
With the continuing informatization of the power grid and the rapid development of digital and intelligent technologies, the power industry has entered the era of power big data. As the energy system on which economic development and daily life depend, the power system generates data that are huge in volume, fast-growing, and rich in type while it operates. Analyzing electricity usage behavior can reveal rules, relationships, and trends in power big data and thereby characterize electricity usage. Detecting the change points common to the industries and extracting typical electricity usage behaviors therefore allows power consumption, usage characteristics, and usage behavior to be analyzed and understood better, which provides strong support for power supply and management.
However, industrial electricity consumption data involve large consumption volumes and complex usage patterns that differ across industries, and the data exhibit periodicity and seasonality, all of which make the analysis of electricity usage behavior challenging. Today's power data are high-dimensional, diverse in type, and large in volume, which places higher demands on data analysis techniques.
Disclosure of Invention
Therefore, in view of the gaps and deficiencies in the prior art, the application provides a method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection. The method remedies the shortcoming that traditional electricity usage behavior analysis models do not consider temporal order and stage changes: it establishes a multidimensional Isolate-Detect model for industrial electricity consumption time series data, detects the change points common to the industries, and extracts typical electricity usage behaviors. One weekday and one weekend day are extracted from the historical electricity data in order to detect the difference in electricity usage behavior between weekdays and weekends; change-point detection is carried out with a multidimensional Isolate-Detect algorithm, which improves detection accuracy; and finally, based on the detected change-point positions, typical electricity usage behaviors are extracted from the electricity consumption time series segment by segment.
The method mainly comprises data collection, data cleaning, residual term extraction, change-point detection, change-point screening, and typical electricity usage behavior extraction. First, the collected industrial electricity consumption time series data are cleaned. The cleaned data are seasonally decomposed to remove their periodic pattern, and the residual terms are extracted. Multidimensional change-point detection based on multidimensional Isolate-Detect is applied to the residual terms; because only a single change point is detected in each isolated time interval, change-point detection in multidimensional electricity consumption curves is more accurate and computationally more efficient. The mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points, which are screened by multi-sequence fusion to determine the final change-point positions of the electricity consumption data. Finally, the standardized time series data are segmented at the obtained change-point positions, and the typical electricity usage behavior of each segment is extracted. The multidimensional Isolate-Detect method provided by the application takes into account the correlation among the sequences of the multidimensional data, can effectively detect change points common to the multidimensional data, and improves the accuracy of multidimensional change-point detection. A more realistic and objective time series electricity consumption scenario is thereby obtained, so that a more reasonable electricity strategy can be formulated in subsequent planning.
The technical scheme is as follows:
A method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection, characterized in that: the collected industrial electricity consumption time series data are first cleaned; the cleaned data are seasonally decomposed to remove their periodic pattern, and the residual terms are extracted; multidimensional change-point detection based on multidimensional Isolate-Detect is applied to the residual terms, with only a single change point detected in each isolated time interval, so that change-point detection in multidimensional electricity consumption curves is more accurate and computationally more efficient; the mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points; the detected change-point sets are screened by multi-sequence fusion to determine the final change-point positions of the electricity consumption data; and finally, the standardized time series data are segmented at the obtained change-point positions, and the typical electricity usage behavior of each segment is extracted.
Further, the method comprises the following steps:
Step 1: collect the electricity consumption time series data of each industry;
Step 2: fill the missing values in the electricity consumption time series data of each industry, extract the electricity consumption data for one weekday or for one weekend day, remove outliers using the 3σ principle, and standardize the data;
Step 3: smooth the cleaned data and seasonally decompose the smoothed data to remove periodicity and trend, obtaining the residual terms of the smoothed data;
Step 4: compute the CUSUM statistic of the residual term of each industry's electricity consumption data at each time point, and compute the mean and the maximum of the CUSUM statistics at each time point, denoted the M test statistic and the T test statistic, respectively;
Step 5: detect the change points common to the industries' electricity consumption data using the Isolate-Detect algorithm; preset a threshold according to the threshold formula of the Isolate-Detect algorithm; regard each time point at which the M test statistic exceeds the threshold as a change point and place it in the change-point set CP_M; regard each time point at which the T test statistic exceeds the threshold as a change point and place it in the change-point set CP_T;
Step 6: denote the change-point set obtained from the data analysis as CP; for each pair of elements, one from CP_M and one from CP_T, whose absolute difference is smaller than 3, place the smaller value in CP to obtain the final change-point set;
Step 7: segment the standardized electricity consumption time series data at the obtained change-point positions, and average the industrial electricity consumption within each segment to obtain the typical electricity usage behavior of each segment.
Further, in step 2, after the missing values in the industrial electricity consumption data are filled, the data for one weekday and the data for one weekend day are extracted separately for subsequent analysis, so that the differences in industrial electricity usage behavior between weekdays and weekends can be analyzed; outliers are removed using the 3σ principle so that they do not affect the subsequent analysis of electricity usage behavior.
Further, in step 3, the cleaned data are smoothed using a Savitzky-Golay filter; the fit to the original data is optimal when the loss function attains its minimum. The fitted values of the smoothed data are obtained through a sliding window, which effectively reduces the noise in the data.
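The polynomial and loss function behind this smoothing step are not reproduced in this text. As a sketch only, the standard Savitzky-Golay least-squares formulation for a window of $2m+1$ points around $X_{i,t}$ and a polynomial of order $p$ can be written as below; the coefficient symbols $a_k$, the polynomial name $f_t$, and the loss symbol $E$ are notational assumptions rather than the application's own notation.

```latex
f_t(n) = \sum_{k=0}^{p} a_k\, n^{k}, \qquad -m \le n \le m,
\qquad
E = \sum_{n=-m}^{m} \bigl( f_t(n) - X_{i,t+n} \bigr)^{2}
```

Under this formulation, once the coefficients minimize $E$, the smoothed value at time $t$ is $f_t(0) = a_0$, and sliding the window along the series yields the whole smoothed sequence.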
Further, in step 3, the smoothed data are decomposed with an additive model into a trend part, a periodic part, and a residual term part, so that the periodicity and trend of the data are removed. The model is expressed as follows:
$$X_{i,t} = T_{i,t} + S_{i,t} + C_{i,t} + I_{i,t}, \tag{7}$$
where $T_i$ represents the long-term trend of the $i$-th industry, $S_i$ represents the seasonal component of the $i$-th industry, $C_i$ represents the cyclical component of the $i$-th industry, and $I_i$ represents the remaining residual term of the $i$-th industry.
Further, in step 4, the average value of the CUSUM statistics at each time point is calculated by using the data residual terms obtained after decomposition, and the formula is as follows:
where $i$ refers to the $i$-th industry, $s$ refers to the start point of the detection interval, $e$ refers to the end point of the detection interval, $b$ refers to the time point being examined, $n$ refers to the total length of the detection interval, $I_{i,t}$ refers to the residual term of the smoothed data of the $i$-th industry at time $t$, $\tilde{I}^{\,b}_{i,s,e}$ refers to the CUSUM statistic of the $i$-th industry at time $b$ within the detection interval $[s, e]$, $p$ is the total number of data dimensions, and $M^{\,b}_{s,e}$ refers to the mean of the CUSUM statistics at time $b$ within the detection interval $[s, e]$, recorded as the M test statistic.
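The formula itself is not reproduced in this text. A reconstruction consistent with the definitions above and with the usual Isolate-Detect CUSUM contrast would read as follows; the symbols $\tilde{I}^{\,b}_{i,s,e}$ and $M^{\,b}_{s,e}$, the relation $n = e - s + 1$, and the use of absolute values inside the mean are assumptions.

```latex
\tilde{I}^{\,b}_{i,s,e}
  = \sqrt{\frac{e-b}{n\,(b-s+1)}} \sum_{t=s}^{b} I_{i,t}
  - \sqrt{\frac{b-s+1}{n\,(e-b)}} \sum_{t=b+1}^{e} I_{i,t},
  \qquad n = e - s + 1,

M^{\,b}_{s,e} = \frac{1}{p} \sum_{i=1}^{p} \bigl| \tilde{I}^{\,b}_{i,s,e} \bigr|
```

Averaging over all $p$ industries means that a shift shared by many series raises the statistic even when no single series shifts strongly.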
Further, in step 4, the maximum value of the CUSUM statistic at each time point is calculated by using the data residual term obtained after decomposition, and the formula is as follows:
where $i$ refers to the $i$-th industry, $s$ refers to the start point of the detection interval, $e$ refers to the end point of the detection interval, $b$ refers to the time point being examined, $n$ refers to the total length of the detection interval, $I_{i,t}$ refers to the residual term of the smoothed data of the $i$-th industry at time $t$, $\tilde{I}^{\,b}_{i,s,e}$ refers to the CUSUM statistic of the $i$-th industry at time $b$ within the detection interval $[s, e]$, $p$ is the total number of data dimensions, and $T^{\,b}_{s,e}$ refers to the maximum of the CUSUM statistics at time $b$ within the detection interval $[s, e]$.
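The maximum statistic is likewise not reproduced in this text. Under the same assumptions (per-industry CUSUM statistic $\tilde{I}^{\,b}_{i,s,e}$ on the interval $[s, e]$, absolute values taken before aggregating), a plausible form is:

```latex
T^{\,b}_{s,e} = \max_{1 \le i \le p} \bigl| \tilde{I}^{\,b}_{i,s,e} \bigr|
```

Note that $T^{\,b}_{s,e}$ here names the statistic, while a plain $T$ elsewhere in the document denotes the length of the data sequence.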
Further, in step 5, based on the calculated M test statistic and T test statistic, the common variation point of the electricity consumption of each industry is detected by using the Isolate-Detect algorithm:
First, the intervals to be examined are created. For a data sequence of length $T$, a positive constant $\lambda_T$ is set, and two ordered sets of $K = \lceil T/\lambda_T \rceil$ right-expanding and left-expanding intervals are created. The $j$-th right-expanding interval is $R_j = [1, \min\{j\lambda_T, T\}]$ and the $j$-th left-expanding interval is $L_j = [\max\{1, T - j\lambda_T + 1\}, T]$. These intervals are collected in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots, R_K, L_K\}$. The point in $R_1$ with the largest test statistic value is then identified;
based on the obtained test statistic, setting a threshold value, wherein the calculation formula of the threshold value is as follows:
where $\sigma$ is the standard deviation of the input data and $C$ is a given parameter value. The test statistic is compared with the threshold to decide whether the time point is a change point: if the value of the test statistic exceeds the threshold, the corresponding point is regarded as a change point; if it does not, detection continues with the next interval of $S_{RL}$.
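The threshold formula is not reproduced in this text. In the Isolate-Detect literature the threshold is commonly taken proportional to $\sqrt{2\log T}$; combined with the parameters named above, an assumed form would be the following, where $\zeta_T$ is the threshold and $T$ is the length of the input sequence.

```latex
\zeta_T = C\,\sigma\,\sqrt{2 \log T}
```

Whether $\sigma$ multiplies the threshold, as written here, or is instead used to rescale the data beforehand is not recoverable from the text; the two choices are equivalent for a single series.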
Further, in step 7, based on the detected variable point position, the normalized time series data is segmented, the average value of the power consumption of each industry in each segment is calculated, and the typical power consumption scene of each segment of time series data is extracted, wherein the power consumption average value calculation formula of the ith industry is as follows:
where $\tau_k$ refers to the position of the $k$-th change point, and $X_{i,\tau_{k+1}}$ refers to the standardized power consumption value of the $i$-th industry at the position of the $(k+1)$-th change point.
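The segment-mean formula is not reproduced in this text. A form consistent with the definitions above, averaging the standardized values between consecutive change points, would be the following; the symbol $\bar{X}^{(k)}_{i}$ and the boundary convention ($\tau_0 = 0$, with the last segment ending at the sequence length) are assumptions.

```latex
\bar{X}^{(k)}_{i}
  = \frac{1}{\tau_{k+1} - \tau_{k}} \sum_{t=\tau_{k}+1}^{\tau_{k+1}} X_{i,t}
  = \frac{X_{i,\tau_{k}+1} + X_{i,\tau_{k}+2} + \cdots + X_{i,\tau_{k+1}}}{\tau_{k+1} - \tau_{k}}
```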
Compared with the prior art, the application and its preferred schemes have at least the following beneficial effects:
1. Considering the difference in electricity usage behavior between weekdays and weekends, the data for one weekday and the data for one weekend day are extracted for subsequent analysis, so that the different electricity usage characteristics of weekdays and weekends are distinguished;
2. The electricity consumption time series data are decomposed; because periodicity may make the detection performance of the model unstable, modeling uses the residual sequence with the trend and periodic terms removed, which improves the accuracy of subsequent modeling;
3. The CUSUM mean statistic is computed, so that the influence of all dimensions on change-point detection is taken into account;
4. The CUSUM maximum statistic is computed; it uses only the sequence with the largest CUSUM statistic value and is not easily affected by outliers;
5. Compared with traditional change-point detection methods, the Isolate-Detect (ID) method detects only a single change point at a time within an isolated interval, so its computational efficiency is higher;
6. Because the M test statistic is more easily affected by outliers and the T test statistic is less stable, selecting the change points detected by both test statistics improves the accuracy of change-point detection;
7. The standardized time series data are segmented at the final change-point positions, and the mean of the industrial electricity consumption data within each segment is computed, so that the typical electricity usage behavior of each segment can be extracted accurately and efficiently.
Drawings
The application is described in further detail below with reference to the attached drawings and detailed description:
FIG. 1 is a flow chart of the method for analyzing typical enterprise electricity consumption patterns based on multidimensional Isolate-Detect multiple change-point detection according to an embodiment of the present application;
FIG. 2 is a time series plot of the original data after the missing values are filled according to an embodiment of the present application;
FIG. 3 is a time series plot of the Monday power consumption data according to an embodiment of the present application;
FIG. 4 is a time series plot of the Saturday power consumption data according to an embodiment of the present application;
FIG. 5 is a time series plot of each industry's Monday power consumption data after data cleaning according to an embodiment of the present application;
FIG. 6 is a time series plot of each industry's Saturday power consumption data after data cleaning according to an embodiment of the present application;
FIG. 7 is a plot of each industry's Monday power consumption data smoothed with a Savitzky-Golay filter according to an embodiment of the present application;
FIG. 8 is a plot of each industry's Saturday power consumption data smoothed with a Savitzky-Golay filter according to an embodiment of the present application;
FIG. 9 is a plot of the residual sequence of each industry's Monday power consumption data after applying the additive model according to an embodiment of the present application;
FIG. 10 is a plot of the residual sequence of each industry's Saturday power consumption data after applying the additive model according to an embodiment of the present application;
FIG. 11 is a time series plot of the M test statistic and the T test statistic for the Monday power consumption data according to an embodiment of the present application;
FIG. 12 is a time series plot of the M test statistic and the T test statistic for the Saturday power consumption data according to an embodiment of the present application;
FIG. 13 is a plot of the final change points of the Monday power consumption detected using the multidimensional Isolate-Detect algorithm according to an embodiment of the present application;
FIG. 14 is a plot of the final change points of the Saturday power consumption detected using the multidimensional Isolate-Detect algorithm according to an embodiment of the present application;
FIG. 15 is a plot of the segment means of the Monday power consumption data according to an embodiment of the present application;
FIG. 16 is a plot of the segment means of the Saturday power consumption data according to an embodiment of the present application.
Detailed Description
In order to make the features and advantages of the present patent more comprehensible, embodiments are described in detail below with reference to the accompanying figures:
The present embodiment uses an industrial electricity consumption data set from Fuzhou, Fujian Province, to describe the technical solution of the present application clearly and completely. The detailed flow of the application is shown in FIG. 1, and a specific application example implementing the scheme is given below:
Step 1: Collect the electricity consumption time series data of each industry; the data include the Gregorian calendar date, the industry type, and the power consumption.
Step 2: Fill the missing values in the electricity consumption time series data. Records whose power consumption is empty or 0 are marked as missing values. The power consumption data of the electronic machinery and wood processing industries are missing for July 6, 2020; for each of these two industries, the Monday power consumption from June to September 2020 is averaged, and the missing value is imputed with that average. The resulting power consumption time series plot is shown in FIG. 2. The Monday data and the Saturday data are then extracted from the electricity consumption time series, as shown in FIGS. 3 and 4. Next, outliers are removed using the 3σ principle: in each dimension of the data, data points that deviate from the mean by more than 3 standard deviations are regarded as abnormal, and the detected abnormal data are removed. Because electricity consumption differs greatly among industries, the power consumption of each industry is standardized separately so that order-of-magnitude differences between the industries' time series do not affect change-point detection. The formula is as follows:
$$X_{i,t} = \frac{x_{i,t} - \mu_i}{\sigma_i}$$
where $x_{i,t}$ represents the power consumption of the $i$-th industry at the $t$-th time point, $\mu_i$ represents the mean power consumption of the $i$-th industry, $\sigma_i$ represents the standard deviation of the power consumption of the $i$-th industry, and $X_{i,t}$ represents the standardized power consumption. The time series plots of each industry's Monday and Saturday power consumption after data cleaning are shown in FIGS. 5 and 6;
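The cleaning in step 2 can be sketched as follows. This is a minimal illustration rather than the applicants' implementation: the function names are hypothetical, the population standard deviation is assumed, and outliers are simply dropped as the text describes.

```python
from statistics import mean, pstdev


def remove_outliers_3sigma(values):
    """Drop points that lie more than 3 standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]


def standardize(values):
    """Z-score one industry's series: X = (x - mu) / sigma."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# One industry's series with a single gross outlier.
raw = [10.0] * 20 + [1000.0]
cleaned = remove_outliers_3sigma(raw)
scaled = standardize([1.0, 2.0, 3.0, 4.0, 5.0])
```

Dropping points shortens a series, so in a multidimensional setting the removed points would presumably need to be re-imputed to keep the industries aligned in time; the text does not say how this is handled.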
step 3: the cleaned data is smoothed by using a Savitzky-Golay filter, and the Savitzky-Golay filter is a filtering method based on local polynomial least square fitting in a time domain. Assume that the original time series data is X i ={X i,1 ,X i,2 ,...X i,t ...,X i,T In X }, by i,t Taking X as origin i,t And constructing a window array containing 2m+1 sample points, and constructing a p-order polynomial to fit the data in the window, wherein the p-order polynomial is as follows:
wherein, m is less than or equal to n and less than or equal to m, and p is less than or equal to 2m+1; the loss function is defined as follows:
When the loss function attains its minimum, the fit to the original data is optimal. The fitted values of the smoothed data are obtained through the sliding window, which effectively reduces the noise in the data. The smoothed Monday power consumption data are shown in FIG. 7, and the smoothed Saturday power consumption data are shown in FIG. 8;
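As an illustration of the smoothing step, the sketch below applies a Savitzky-Golay filter with the well-known fixed weights for a 5-point window and a quadratic fit. The window half-width m = 2 and order p = 2 are assumptions chosen for brevity, since the text does not state the values used; the two points at each end are left unsmoothed.

```python
# Savitzky-Golay smoothing weights for a 5-point window (m = 2), quadratic fit (p = 2).
SG_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth_5_2(x):
    """Smooth x with the fixed 5-point quadratic Savitzky-Golay weights.

    Boundary points that lack a full window are copied unchanged.
    """
    m = 2
    out = list(x)
    for t in range(m, len(x) - m):
        out[t] = sum(w * x[t + n] for w, n in zip(SG_WEIGHTS, range(-m, m + 1)))
    return out


# A quadratic signal passes through unchanged; an isolated spike is damped.
quadratic = [float(t * t) for t in range(10)]
smoothed_quadratic = savgol_smooth_5_2(quadratic)
spike = [0.0, 0.0, 0.0, 35.0, 0.0, 0.0, 0.0]
smoothed_spike = savgol_smooth_5_2(spike)
```

Because the fit is quadratic, any signal that is locally a polynomial of degree at most 2 is preserved, while isolated noise is attenuated.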
The smoothed data are then decomposed using an additive model, which is expressed as follows:
$$X_{i,t} = T_{i,t} + S_{i,t} + C_{i,t} + I_{i,t}$$
where $T_i$ represents the long-term trend of the $i$-th industry, $S_i$ represents the seasonal component of the $i$-th industry, $C_i$ represents the cyclical component of the $i$-th industry, and $I_i$ represents the remaining residual term of the $i$-th industry. The smoothed electricity consumption data are decomposed into a trend part, a periodic part, and a residual term part, yielding the residual sequence of the Monday power consumption data and the residual sequence of the Saturday power consumption data, as shown in FIGS. 9 and 10.
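The residual extraction can be sketched with a classical additive decomposition. This is an assumption-laden simplification: the seasonal and cyclical terms are merged into one periodic component, the trend is a centered moving average, and the `period` and `trend_window` values are hypothetical because the text does not give them.

```python
def moving_average(x, window):
    """Centered moving average with an odd window; edges use a truncated window."""
    half = window // 2
    out = []
    for t in range(len(x)):
        lo, hi = max(0, t - half), min(len(x), t + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def additive_decompose(x, period, trend_window):
    """Split x into trend + periodic + residual (requires len(x) >= period)."""
    trend = moving_average(x, trend_window)
    detrended = [xi - ti for xi, ti in zip(x, trend)]
    # Average the detrended values at each phase of the period.
    phase_means = []
    for phase in range(period):
        vals = detrended[phase::period]
        phase_means.append(sum(vals) / len(vals))
    centre = sum(phase_means) / period  # make the periodic part sum to zero
    periodic = [phase_means[t % period] - centre for t in range(len(x))]
    residual = [xi - ti - si for xi, ti, si in zip(x, trend, periodic)]
    return trend, periodic, residual


series = [float(t % 4) + 0.1 * t for t in range(24)]
trend, periodic, residual = additive_decompose(series, period=4, trend_window=5)
```

Only the residual list would be passed on to the change-point detection stage.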
Step 4: the CUSUM statistic of each industrial power consumption residual error item at each time point is calculated, and the main formula is as follows:
wherein I denotes the ith industry, s denotes the start point of the detection section, e denotes the end point of the detection section, b denotes the time point of detection, n denotes the total length of the detection section, I i,t Refers to the residual term of the data smoothed at the ith industry t moment,refers to the ith industry in the detection section s, e]CUSUM statistics at time b in.
Then the mean and the maximum of the CUSUM statistics of the industries' residual terms are computed at each time point, giving the M test statistic and the T test statistic. The formulas are as follows:
where $p$ refers to the total number of data dimensions ($p = 9$ in this embodiment), $M^{\,b}_{s,e}$ refers to the mean of the CUSUM statistics at time $b$ within the detection interval $[s, e]$, and $T^{\,b}_{s,e}$ refers to the maximum of the CUSUM statistics at time $b$ within the detection interval $[s, e]$. The resulting time series plot of the M and T test statistics for the Monday power consumption data is shown in FIG. 11, and that for the Saturday power consumption data is shown in FIG. 12.
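A minimal sketch of step 4 follows. It assumes the usual Isolate-Detect CUSUM contrast with n = e - s + 1 and takes absolute values before aggregating across industries; both choices, and the function names, are assumptions because the formulas are not reproduced in this text. Indices are 1-based to match the notation.

```python
from math import sqrt


def cusum(x, s, e, b):
    """CUSUM contrast of x on the 1-based interval [s, e], split after b (s <= b < e)."""
    n = e - s + 1
    left = sum(x[s - 1:b])   # t = s, ..., b
    right = sum(x[b:e])      # t = b + 1, ..., e
    return (sqrt((e - b) / (n * (b - s + 1))) * left
            - sqrt((b - s + 1) / (n * (e - b))) * right)


def m_and_t_statistics(residuals, s, e, b):
    """Return (M, T): mean and maximum of |CUSUM| across all industries at time b."""
    stats = [abs(cusum(x, s, e, b)) for x in residuals]
    return sum(stats) / len(stats), max(stats)


# Two industries: one with a level shift after t = 4, one flat.
residuals = [
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
    [0.0] * 8,
]
m_stat, t_stat = m_and_t_statistics(residuals, s=1, e=8, b=4)
```

In this toy case only one of the two series shifts, so the M statistic is half the T statistic, which illustrates why the two statistics can flag different candidate points.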
Step 5: Based on the M test statistic and the T test statistic, change points are detected using the Isolate-Detect (ID) method, and the change points common to the multidimensional electricity consumption data are output. The ID method screens change points mainly by thresholding, on the following principle:
First, the intervals to be examined are created. For a data sequence of length $T$, a positive constant $\lambda_T$ is set, and two ordered sets of $K = \lceil T/\lambda_T \rceil$ right-expanding and left-expanding intervals are created. The $j$-th right-expanding interval is $R_j = [1, \min\{j\lambda_T, T\}]$ and the $j$-th left-expanding interval is $L_j = [\max\{1, T - j\lambda_T + 1\}, T]$. These intervals are collected in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots, R_K, L_K\}$. ID then identifies the point in $R_1$ with the largest test statistic value. A threshold is set for the resulting test statistic, computed as follows:
where $\zeta_T$ is the resulting threshold, $\sigma$ is the standard deviation of the input data, $C$ is a given parameter value, and $T$ is the length of the input data sequence; in this embodiment, the selected values include $T_{\text{Saturday}} = 111$.
If the value of the test statistic exceeds the threshold, the point is regarded as a change point. If it does not, detection continues with the next interval of $S_{RL}$. After a detection, the ID algorithm starts a new round of detection from the end point (or start point) of the right-expanding (or left-expanding) interval in which the detection occurred.
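The interval scheme and restart rule described in step 5 can be sketched as below. This is an illustrative reading, not the applicants' code: the threshold form `C * sigma * sqrt(2 * log T)`, the use of the maximum absolute CUSUM as the test statistic, and all function names are assumptions. Change points are reported as 1-based positions of the last point before a shift.

```python
from math import log, sqrt


def cusum(x, s, e, b):
    """CUSUM contrast of x on the 1-based interval [s, e], split after b (s <= b < e)."""
    n = e - s + 1
    left = sum(x[s - 1:b])
    right = sum(x[b:e])
    return (sqrt((e - b) / (n * (b - s + 1))) * left
            - sqrt((b - s + 1) / (n * (e - b))) * right)


def t_statistic(series, s, e, b):
    """Maximum absolute CUSUM over all series (dimensions)."""
    return max(abs(cusum(x, s, e, b)) for x in series)


def id_threshold(c, sigma, length):
    """Assumed threshold form: C * sigma * sqrt(2 * log T)."""
    return c * sigma * sqrt(2 * log(length))


def isolate_detect(series, lam, threshold, stat=t_statistic):
    """Return sorted change points found by alternating right/left expanding intervals."""
    found = []
    pending = [(1, len(series[0]))]
    while pending:
        s, e = pending.pop()
        if e - s < 1:
            continue
        k = -(-(e - s + 1) // lam)  # ceil((e - s + 1) / lam)
        detected = False
        for j in range(1, k + 1):
            right_iv = (s, min(s + j * lam - 1, e), "R")
            left_iv = (max(s, e - j * lam + 1), e, "L")
            for a, c, side in (right_iv, left_iv):
                if c - a < 1:
                    continue
                best = max(range(a, c), key=lambda b: stat(series, a, c, b))
                if stat(series, a, c, best) > threshold:
                    found.append(best)
                    # Restart from the end (start) point of the right (left) interval.
                    pending.append((c, e) if side == "R" else (s, a))
                    detected = True
                    break
            if detected:
                break
    return sorted(found)


one_shift = [0.0] * 50 + [5.0] * 50
two_shifts = [0.0] * 40 + [5.0] * 40 + [0.0] * 40
cps_one = isolate_detect([one_shift, [0.0] * 100], lam=10, threshold=3.0)
cps_two = isolate_detect([two_shifts], lam=10, threshold=3.0)
```

Passing a different `stat` function (for example a mean of absolute CUSUM values) would give the M-based detection under the same interval scheme.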
Step 6: The two test statistics yield two change-point sets, CP_MM and CP_MT, for the Monday power consumption data. For each pair of elements, one from CP_MM and one from CP_MT, whose absolute difference is smaller than 3, the smaller value is placed in the set CP_M, giving the final change-point set. FIG. 13 shows the final change-point detection result for the Monday power consumption data; 11 change points are detected in total. The same detection method is applied to the Saturday power consumption data to obtain its final change-point set CP_S; 10 change points are detected in total, and the final detection result is shown in FIG. 14.
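The fusion rule of step 6 is simple enough to state directly in code. The sketch below follows the rule as described (absolute difference smaller than 3, keep the smaller value); the function name and the example positions are hypothetical.

```python
def fuse_change_points(cp_from_m, cp_from_t, tolerance=3):
    """Keep the smaller of each pair of candidates that differ by less than `tolerance`."""
    fused = set()
    for a in cp_from_m:
        for b in cp_from_t:
            if abs(a - b) < tolerance:
                fused.add(min(a, b))
    return sorted(fused)


# Made-up candidate positions from the M-based and T-based detections.
final_cps = fuse_change_points([10, 40, 77], [11, 52, 79])
```

Candidates flagged by only one statistic (40 and 52 in this made-up example) are discarded, which is how the fusion guards against the outlier sensitivity of M and the instability of T.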
Step 7: Based on the detected change-point positions, the standardized time series data are segmented, the mean industrial electricity consumption within each segment is computed, and the typical electricity usage scenario of each segment is extracted. The mean power consumption of the $i$-th industry is computed as follows:
where $\tau_k$ refers to the position of the $k$-th change point, and $X_{i,\tau_{k+1}}$ refers to the standardized power consumption value of the $i$-th industry at the position of the $(k+1)$-th change point. The segment-mean curve of the Monday power consumption data is shown in FIG. 15 and that of the Saturday power consumption data is shown in FIG. 16; the mean of each industry within each time series sub-scenario is marked in the figures.
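Step 7 can be sketched as follows, assuming that each change point marks the last time point of its segment (1-based) and that the first segment starts at the beginning of the series; these conventions and the function name are assumptions.

```python
def segment_means(x, change_points):
    """Mean of x within each segment delimited by 1-based change points.

    Segment k covers time points tau_k + 1, ..., tau_{k+1}, with tau_0 = 0
    and the final segment ending at len(x).
    """
    bounds = [0] + sorted(change_points) + [len(x)]
    return [sum(x[lo:hi]) / (hi - lo)
            for lo, hi in zip(bounds, bounds[1:]) if hi > lo]


# One industry's standardized series with change points after t = 3 and t = 7.
standardized = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0, 2.0, 2.0]
typical_levels = segment_means(standardized, [3, 7])
```

Applying the function to every industry with the same change-point set gives one mean per industry per segment, which is the kind of value marked in the segment-mean figures.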
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application and is not intended to limit the application in any way. Any person skilled in the art may modify or alter the disclosed technical content to obtain equivalent embodiments. However, any simple modification, equivalent change, or variation made to the above embodiments according to the technical substance of the present application still falls within the protection scope of the technical solution of the present application.
The present application is not limited to the above preferred embodiments. Under the teaching of the present application, anyone can derive various other forms of the enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection, and all equivalent changes and modifications made within the scope of the present application shall be covered by the present application.

Claims (9)

1. An enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection, characterized in that: first, the collected industry electricity time-series data are cleaned; seasonal decomposition is performed on the cleaned data to eliminate its periodic pattern and extract its residual terms; multidimensional change-point detection based on multidimensional Isolate-Detect is performed on the residual terms obtained from the decomposition, with only a single change point detected in each divided time interval, so that change-point detection in multidimensional electricity curves is more accurate and computationally more efficient; the mean test statistic and the maximum test statistic are combined to obtain sets of candidate change points; the candidate change-point sets are screened by multi-sequence fusion to determine the final change-point positions of the electricity data; and finally, the standardized time-series data are segmented based on the obtained change-point positions, and the typical electricity consumption behaviors are extracted.
2. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 1, characterized in that:
the method comprises the following steps:
step 1: collecting the electricity consumption time-series data of each industry;
step 2: filling the missing values in the electricity consumption time-series data of each industry, extracting the electricity data of one working day or of one weekend day, removing outliers using the 3σ rule, and standardizing the data;
step 3: smoothing the cleaned data, performing seasonal decomposition on the smoothed data, and removing the periodicity and trend of the data to obtain the residual terms of the smoothed data;
step 4: calculating the CUSUM statistic of the residual term of each industry's electricity data at each time point, and calculating the mean and the maximum of the CUSUM statistics at each time point, recorded as the M test statistic and the T test statistic, respectively;
step 5: detecting the change points common to all industries' electricity data using the Isolate-Detect algorithm: a threshold is preset according to the threshold formula of the Isolate-Detect algorithm; each time point at which the M test statistic exceeds the threshold is judged to be a change point and put into a change-point set, denoted CP_M; each time point at which the T test statistic exceeds the threshold is judged to be a change point and put into a change-point set, denoted CP_T;
step 6: denoting the change-point set obtained from the data analysis as CP; for every pair of elements, one from CP_M and one from CP_T, whose absolute difference is less than 3, putting the smaller value into CP to obtain the final change-point set;
step 7: segmenting the standardized electricity time-series data based on the obtained change-point positions, and averaging the electricity consumption of each industry within each segment to obtain the typical electricity consumption behavior of each segment.
3. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 2, characterized in that: in step 2, after the missing values in the industry electricity data are filled, the electricity data of one working day and of one weekend day are extracted separately for subsequent analysis, so that the differences between working-day and weekend industry electricity consumption behavior can be analyzed; and outliers are removed using the 3σ rule so that they do not affect the subsequent electricity consumption behavior analysis.
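The cleaning in step 2 can be sketched as follows. This is an illustrative assumption about how the 3σ rule and the standardization might be applied (population standard deviation, outliers dropped rather than replaced); the function names and the sample data are hypothetical.

```python
from statistics import fmean, pstdev

def three_sigma_clean(values):
    """Drop values lying more than three standard deviations from the mean."""
    mu, sigma = fmean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = fmean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical load readings with one abnormal spike.
readings = [10.0] * 20 + [100.0]
cleaned = three_sigma_clean(readings)
print(len(readings), len(cleaned))
```

In a time series, dropping a point shifts the positions of the later observations, so an implementation that must keep time alignment might replace the outlier (for example by interpolation) instead of deleting it.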
4. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 3, characterized in that: in step 3, the cleaned data are smoothed using a Savitzky-Golay filter, in which the fit to the original data is optimal when the loss function attains its minimum; and the fitted values of the smoothed data are obtained through a sliding window, so as to effectively reduce the noise in the data.
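A Savitzky-Golay filter fits a low-order polynomial by least squares inside each sliding window and keeps the fitted value at the window centre. The sketch below uses the standard 5-point quadratic coefficients; the window length, the polynomial order, and the edge handling are assumptions made for illustration, since the claim does not state them.

```python
# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)

def savgol5(values):
    """Smooth interior points with the 5-point quadratic filter.
    The two points at each end are left unchanged in this sketch."""
    out = list(values)
    for t in range(2, len(values) - 2):
        window = values[t - 2 : t + 3]
        out[t] = sum(c * v for c, v in zip(SG5, window))
    return out

# A single spike is damped, while a quadratic curve passes through unchanged.
print(savgol5([0.0, 0.0, 0.0, 7.0, 0.0, 0.0, 0.0])[3])
```

In practice a library routine such as SciPy's `savgol_filter` would typically be used; the hand-written version only makes the sliding-window least-squares idea explicit.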
5. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 4, characterized in that:
in step 3, an additive model is used to decompose the smoothed electricity data into a trend part, a periodic part, and a residual part, so as to remove the periodicity and trend of the data, expressed as follows:
$$X_{i,t} = T_{i,t} + S_{i,t} + C_{i,t} + I_{i,t}, \qquad (1)$$
where $T_i$ represents the long-term trend of the i-th industry, $S_i$ represents the seasonal variation of the i-th industry, $C_i$ represents the cyclical variation of the i-th industry, and $I_i$ represents the remaining residual term of the i-th industry.
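As a rough illustration of the additive model, the sketch below performs a classical decomposition with a centred moving average. It is a simplification: it merges the seasonal and cyclical terms of equation (1) into one periodic component, assumes an odd period, and the function name is hypothetical.

```python
from statistics import fmean

def additive_decompose(x, period):
    """Classical additive decomposition for an odd `period`.
    trend    = centred moving average (None where the window is incomplete)
    seasonal = per-phase mean of the detrended values, centred to sum to zero
    residual = x - trend - seasonal (None where the trend is undefined)"""
    if period < 3 or period % 2 == 0:
        raise ValueError("this sketch assumes an odd period >= 3")
    half, n = period // 2, len(x)
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = fmean(x[t - half : t + half + 1])
    # Collect detrended values that share the same phase within the period.
    by_phase = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            by_phase[t % period].append(x[t] - trend[t])
    phase_mean = [fmean(v) if v else 0.0 for v in by_phase]
    centre = fmean(phase_mean)
    seasonal = [phase_mean[t % period] - centre for t in range(n)]
    resid = [None if trend[t] is None else x[t] - trend[t] - seasonal[t]
             for t in range(n)]
    return trend, seasonal, resid

# Hypothetical series: linear trend plus a period-3 pattern, no noise.
pattern = [1.0, -2.0, 1.0]
x = [0.5 * t + pattern[t % 3] for t in range(12)]
trend, seasonal, resid = additive_decompose(x, 3)
print(resid[1:-1])
```

For this noise-free series the residuals are zero up to rounding; with real load data, the residual series is what would be passed on to change-point detection.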
6. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 5, characterized in that:
in step 4, the mean of the CUSUM statistics at each time point is calculated from the residual terms obtained after decomposition, with the formula as follows:
where $i$ denotes the i-th industry, $s$ denotes the start point of the detection interval, $e$ denotes the end point of the detection interval, $b$ denotes the time point being tested, $n$ denotes the total length of the detection interval, $I_{i,t}$ is the residual term of the smoothed data of the i-th industry at time $t$, the CUSUM statistic of the i-th industry is evaluated at time $b$ within the detection interval $[s, e]$, $p$ is the total number of data dimensions, and the mean of the CUSUM statistics at time $b$ within the detection interval $[s, e]$ is recorded as the M test statistic.
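The formula itself is not reproduced in this text. As a hedged reconstruction, the standard CUSUM contrast used by Isolate-Detect, written with the symbols defined above, would take the following form. The notation $\tilde{I}^{\,b}_{i,s,e}$ and $M^{b}_{s,e}$, and the use of absolute values in the average, are assumptions rather than the patent's own notation.

```latex
\[
\tilde{I}^{\,b}_{i,s,e}
  = \sqrt{\frac{e-b}{n\,(b-s+1)}}\;\sum_{t=s}^{b} I_{i,t}
  \;-\; \sqrt{\frac{b-s+1}{n\,(e-b)}}\;\sum_{t=b+1}^{e} I_{i,t},
\qquad n = e - s + 1,
\]
\[
M^{b}_{s,e} = \frac{1}{p}\sum_{i=1}^{p}\left|\tilde{I}^{\,b}_{i,s,e}\right| .
\]
```

The first expression compares the scaled sums of the residuals on either side of $b$, so it is large in magnitude when the mean level differs before and after $b$; averaging over the $p$ industries favours change points shared by many series.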
7. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 6, characterized in that:
in step 4, the maximum of the CUSUM statistics at each time point is calculated from the residual terms obtained after decomposition, with the formula as follows:
where $i$ denotes the i-th industry, $s$ denotes the start point of the detection interval, $e$ denotes the end point of the detection interval, $b$ denotes the time point being tested, $n$ denotes the total length of the detection interval, $I_{i,t}$ is the residual term of the smoothed data of the i-th industry at time $t$, the CUSUM statistic of the i-th industry is evaluated at time $b$ within the detection interval $[s, e]$, $p$ is the total number of data dimensions, and the result is the maximum of the CUSUM statistics at time $b$ within the detection interval $[s, e]$.
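The two aggregated statistics can be computed as in the sketch below. The CUSUM contrast shown is the standard one from the Isolate-Detect literature and is assumed to match the formula omitted from this text; the function names, the use of absolute values, and the sample residuals are hypothetical.

```python
import math

def cusum(x, s, e, b):
    """CUSUM contrast of x on the 1-based closed interval [s, e], split after b (s <= b < e)."""
    n = e - s + 1
    left = sum(x[s - 1 : b])   # x_s .. x_b
    right = sum(x[b : e])      # x_{b+1} .. x_e
    return (math.sqrt((e - b) / (n * (b - s + 1))) * left
            - math.sqrt((b - s + 1) / (n * (e - b))) * right)

def m_and_t_statistics(residuals, s, e, b):
    """Mean (M) and maximum (T) of |CUSUM| across the p residual series at time b."""
    values = [abs(cusum(x, s, e, b)) for x in residuals]
    return sum(values) / len(values), max(values)

# Hypothetical residuals of two industries: one with a level shift, one flat.
residuals = [
    [0.0, 0.0, 0.0, 0.0, 4.0, 4.0, 4.0, 4.0],
    [0.0] * 8,
]
m_stat, t_stat = m_and_t_statistics(residuals, s=1, e=8, b=4)
print(m_stat, t_stat)
```

The maximum reacts to a change in a single industry, while the mean is diluted by unaffected industries, which may be why the method keeps only change points supported by both statistics.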
8. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 7, characterized in that:
in step 5, based on the calculated M test statistic and T test statistic, the change points common to the electricity consumption of all industries are detected using the Isolate-Detect algorithm:
first, the intervals to be examined are created: for a data sequence of length $T$, a positive constant $\lambda_T$ is set, and two ordered sets of $K = [T/\lambda_T]$ right-expanding and left-expanding intervals are created; the j-th right-expanding interval is $R_j = [1, \min\{j\lambda_T, T\}]$ and the j-th left-expanding interval is $L_j = [\max\{1, T - j\lambda_T + 1\}, T]$; these intervals are collected in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots, R_K, L_K\}$; the point with the largest test statistic value in $R_1$ is then identified;
based on the obtained test statistic, a threshold is set, with the threshold calculated as follows:
where $\sigma$ is the standard deviation of the input data and $C$ is a given parameter value; the threshold is compared with the test statistic to judge whether the time point is a change point: if the value of the test statistic exceeds the threshold, the corresponding point is regarded as a change point; if the threshold is not exceeded, detection continues with the next interval in $S_{RL}$.
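The interval construction and the threshold comparison can be sketched as follows. Two points are assumptions: the bracket in $K = [T/\lambda_T]$ is read as rounding up, and the threshold is taken as $C\,\sigma\sqrt{2\log T}$, the form used in the published Isolate-Detect method, because the claim's own formula is not reproduced in this text. The function names are hypothetical.

```python
import math

def expanding_intervals(T, lam):
    """Ordered list S_RL = [R_1, L_1, ..., R_K, L_K] of 1-based closed intervals."""
    K = math.ceil(T / lam)  # assumption: [T / lambda_T] means rounding up
    intervals = []
    for j in range(1, K + 1):
        intervals.append((1, min(j * lam, T)))          # R_j
        intervals.append((max(1, T - j * lam + 1), T))  # L_j
    return intervals

def threshold(sigma, C, T):
    """Assumed threshold form C * sigma * sqrt(2 * log T)."""
    return C * sigma * math.sqrt(2 * math.log(T))

print(expanding_intervals(10, 4))
# [(1, 4), (7, 10), (1, 8), (3, 10), (1, 10), (1, 10)]
```

Alternating right- and left-expanding intervals means change points near either end of the series are reached early, while each interval is still short enough to be likely to contain at most one change point.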
9. The enterprise electricity consumption typical pattern analysis method based on multidimensional Isolate-Detect multiple change-point detection according to claim 8, characterized in that:
in step 7, based on the detected change-point positions, the standardized time-series data are segmented, the mean electricity consumption of each industry in each segment is calculated, and the typical electricity consumption scenario of each time-series segment is extracted, wherein the mean electricity consumption of the i-th industry is calculated as follows:
where $\tau_k$ is the position of the k-th change point, and the averaged quantity is the standardized electricity consumption value of the i-th industry in the segment ending at the (k+1)-th change-point position.
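The formula is not reproduced in this text. One form consistent with the symbol definitions is given below as a hedged reconstruction; the notation $\bar{X}^{(k)}_{i}$ for the segment mean and $\tilde{X}_{i,t}$ for the standardized consumption, and the choice of segment boundaries, are assumptions.

```latex
\[
\bar{X}^{(k)}_{i}
  = \frac{1}{\tau_{k+1}-\tau_{k}}
    \sum_{t=\tau_{k}+1}^{\tau_{k+1}} \tilde{X}_{i,t},
\qquad k = 0, 1, \ldots, N,
\]
```

Here $\tau_0 = 0$, $\tau_{N+1}$ is the length of the series, and $N$ is the number of detected change points, so the $N$ change points yield $N+1$ segment means per industry.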

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311029259.4A CN117034197A (en) 2023-08-16 2023-08-16 Enterprise power consumption typical mode analysis method based on multidimensional Isolate-detection multi-point detection

Publications (1)

Publication Number Publication Date
CN117034197A true CN117034197A (en) 2023-11-10

Family

ID=88640856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311029259.4A Pending CN117034197A (en) 2023-08-16 2023-08-16 Enterprise power consumption typical mode analysis method based on multidimensional Isolate-detection multi-point detection

Country Status (1)

Country Link
CN (1) CN117034197A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117806220A (en) * 2024-03-01 2024-04-02 连云港市新达电子技术有限公司 Intelligent control system for power supply of petroleum logging instrument

Similar Documents

Publication Publication Date Title
CN110163429B (en) Short-term load prediction method based on similarity day optimization screening
CN112801388B (en) Power load prediction method and system based on nonlinear time series algorithm
CN116010485B (en) Unsupervised anomaly detection method for dynamic period time sequence
CN117034197A (en) Enterprise power consumption typical mode analysis method based on multidimensional Isolate-detection multi-point detection
CN112183827A (en) Method, device, equipment and storage medium for predicting express monthly pickup quantity
CN117152119A (en) Profile flaw visual detection method based on image processing
CN114066262A (en) Method, system and device for estimating cause-tracing reasoning of abnormal indexes after power grid dispatching and storage medium
CN117056688A (en) New material production data management system and method based on data analysis
CN113449919A (en) Power consumption prediction method and system based on feature and trend perception
CN117711295A (en) Display module control method, device, chip and medium based on artificial intelligence
CN117828413A (en) Transformer oil temperature prediction method and system based on LSTM neural network
CN110059126B (en) LKJ abnormal value data-based complex correlation network analysis method and system
CN115186910A (en) Grey fabric factory productivity prediction method based on LSTM and XGboost mixed model
CN110543869A (en) Ball screw service life prediction method and device, computer equipment and storage medium
CN114118401A (en) Neural network-based power distribution network flow prediction method, system, device and storage medium
Bakır et al. Defect cause modeling with decision tree and regression analysis
CN116167489A (en) Building energy data analysis and prediction method and system
CN113391987A (en) Quality prediction method and device for online software system
CN110569277A (en) Method and system for automatically identifying and classifying configuration data information
CN113553358B (en) Data mining-based power grid equipment invalid data identification method and device
CN113157204B (en) Disk capacity prediction method for identifying manual cleaning behavior based on second-order difference method
CN117937438A (en) Method and system for identifying and correcting abnormal data of power grid dispatching control system
CN117609740B (en) Intelligent prediction maintenance system based on industrial large model
CN117635178A (en) Method, system and storage medium for identifying electricity stealing user based on discrete coefficient analysis
CN118427187A (en) Data quality management method and system based on self-optimization subsystem

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination