CN114519651A - Intelligent power distribution method based on electric power big data - Google Patents

Publication number
CN114519651A
Authority
CN
China
Prior art keywords
data
family
power
power consumption
electricity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210086499.7A
Other languages
Chinese (zh)
Inventor
赵威
张智勇
王云峰
方宽
谭正卯
李明涛
赵金石
于洋
曹勇
付鑫
向哲宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Heilongjiang Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Heilongjiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and State Grid Heilongjiang Electric Power Co Ltd
Priority to CN202210086499.7A
Publication of CN114519651A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/003 Load forecast, e.g. methods or systems for forecasting future load demand
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • Databases & Information Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Power Engineering (AREA)
  • Primary Health Care (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Health & Medical Sciences (AREA)
  • Water Supply & Treatment (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Public Health (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

An intelligent power distribution method based on electric power big data, belonging to the technical field of electric power load prediction and power distribution. The method addresses the problem that existing electricity consumption prediction models, when used for wide-range prediction of users' consumption patterns, have poor accuracy and tend to cause energy waste. Historical electricity consumption data of all users in the area to be supplied is acquired and preprocessed to obtain a sample set; the sample set is clustered with an AP clustering algorithm based on dynamic time warping (DTW) similarity to obtain n classes of samples; electricity consumption curves are plotted for the n classes, and the users corresponding to the data are divided according to these curves into three types: non-migratory-bird, whole-family migratory-bird, or non-whole-family migratory-bird households; all sample data is labeled by type; an electricity consumption classification model is built from the labeled sample data with a K-nearest neighbor (KNN) classification algorithm; and each user's electricity consumption type is obtained and the power distribution strategy for the next time period is adjusted accordingly.

Description

Intelligent power distribution method based on electric power big data
Technical Field
The invention belongs to the technical field of power load prediction and power distribution.
Background
Power load prediction is an important task for power utilities and provides the basis for intelligent power allocation. Thermal power plants in particular need to estimate each region's electricity consumption for each quarter in advance, so that power can be distributed intelligently and energy waste avoided. Existing prediction methods predict only the overall trend of electricity consumption; they cannot predict each user's type and periods of use, and so cannot support accurate adjustment of power distribution. As a result, consumption prediction is inaccurate, power distribution is imprecise, and energy is easily wasted.
Disclosure of Invention
The invention aims to solve the problem that existing electricity consumption prediction models have poor accuracy and tend to cause energy waste when used for wide-range prediction of users' consumption patterns, and provides an intelligent power distribution method based on electric power big data.
The invention discloses an intelligent power distribution method based on electric power big data, which comprises the following steps:
step one, acquiring historical electricity consumption data of all electricity users and preprocessing it to obtain a sample set;
step two, clustering the sample set using an AP clustering algorithm based on dynamic time warping (DTW) similarity to obtain n classes of samples, where n is a positive integer;
step three, plotting electricity consumption curves for each of the n classes of samples and, according to the curves, dividing the users corresponding to the consumption data into three types: non-migratory-bird, whole-family migratory-bird, or non-whole-family migratory-bird households;
step four, labeling all sample data by type as non-migratory-bird household electricity use, whole-family migratory-bird household electricity use, or non-whole-family migratory-bird household electricity use;
step five, taking 70% of the labeled sample data as a training set and 30% as a validation set, and building an electricity consumption classification model using the K-nearest neighbor (KNN) classification algorithm;
step six, feeding the previous year's electricity consumption data of each household in the area to be supplied into the classification model to obtain each user's electricity consumption type;
step seven, obtaining the proportion of each user type covered by each sub-area served by a power distribution node; when the combined proportion of whole-family and non-whole-family migratory-bird household users in a sub-area exceeds 60%, plotting the annual electricity consumption curve of each such household and obtaining the curve of the sub-area's total daily consumption over the previous year;
when the proportion of non-migratory-bird household users in a sub-area exceeds 60%, obtaining each user's average daily consumption and the sub-area's average daily consumption over the previous year;
according to the proportions of user types in each sub-area, estimating the sub-area's total consumption for the next year from either the previous year's total daily consumption curve or the previous year's average daily consumption, and deriving the sub-area's power distribution strategy for the next year;
for sub-areas in which non-migratory-bird household users exceed 60%, the next year's strategy is: the daily distribution amount is set with the previous year's average daily consumption as the reference;
for sub-areas in which whole-family and non-whole-family migratory-bird household users together exceed 60%, the next year's strategy is: the daily distribution amount is adjusted according to the curve of the sub-area's total daily consumption over the previous year.
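The 60% rule in step seven can be illustrated with a minimal Python sketch. The label strings, the function name `choose_strategy`, and the return format are assumptions made for illustration; the source also does not say what happens when neither proportion exceeds 60%, so that case is returned as undetermined rather than guessed.

```python
from collections import Counter

# Hypothetical label strings for the three user types.
NON_MIGRATORY = "non_migratory"
WHOLE_FAMILY = "whole_family_migratory"
PARTIAL_FAMILY = "non_whole_family_migratory"


def choose_strategy(user_types, daily_totals_last_year, threshold=0.6):
    """Pick the next-year reference for one sub-area.

    user_types: predicted type label of each household in the sub-area.
    daily_totals_last_year: the sub-area's total consumption for each day.
    """
    counts = Counter(user_types)
    n = len(user_types)
    migratory_share = (counts[WHOLE_FAMILY] + counts[PARTIAL_FAMILY]) / n
    non_migratory_share = counts[NON_MIGRATORY] / n

    if migratory_share > threshold:
        # Follow last year's day-by-day total consumption curve.
        return "curve", list(daily_totals_last_year)
    if non_migratory_share > threshold:
        # Use last year's average daily consumption as a flat reference.
        avg = sum(daily_totals_last_year) / len(daily_totals_last_year)
        return "average", [avg] * len(daily_totals_last_year)
    # Not covered by the source: neither share exceeds the threshold.
    return "undetermined", None
```

For example, a sub-area with seven migratory-bird households out of ten would follow last year's curve, while one with seven non-migratory-bird households out of ten would use the flat daily average.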
Furthermore, the method also includes a step of comparing the previous month's electricity consumption curve with the curve for the corresponding month of the previous year. In this step, if the two curves differ, the trend of the next month's consumption curve is predicted from the points at which the curve changes and the trend at those points, and the sub-area's power distribution curve for the next month is adjusted;
and if the previous month's average daily consumption differs from that of the corresponding month of the previous year, the previous month's consumption curve is obtained, the next month's average daily distribution amount is predicted, and the sub-area's average daily distribution amount is adjusted.
Further, in step one, the specific method for preprocessing the historical electricity consumption data and obtaining the sample data comprises:
step A1, filtering the historical electricity consumption data of all users in the area to be supplied and removing users with zero consumption, to obtain preliminarily screened data;
step A2, removing outliers from the preliminarily screened data using the quartile method;
step A3, applying a linear transformation to the outlier-free data by min-max normalization, mapping each user's consumption data to [0, 1], to obtain the sample data.
Further, in step A3, the formula for mapping each user's electricity consumption data to [0, 1] is:
$$x = \frac{x_i - \min x_i}{\max x_i - \min x_i}$$
where $x$ is the user's normalized electricity consumption value, $x_i$ is the user's $i$-th electricity consumption value, $\min x_i$ is the minimum of the user's consumption data, and $\max x_i$ is the maximum of the user's consumption data.
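The mapping to [0, 1] can be sketched in a few lines of Python. The function name is hypothetical, and the handling of a constant series (where the maximum equals the minimum) is an assumption, since the source does not address that case.

```python
def min_max_normalize(values):
    """Map one user's consumption readings linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series is mapped to all zeros to avoid
        # dividing by zero; the source does not cover this case.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

With readings of 2, 4 and 6, the smallest value maps to 0, the largest to 1, and the middle reading to 0.5.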
Further, in step two, the specific method for obtaining the n classes of cluster data using the AP clustering algorithm based on DTW similarity comprises:
step B1, representing the sample set as time series and calculating the distance between every pair of points of two time series to obtain a distance matrix;
step B2, finding the path from the upper-left corner to the lower-right corner of the distance matrix that minimizes the sum of the elements along it, thereby obtaining the DTW matrix between every pair of sequences in the sample set;
step B3, calculating a similarity matrix from the DTW matrices between every pair of sequences in the sample set;
step B4, using the similarity matrix as the input to the AP clustering algorithm to obtain n classes of clustering results.
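Steps B1 to B3 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: the point cost is taken as the absolute difference, the similarity is taken as the negative DTW distance, and the function names are hypothetical. The AP clustering of step B4 is not implemented here.

```python
def dtw_distance(a, b):
    """Minimal cumulative cost of a warping path between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] holds the minimal cost of aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # assumed point-to-point distance
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def similarity_matrix(series):
    """Pairwise similarity, assumed here to be the negative DTW distance."""
    n = len(series)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = -dtw_distance(series[i], series[j])
            sim[i][j] = sim[j][i] = s
    return sim
```

A matrix of this shape is the kind of input that precomputed-affinity AP implementations accept, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`; the source does not say which implementation, or which diagonal preference values, were used.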
Further, in step three, the specific method for dividing the users corresponding to the consumption data into the three types (non-migratory-bird, whole-family migratory-bird, or non-whole-family migratory-bird households) comprises:
plotting electricity consumption curves for each of the n classes of samples, and assigning each class to the whole-family migratory-bird, non-whole-family migratory-bird, or non-migratory-bird type according to the consumption characteristics of each type.
Further, in step five, the method for building the electricity consumption classification model using the KNN algorithm comprises:
step C1, taking one part of the labeled sample data as a training set and building an electricity consumption classification model using the KNN algorithm;
step C2, taking the other part of the labeled sample data as a validation set, validating the model built in step C1, and adjusting the model's threshold K until its accuracy reaches 80%, thereby obtaining the KNN-based electricity consumption classification model.
Further, in step C2, the method for adjusting the threshold K of the electricity consumption classification model comprises:
step D1, using dynamic time warping to align the time axes of the time series in the training set, and calculating the distance between each test point and the time-series points in the training set;
step D2, selecting the K points closest to the test point, counting how often each class occurs among them, and taking the most frequent class as the class of the test point;
step D3, checking against the test point's label whether the predicted class is correct and recording the result; continuing to test the model on the validation set until validation is complete, then judging whether the model's accuracy reaches 80%; if so, taking the current K as the model's threshold; otherwise setting K = K - 1 and returning to step D1, until the accuracy reaches 80% and the adjustment of the threshold K is complete.
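Steps D1 to D3 can be sketched as a KNN classifier that uses a DTW distance and lowers K until the validation accuracy reaches the target. The function names, the absolute-difference point cost, the starting value of K, and stopping with no result once K reaches 1 are all assumptions for illustration; the source does not specify them.

```python
from collections import Counter


def dtw(a, b):
    """DTW distance with absolute difference as point cost (rolling rows)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]


def knn_predict(train, query, k):
    """train is a list of (series, label); return the majority label of the k nearest."""
    nearest = sorted(train, key=lambda item: dtw(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def tune_k(train, validation, k_start, target=0.8):
    """Decrease K from k_start until validation accuracy reaches the target."""
    for k in range(k_start, 0, -1):
        correct = sum(knn_predict(train, s, k) == label for s, label in validation)
        acc = correct / len(validation)
        if acc >= target:
            return k, acc
    # Assumption: give up if no K down to 1 reaches the target accuracy.
    return None
```

Starting from a larger K and decreasing mirrors the K = K - 1 update in step D3; a search in the other direction would be an equally plausible design, but it is not what the source describes.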
The dynamic time warping algorithm is decomposed so that it integrates better into the KNN algorithm: when KNN computes the distance between a point in the labeled data set and the current point, DTW is used to quickly find the optimal alignment between the two observation sequences, the DTW distance is then calculated, and the distance between the matrices is finally obtained. Model performance is measured by comparing the class labels in the held-out data set with the classifier's predictions. The data set is divided in advance into a 70% training set and a 30% test set; the model is trained on the data labeled by the preceding clustering step and then applied to the test set, and the resulting prediction accuracy is used to judge the quality of the model. Compared with the traditional KNN algorithm, the improved KNN algorithm reaches an accuracy of about 80% in predicting users' electricity consumption type, and the prediction speed is greatly improved, making the overall algorithm more efficient and accurate. Once the classification model has been used to classify the users in the area covered by each power distribution node, the supply times and supply amounts are adjusted, which effectively safeguards the utilization of the supplied power.
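The 70%/30% split and the accuracy measure described above can be sketched as follows. Shuffling before the split, the fixed random seed, and the function names are assumptions; the source states only the proportions.

```python
import random


def split_70_30(samples, seed=0):
    """Shuffle the labeled samples and split them 70% / 30%."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of held-out samples whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)
```

A model would then be judged acceptable, in the sense used above, when `accuracy` on the held-out 30% is about 0.8 or higher.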
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
Embodiment one: this embodiment is described below with reference to FIG. 1. The intelligent power distribution method based on electric power big data in this embodiment comprises:
step one, acquiring historical electricity consumption data of all electricity users and preprocessing it to obtain a sample set;
step two, clustering the sample set using an AP clustering algorithm based on dynamic time warping (DTW) similarity to obtain n classes of samples, where n is a positive integer;
step three, plotting electricity consumption curves for each of the n classes of samples and, according to the curves, dividing the users corresponding to the consumption data into three types: non-migratory-bird, whole-family migratory-bird, or non-whole-family migratory-bird households;
step four, labeling all sample data by type as non-migratory-bird household electricity use, whole-family migratory-bird household electricity use, or non-whole-family migratory-bird household electricity use;
step five, taking 70% of the labeled sample data as a training set and 30% as a validation set, and building an electricity consumption classification model using the K-nearest neighbor (KNN) classification algorithm;
step six, feeding the previous year's electricity consumption data of each household in the area to be supplied into the classification model to obtain each user's electricity consumption type;
step seven, obtaining the proportion of each user type covered by each sub-area served by a power distribution node; when the combined proportion of whole-family and non-whole-family migratory-bird household users in a sub-area exceeds 60%, plotting the annual electricity consumption curve of each such household and obtaining the curve of the sub-area's total daily consumption over the previous year;
when the proportion of non-migratory-bird household users in a sub-area exceeds 60%, obtaining each user's average daily consumption and the sub-area's average daily consumption over the previous year;
according to the proportions of user types in each sub-area, estimating the sub-area's total consumption for the next year from either the previous year's total daily consumption curve or the previous year's average daily consumption, and deriving the sub-area's power distribution strategy for the next year;
for sub-areas in which non-migratory-bird household users exceed 60%, the next year's strategy is: the daily distribution amount is set with the previous year's average daily consumption as the reference;
for sub-areas in which whole-family and non-whole-family migratory-bird household users together exceed 60%, the next year's strategy is: the daily distribution amount is adjusted according to the curve of the sub-area's total daily consumption over the previous year.
That is, the sub-area's daily distribution amount is adjusted with reference to either the previous year's curve of total daily consumption or the previous year's average total daily consumption, and the distribution amount is evaluated and allocated taking into account current environmental changes and whether large shopping malls or power-consuming enterprises have been added or removed.
Furthermore, in this embodiment, the method also includes a step of comparing the previous month's electricity consumption curve with the curve for the corresponding month of the previous year. In this step, if the two curves differ, the trend of the next month's consumption curve is predicted from the points at which the curve changes and the trend at those points, and the sub-area's power distribution curve for the next month is adjusted;
and if the previous month's average daily consumption differs from that of the corresponding month of the previous year, the previous month's consumption curve is obtained, the next month's average daily distribution amount is predicted, and the sub-area's average daily distribution amount is adjusted.
In this embodiment, a monthly and an annual electricity consumption curve are plotted for each whole-family and non-whole-family migratory-bird household; consumption thresholds are obtained for whole-family and non-whole-family migratory-bird households, and the times at which family members leave and return are obtained from the monthly curve, the annual curve, and the thresholds; the peak and valley consumption periods of non-migratory-bird household users are obtained from their consumption curves;
when non-migratory-bird household users exceed 60% of the users in the area supplied by a power distribution node, the peak consumption time, valley consumption time, highest consumption, and lowest consumption of all non-migratory-bird household users are extracted, and the peak and valley times are taken as the peak and valley consumption times of the users in the area covered by the node; the node's distribution amount is adjusted on a schedule according to these peak and valley times;
when the area covered by a power distribution node contains whole-family or non-whole-family migratory-bird households, the period in which the most homes are empty and the period in which the most users are present are obtained, and the distribution amount for the node's coverage area is adjusted accordingly.
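The threshold-based detection of leaving and returning times, and the extraction of peak and valley times, can be illustrated with a minimal Python sketch. The function names, the use of daily readings for absence detection and hourly totals for peak and valley times, and the rule that consumption below the threshold means the household is away are assumptions; the source does not specify how the thresholds are derived or applied.

```python
def absence_periods(daily_usage, threshold):
    """Return (first_day, last_day) index pairs where usage stays below threshold."""
    periods, start = [], None
    for day, usage in enumerate(daily_usage):
        if usage < threshold and start is None:
            start = day  # assumed leaving time: first day below threshold
        elif usage >= threshold and start is not None:
            periods.append((start, day - 1))  # assumed return: usage recovers
            start = None
    if start is not None:
        # The household is still away at the end of the record.
        periods.append((start, len(daily_usage) - 1))
    return periods


def peak_and_valley(hourly_totals):
    """Return the indices of the highest and lowest aggregate consumption."""
    hours = range(len(hourly_totals))
    peak = max(hours, key=hourly_totals.__getitem__)
    valley = min(hours, key=hourly_totals.__getitem__)
    return peak, valley
```

Overlaying the absence periods of all migratory-bird households in a node's coverage area would give the period in which the most homes are empty.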
The method aims to adjust an area's supply amount and its peak and valley supply times according to the electricity consumption types of the users in the area, reducing wasted resources and ensuring efficient energy use. The traditional KNN algorithm is computationally expensive, predicts rare classes poorly when the samples are imbalanced, and requires a large amount of memory. A dynamic time warping algorithm is therefore introduced on this basis. After multiple attempts it was found that DTW, which finds the optimal alignment between two observation sequences by warping the time dimension under certain constraints, solves well the central problem of how to calculate the distance between two time series.
In the initial stage of the KNN prediction algorithm, in order to obtain a better training data set, unsupervised learning is performed on the unclassified data, and an Affinity Propagation (AP) clustering algorithm based on DTW similarity is selected. A time series is a common form of data representation, and a common task in time series processing is to compare the similarity of two sequences. DTW calculates the similarity between two time series by stretching and compressing them. The conventional approach calculates the Euclidean distance between two sequences, i.e. the sum of the distances between their corresponding points; matching a point at one time in one sequence to several points at consecutive times in the other sequence is called time warping. From the DTW matrix obtained between every pair of training samples, the distance between the two matrices is calculated and used as the similarity, and all points are divided into a number of classes according to this similarity. Finally, with manual intervention, the classes are grouped into three types (non-migratory-bird, whole-family migratory-bird, and non-whole-family migratory-bird households), yielding a large labeled training data set that is used to train and validate the model.
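For reference, the contrast between the point-by-point distance and time warping can be written out for two sequences $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_m)$. This is the standard textbook formulation of DTW rather than a formula taken from the source, and the point cost $|a_i - b_j|$ is an assumption.

```latex
% Point-by-point comparison (requires n = m): each point is matched
% only with the point at the same time index.
d_{\text{pointwise}}(a, b) = \sum_{i=1}^{n} |a_i - b_i|

% Dynamic time warping: D(i, j) is the minimal cumulative cost of
% aligning the first i points of a with the first j points of b.
D(0, 0) = 0, \qquad D(i, 0) = D(0, j) = \infty \quad (i, j \ge 1)

D(i, j) = |a_i - b_j| + \min\{\, D(i-1, j),\; D(i, j-1),\; D(i-1, j-1) \,\}

\mathrm{DTW}(a, b) = D(n, m)
```

The three terms inside the minimum are what allow one point to be matched with several consecutive points of the other sequence, which is the stretching and compressing described above.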
Further, in step one, the specific method for preprocessing the historical electricity consumption data and obtaining the sample data comprises:
step A1, filtering the historical electricity consumption data of all users in the area to be supplied and removing users with zero consumption, to obtain preliminarily screened data;
step A2, removing outliers from the preliminarily screened data using the quartile method;
step A3, applying a linear transformation to the outlier-free data by min-max normalization, mapping each user's consumption data to [0, 1], to obtain the sample data.
Further, the formula for mapping each user's electricity consumption data to [0, 1] is:
$$x = \frac{x_i - \min x_i}{\max x_i - \min x_i}$$
where $x$ is the user's normalized electricity consumption value, $x_i$ is the user's $i$-th electricity consumption value, $\min x_i$ is the minimum of the user's consumption data, and $\max x_i$ is the maximum of the user's consumption data.
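Steps A1 and A2 can be sketched as follows in Python. The source names only the quartile method; the 1.5 x IQR fences used here are the common convention and are an assumption, as are the function names and the reading of "zero consumption" as a user whose readings are all zero.

```python
import statistics


def drop_zero_users(usage_by_user):
    """Step A1: keep only users with at least one non-zero reading."""
    return {user: vals for user, vals in usage_by_user.items()
            if any(v != 0 for v in vals)}


def remove_outliers_iqr(values, k=1.5):
    """Step A2: drop readings outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr  # assumed fence width k = 1.5
    return [v for v in values if low <= v <= high]
```

For readings 1 through 8 plus a single reading of 100, the quartiles are 3 and 7, the fences are -3 and 13, and only the reading of 100 is removed.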
Further, in step two, the specific method for obtaining the n classes of cluster data using the AP clustering algorithm based on DTW similarity comprises:
step B1, representing the sample set as time series and calculating the distance between every pair of points of two time series to obtain a distance matrix;
step B2, finding the path from the upper-left corner to the lower-right corner of the distance matrix that minimizes the sum of the elements along it, thereby obtaining the DTW matrix between every pair of sequences in the sample set;
step B3, calculating a similarity matrix from the DTW matrices between every pair of sequences in the sample set;
step B4, using the similarity matrix as the input to the AP clustering algorithm to obtain n classes of clustering results.
Further, in step three, the specific method for dividing the users corresponding to the consumption data into the three types (non-migratory-bird, whole-family migratory-bird, or non-whole-family migratory-bird households) comprises:
plotting electricity consumption curves for each of the n classes of samples, and assigning each class to the whole-family migratory-bird, non-whole-family migratory-bird, or non-migratory-bird type according to the consumption characteristics of each type.
Further, in the present invention, in step five, the method for establishing the electricity utilization classification model by using the K-nearest neighbor algorithm comprises:
step C1, taking one part of the labeled sample data as a training set, and establishing an electricity utilization classification model by using the K-nearest neighbor algorithm;
step C2, taking the other part of the labeled sample data as a verification sample set, verifying the electricity utilization classification model established in step C1, and adjusting the threshold k of the model until its accuracy reaches 80%, thereby obtaining the electricity utilization classification model established by the K-nearest neighbor algorithm.
Further, in the present invention, in step C2, the method for adjusting the threshold k of the electricity utilization classification model comprises:
step D1, using dynamic time warping to warp the time axis of the time series in the training sample set, and calculating the distance between each test point and the time series points in the training sample set;
step D2, selecting the k points with the smallest distance to the test point, counting the frequency of each category among these k points, and taking the category with the highest frequency as the category of the test point;
step D3, judging from the label of the test point whether its predicted category is correct and recording the result; continuing to test the model with the verification set until verification is finished; then judging whether the accuracy of the electricity utilization classification model reaches 80%; if so, taking the current value of k as the threshold of the model; otherwise, setting k = k - 1 and returning to step D1, until the accuracy reaches 80% and the adjustment of the threshold k is complete.
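Steps D1 to D3 can be sketched in Python as a loop that lowers k until the verification accuracy reaches the target. The starting value `k_start`, the failure return value, the function names and the toy data are assumptions for illustration; the patent does not state an initial k or what happens if no k reaches 80%.

```python
from collections import Counter


def dtw(q, c):
    """Compact DTW distance using the absolute difference as point-wise cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(c)
    for qi in q:
        cur = [inf]
        for j, cj in enumerate(c, start=1):
            cur.append(abs(qi - cj) + min(prev[j - 1], prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def knn_predict(train, query, k):
    """Majority vote among the k training series nearest to the query (step D2)."""
    nearest = sorted(train, key=lambda item: dtw(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, validation, k):
    hits = sum(knn_predict(train, series, k) == label for series, label in validation)
    return hits / len(validation)


def tune_k(train, validation, k_start, target=0.8):
    """Decrease k from k_start until the validation accuracy reaches target (step D3)."""
    for k in range(k_start, 0, -1):
        acc = accuracy(train, validation, k)
        if acc >= target:
            return k, acc
    return None, 0.0  # Assumption: signal failure if no k reaches the target.


# Hypothetical labeled curves.
train = [
    ([0.5, 0.5, 0.5, 0.5], "non-migratory"),
    ([0.6, 0.5, 0.6, 0.5], "non-migratory"),
    ([0.4, 0.5, 0.5, 0.6], "non-migratory"),
    ([1.0, 0.0, 0.0, 1.0], "whole-family"),
    ([0.9, 0.0, 0.1, 1.0], "whole-family"),
]
validation = [
    ([0.5, 0.6, 0.5, 0.5], "non-migratory"),
    ([1.0, 0.1, 0.0, 0.9], "whole-family"),
]
best_k, best_acc = tune_k(train, validation, k_start=5)
```

With k equal to the full training set size, the majority class always wins, which is why the loop has to lower k before the minority class can be recognized.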
The specific embodiment is as follows:
Fig. 1 shows a flow chart of the K-nearest neighbor algorithm based on the dynamic time warping algorithm, which specifically includes the following steps:
Step 1: clean the original user electricity data and eliminate the abnormal values. Process and mine the data with techniques such as one-hot encoding and standardization. Perform feature engineering on the data to obtain features such as the upper and lower quartiles, mean, variance and covariance.
Step 1.1: filter out users with zero electricity consumption for the whole year.
Step 1.2: remove abnormal data by the quartile method, so that the normal fluctuation of the data becomes apparent.
Step 1.3: apply a linear transformation to the original data through min-max (dispersion) standardization, mapping the electricity data of each State Grid user to [0, 1]. The formula is as follows:
$$x = \frac{x_i - \min x_i}{\max x_i - \min x_i}$$
where $x$ is the normalized electricity data, $x_i$ is the user's i-th electricity reading, $\min x_i$ is the minimum of the user's electricity data, and $\max x_i$ is the maximum of the user's electricity data.
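Steps 1.1 and 1.2 might look like the following Python sketch. The patent only names "the quartile method"; the 1.5 × IQR fence used here is the common convention and is an assumption, as are the function names and the sample data.

```python
import statistics


def drop_zero_users(usage_by_user):
    """Step 1.1: drop users whose consumption is zero for the whole year."""
    return {u: v for u, v in usage_by_user.items() if any(x != 0 for x in v)}


def remove_outliers_iqr(values, factor=1.5):
    """Step 1.2: keep readings inside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [x for x in values if low <= x <= high]


# Hypothetical monthly readings in kWh; 900 is an obvious metering spike.
usage = {
    "user_a": [100, 110, 95, 105, 900, 98, 102, 107, 99, 101, 104, 96],
    "user_b": [0] * 12,
}
active = drop_zero_users(usage)
cleaned = {u: remove_outliers_iqr(v) for u, v in active.items()}
```

Dropping a reading shortens the series, which is acceptable here because the later DTW comparison does not require sequences of equal length.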
Step 2: perform unsupervised learning on the unlabeled data set using the AP clustering algorithm based on dynamic time warping (DTW) similarity. A time series is a common representation of data, and a common task in time series processing is to compare the similarity of two sequences. Dynamic time warping calculates the similarity between two time series by stretching and compressing them along the time axis.
Step 2.1: calculate the distance matrix between the points of the two sequences.
Step 2.2: search for a path from the upper left corner to the lower right corner of the matrix that minimizes the sum of the elements on the path, thereby obtaining the DTW matrix between every two training set samples.
Step 2.3: take the negative of the distance matrix as the similarity matrix (the larger the distance, the smaller its negative and the lower the similarity) and use it as the input of AP clustering.
Step 2.4: run the clustering algorithm, separate the user data into classes according to the clustering result, draw the power consumption curve of each class, and determine manually from the curves which of the three power consumption types (non-migratory birds, whole-family migratory birds, non-whole-family migratory birds) each class belongs to.
Step 2.5: collate the user data of the three determined power consumption types and label it.
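The clustering in Steps 2.3 and 2.4 can be sketched with a small pure-Python version of affinity propagation. This is an illustrative sketch, not the patent's implementation: the median preference, the damping factor, the fixed iteration count and the stand-in one-dimensional data are all assumptions. In the described pipeline, `S` would hold the negated DTW distances between consumption curves.

```python
import statistics


def affinity_propagation(S, damping=0.5, iterations=200):
    """Return (exemplar indices, per-point exemplar labels) for similarity matrix S."""
    n = len(S)
    S = [list(row) for row in S]
    # Assumption: use the median off-diagonal similarity as every point's preference.
    preference = statistics.median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        for k in range(n):
            gains = [max(0.0, R[i][k]) for i in range(n)]
            gains[k] = R[k][k]  # self-responsibility enters unclipped
            total = sum(gains)
            for i in range(n):
                new = total - gains[k] if i == k else min(0.0, total - gains[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Stand-in data: two well-separated groups, similarity = negated distance.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
S = [[-abs(a - b) for b in points] for a in points]
exemplars, labels = affinity_propagation(S)
```

Affinity propagation does not take the number of clusters as input, which matches the patent's description of obtaining n classes first and assigning them to the three user types manually afterwards.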
Step 3: fuse the dynamic time warping model with the traditional K-nearest neighbor algorithm model.
Step 3.1: the similarity is still calculated from the distances between pairs of points in the two sequences, even though the two sequences may contain different numbers of points. However, because the time axis can be warped, points are not paired strictly in order; instead, each point may correspond to several points in the other sequence. Under the continuity and monotonicity constraints, the path through each grid point can proceed in only three directions. Starting from point (0, 0), the two sequences Q and C are matched, and at each point reached the distances calculated for all previous points are accumulated. On reaching the end point (n, m), the cumulative distance is the final total distance, i.e., the similarity of sequences Q and C. The cumulative distance γ(i, j) is the sum of the current grid point distance d(q_i, c_j), i.e., the Euclidean distance between points q_i and c_j, and the smallest cumulative distance among the neighboring elements from which that point can be reached:
$$\gamma(i,j) = d(q_i, c_j) + \min\{\gamma(i-1,j-1),\ \gamma(i-1,j),\ \gamma(i,j-1)\}$$
The optimal path is the path that minimizes the cumulative distance along it.
Step 3.2: sort the distances in ascending order, select the k points closest to the current point, count the frequency of each category among these k points, and return the most frequent category as the predicted class of the current point.
Step 3.3: divide the data set in advance into a 70% training set and a 30% test set, train the model with the data labeled after the preceding clustering, and apply the resulting model to the test set to obtain the prediction accuracy, by which the quality of the model is judged.
Step 3.4: until the accuracy reaches the desired value, train with multiple training sets, adjust the K value and the window value, and optimize the original K-nearest neighbor model as follows:
1. Weight the user data in the original data set that is complete and has distinct characteristics, increasing its influence on the model during training.
2. Randomly divide the training set into K parts, use one part as the verification set to evaluate the model and the other K-1 parts as the training set to train it; repeat this K times, taking a different subset as the verification set each time; finally obtain K different models and K scores, and combine the performance of the K models (average score or otherwise) to evaluate the quality of the model on the current problem.
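The K-fold evaluation in item 2 can be sketched in Python as follows. To avoid confusion with the neighbor count k, the number of parts is called `folds` here. The scoring function `majority_baseline` is a hypothetical placeholder standing in for training and scoring the DTW-based K-nearest neighbor model; the data and names are assumptions for illustration.

```python
import random
from collections import Counter


def k_fold_splits(items, folds, seed=0):
    """Yield (train, validation) pairs; each part serves as validation exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    parts = [shuffled[i::folds] for i in range(folds)]
    for held_out in range(folds):
        train = [x for j, part in enumerate(parts) if j != held_out for x in part]
        yield train, parts[held_out]


def cross_validate(items, folds, score_fn):
    """Average score_fn(train, validation) over all folds."""
    scores = [score_fn(train, val) for train, val in k_fold_splits(items, folds)]
    return sum(scores) / len(scores), scores


def majority_baseline(train, validation):
    """Placeholder scorer: always predict the most common training label."""
    top = Counter(label for _, label in train).most_common(1)[0][0]
    return sum(label == top for _, label in validation) / len(validation)


# Hypothetical labeled samples: (sample id, label).
labeled = [(i, "whole-family" if i % 3 == 0 else "non-migratory") for i in range(12)]
mean_score, fold_scores = cross_validate(labeled, folds=4, score_fn=majority_baseline)
```

Averaging over folds gives a steadier estimate than a single 70/30 split, at the cost of training the model once per fold.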
Step 4: after the final electricity classification model is obtained, perform target identification on the power consumption curve and output the departure time and the return time.
Step 4.1: exclude the data classified as non-migratory birds.
Step 4.2: draw the power consumption curves of the whole-family and non-whole-family migratory bird data; through inspection of the curves and mining of the data, calculate a threshold from the user's baseline power consumption, and find the valleys of power consumption according to the threshold.
Step 4.3: examine each valley period of power consumption; if it lasts more than 30 days, take the period as time away, and output the start time and end time of the valley as the user's departure time and return time.
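Steps 4.2 and 4.3 can be sketched in Python as a scan for long runs of low consumption. The patent does not say how the threshold is derived from the baseline; taking a fixed fraction (`ratio`) of the median daily usage is an assumption made here, as are the function name and the sample data.

```python
import statistics
from datetime import date, timedelta


def find_away_periods(daily_kwh, first_day, ratio=0.2, min_days=30):
    """Return (departure, return) dates of valleys lasting more than min_days."""
    # Assumption: the baseline is the median daily usage, the threshold a fraction of it.
    threshold = ratio * statistics.median(daily_kwh)
    periods, run_start = [], None
    # The trailing sentinel closes a valley that runs to the end of the data.
    for i, kwh in enumerate(list(daily_kwh) + [float("inf")]):
        if kwh < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > min_days:
                periods.append((first_day + timedelta(days=run_start),
                                first_day + timedelta(days=i - 1)))
            run_start = None
    return periods


# Hypothetical year of daily readings: normal use, a 60-day absence, normal use.
daily = [10.0] * 100 + [0.5] * 60 + [10.0] * 205
periods = find_away_periods(daily, first_day=date(2021, 1, 1))
```

A median-based baseline stops working for a household that is away for more than half the year, so a real threshold rule would need to account for that case.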
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that features described in different dependent claims and herein may be combined in ways different from those described in the original claims. It is also to be understood that features described in connection with individual embodiments may be used in other described embodiments.

Claims (8)

1. An intelligent power distribution method based on electric power big data, characterized by comprising the following steps:
step one, acquiring historical electricity utilization data of all electricity users and preprocessing the historical electricity utilization data to obtain a sample set;
step two, clustering the sample set by using an affinity propagation (AP) clustering algorithm based on dynamic time warping (DTW) similarity to obtain n classes of samples, wherein n is a positive integer;
step three, drawing the power consumption curves of the n classes of samples respectively, and dividing the users corresponding to the power consumption data into three types, namely non-migratory birds, whole-family migratory birds and non-whole-family migratory birds, according to the power consumption curves;
step four, labeling all sample data by type as non-migratory household electricity, whole-family migratory household electricity or non-whole-family migratory household electricity;
step five, taking 70% of the labeled sample data as a training set and 30% as a verification set, and establishing an electricity utilization classification model by using the K-nearest neighbor classification algorithm;
step six, feeding the previous year's power consumption data of each household in the area to be supplied into the electricity utilization classification model, and obtaining the power consumption type of each user;
step seven, acquiring the proportions of user types covered in each sub-area served by the power distribution node; when the sum of the proportion of whole-family migratory bird households and the proportion of non-whole-family migratory bird households covered in any sub-area is greater than 60%, drawing the annual power consumption curves of each whole-family and each non-whole-family migratory bird household respectively, and acquiring the curve of the sub-area's total daily power consumption over time in the previous year;
when the proportion of non-migratory bird households covered in any sub-area is greater than 60%, acquiring the average daily power consumption of each user, and acquiring the average daily power consumption of the sub-area in the previous year;
according to the proportions of user types covered in each sub-area, estimating the total power consumption of the sub-area in the next year from the previous year's curve of total daily power consumption over time or from the average daily power consumption, and acquiring the power distribution strategy of the sub-area for the next year;
the next-year power distribution strategy for sub-areas in which the proportion of non-migratory bird households is greater than 60% is: carrying out daily distribution in the sub-area with the previous year's average total daily power consumption as the reference;
the next-year power distribution strategy for sub-areas in which the combined proportion of whole-family and non-whole-family migratory bird households is greater than 60% is: adjusting the daily distribution quantity of the sub-area according to the previous year's curve of the sub-area's total daily power consumption over time.
2. The intelligent power distribution method based on electric power big data as claimed in claim 1, further comprising a step of comparing the power consumption curve of the previous month with that of the corresponding month in the previous year, wherein, if the two curves differ, the trend of the next month's power consumption curve is predicted from the points at which the curve changes and the trend at those points, and the sub-area's power distribution curve for the next month is adjusted;
and if the average daily power consumption of the previous month differs from that of the corresponding month in the previous year, the power consumption change curve of the previous month is acquired, the average daily power distribution of the next month is predicted, and the average daily power distribution of the sub-area is adjusted.
3. The intelligent power distribution method based on electric power big data, characterized in that, in step one, the specific method for preprocessing the historical electricity utilization data and obtaining the sample data comprises:
step A1, filtering the historical electricity consumption data of all users in the area to be supplied, removing users with zero electricity consumption, and obtaining primary screening data;
step A2, eliminating abnormal values in the primary screening data by the quartile method;
step A3, performing a linear transformation on the data with the abnormal values removed through min-max (dispersion) standardization, mapping the power consumption data of each user to [0, 1], and obtaining the sample data.
4. The intelligent power distribution method based on electric power big data as claimed in claim 2, wherein in step A3, the specific formula for mapping the power consumption data of each user to [0, 1] is:
$$x = \frac{x_i - \min x_i}{\max x_i - \min x_i}$$
where $x$ is the user's normalized electricity data, $x_i$ is the user's i-th electricity reading, $\min x_i$ is the minimum of the user's electricity data, and $\max x_i$ is the maximum of the user's electricity data.
5. The intelligent power distribution method based on electric power big data as claimed in claim 3, wherein in step two, the specific method for obtaining n classes of cluster data by using the AP clustering algorithm based on dynamic time warping (DTW) similarity comprises:
step B1, representing each sample in the sample set as a time series, calculating the distance between every pair of points of two time series, and obtaining a distance matrix;
step B2, finding a path from the upper left corner to the lower right corner of the distance matrix such that the sum of the elements on the path is minimal, and obtaining the DTW matrix between every two sequences in the sample set;
step B3, calculating a similarity matrix from the DTW matrix between every two sequences in the sample set;
step B4, using the similarity matrix as the input of the AP clustering algorithm to obtain n classes of clustering results.
6. The intelligent power distribution method based on electric power big data according to claim 1, wherein in step three, the specific method for dividing the users corresponding to the power consumption data into three types, namely non-migratory birds, whole-family migratory birds and non-whole-family migratory birds, comprises:
drawing the power consumption curves of the n classes of samples respectively, and labeling each of the n classes as whole-family migratory birds, non-whole-family migratory birds or non-migratory birds according to the power consumption characteristics of these three types.
7. The intelligent power distribution method based on electric power big data as claimed in claim 1, wherein the method for establishing the electricity utilization classification model by using the K-nearest neighbor algorithm comprises:
step C1, taking one part of the labeled sample data as a training set, and establishing an electricity utilization classification model by using the K-nearest neighbor algorithm;
step C2, taking the other part of the labeled sample data as a verification sample set, verifying the electricity utilization classification model established in step C1, and adjusting the threshold k of the model until its accuracy reaches 80%, thereby obtaining the electricity utilization classification model established by the K-nearest neighbor algorithm.
8. The intelligent power distribution method based on electric power big data as claimed in claim 6, wherein in step C2, the method for adjusting the threshold k of the electricity utilization classification model comprises:
step D1, using dynamic time warping to warp the time axis of the time series in the training sample set, and calculating the distance between each test point and the time series points in the training sample set;
step D2, selecting the k points with the smallest distance to the test point, counting the frequency of each category among these k points, and taking the category with the highest frequency as the category of the test point;
step D3, judging from the label of the test point whether its predicted category is correct and recording the result; continuing to test the model with the verification set until verification is finished; then judging whether the accuracy of the electricity utilization classification model reaches 80%; if so, taking the current value of k as the threshold of the model; otherwise, setting k = k - 1 and returning to step D1, until the accuracy reaches 80% and the adjustment of the threshold k is complete.
CN202210086499.7A 2022-01-25 2022-01-25 Intelligent power distribution method based on electric power big data Pending CN114519651A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210086499.7A CN114519651A (en) 2022-01-25 2022-01-25 Intelligent power distribution method based on electric power big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210086499.7A CN114519651A (en) 2022-01-25 2022-01-25 Intelligent power distribution method based on electric power big data

Publications (1)

Publication Number Publication Date
CN114519651A true CN114519651A (en) 2022-05-20

Family

ID=81596036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210086499.7A Pending CN114519651A (en) 2022-01-25 2022-01-25 Intelligent power distribution method based on electric power big data

Country Status (1)

Country Link
CN (1) CN114519651A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115982577A (en) * 2023-03-20 2023-04-18 山东华网合众信息技术有限公司 Intelligent electricity consumption real-time monitoring method and system
CN115982577B (en) * 2023-03-20 2023-09-08 山东华网合众信息技术有限公司 Intelligent electricity utilization real-time monitoring method and system
CN116307295A (en) * 2023-05-22 2023-06-23 南京宝能科技有限公司 Intelligent energy digital management system and method applied to cloud platform
CN116307295B (en) * 2023-05-22 2023-08-04 南京宝能科技有限公司 Intelligent energy digital management system and method applied to cloud platform

Similar Documents

Publication Publication Date Title
WO2022135265A1 (en) Failure warning and analysis method for reservoir dispatching rules under effects of climate change
CN114519651A (en) Intelligent power distribution method based on electric power big data
CN104809658B (en) A kind of rapid analysis method of low-voltage distribution network taiwan area line loss
CN111784093B (en) Enterprise reworking auxiliary judging method based on power big data analysis
CN111680764B (en) Industry reworking and production-resuming degree monitoring method
CN109657891B (en) Load characteristic analysis method based on self-adaptive k-means + + algorithm
CN110119948B (en) Power consumer credit evaluation method and system based on time-varying weight dynamic combination
CN108805213B (en) Power load curve double-layer spectral clustering method considering wavelet entropy dimensionality reduction
CN111242161B (en) Non-invasive non-resident user load identification method based on intelligent learning
CN111160617A (en) Power daily load prediction method and device
CN112819299A (en) Differential K-means load clustering method based on center optimization
CN113515512A (en) Quality control and improvement method for industrial internet platform data
WO2020024444A1 (en) Group performance grade recognition method and apparatus, and storage medium and computer device
CN113935557A (en) Same-mode energy consumption big data prediction method based on deep learning
CN116821832A (en) Abnormal data identification and correction method for high-voltage industrial and commercial user power load
CN110610121A (en) Small-scale source load power abnormal data identification and restoration method based on curve clustering
CN115018200A (en) Power load prediction method and system based on deep learning and considering multiple influence factors
CN111126499A (en) Secondary clustering-based power consumption behavior pattern classification method
CN111046913A (en) Load abnormal value identification method
CN112508254B (en) Method for determining investment prediction data of transformer substation engineering project
CN114266321A (en) Weak supervision fuzzy clustering algorithm based on unconstrained prior information mode
CN109858667A (en) It is a kind of based on thunder and lightning weather to the short term clustering method of loading effects
CN107274025B (en) System and method for realizing intelligent identification and management of power consumption mode
CN117076691A (en) Commodity resource knowledge graph algorithm model oriented to intelligent communities
CN116470491A (en) Photovoltaic power probability prediction method and system based on copula function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination