CN113344073A

CN113344073A - Daily load curve clustering method and system based on fusion evolution algorithm

Info

Publication number: CN113344073A
Application number: CN202110613240.9A
Authority: CN
Inventors: 覃日升; 李胜男; 况华; 姜訸; 段锐敏
Original assignee: Electric Power Research Institute of Yunnan Power Grid Co Ltd
Current assignee: Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority date: 2021-06-02
Filing date: 2021-06-02
Publication date: 2021-09-03

Abstract

The application belongs to the technical field of power system analysis and control and provides a daily load curve clustering method and system based on a fusion evolutionary algorithm. The method comprises the following steps: acquiring and preprocessing the loads of the original daily load curves of a plurality of users; randomly generating S genetic algorithm individuals to form an initial population, wherein each individual consists of cluster center codes; updating the cluster center codes of each individual according to a fuzzy C-means algorithm, obtaining a membership matrix of the data objects relative to the cluster centers, selecting a fitness function, and calculating the individual fitness values; repeating the genetic operations on the individuals in the population and the update of their cluster center codes until the current annealing temperature is less than the annealing termination temperature; and taking the cluster center code values of the individual with the maximum fitness value in the population as the final cluster centers. The method improves the accuracy of daily load curve clustering.

Description

Daily load curve clustering method and system based on fusion evolution algorithm

Technical Field

The application belongs to the technical field of power system analysis and control, and particularly relates to a daily load curve clustering method and system based on a fusion evolutionary algorithm.

Background

With the continued rollout of smart grid construction, data acquisition equipment can collect the electricity consumption of a large number of users. Different types of users, such as residential, commercial, industrial and agricultural users, differ greatly in their consumption patterns, and patterns may differ even among users of the same type. Against this background of big data, applying effective data mining techniques to finely partition massive load curve data from different user types, and thereby to uncover the relationships among different load types and the corresponding consumption behaviors and characteristics, provides useful guidance for load forecasting, power grid planning and demand-side response.

Traditional daily load curve clustering methods mainly comprise direct clustering methods based on the original load data and indirect clustering methods based on dimension reduction. A direct clustering method generally normalizes the load value at each sampling time point of the daily load curve and then clusters the curves with an algorithm such as K-means, fuzzy C-means or the self-organizing map. The fuzzy C-means algorithm is a partition-based fuzzy clustering method that classifies samples objectively by describing each sample's degree of membership in the different classes. The algorithm is simple and searches quickly, but its clustering result depends heavily on the initial cluster centers, so it easily converges to a local extremum and becomes trapped in a locally optimal solution, which biases the daily load curve classification result.

To overcome these defects, the fuzzy C-means algorithm can be combined with a genetic algorithm. For example, a hybrid genetic clustering algorithm has been proposed in which a fuzzy C-means operator replaces the crossover operator of the genetic algorithm; alternatively, the cluster centers can be encoded as floating-point numbers, with floating-point crossover and mutation operators designed to improve search efficiency.

However, when the number of samples, the sample dimension and the number of classes are large, these algorithms often converge prematurely to a local optimum. Once the algorithm has converged prematurely, a small mutation probability alone is rarely enough to escape the local extremum. Moreover, the evolutionary algorithm may degenerate during evolution, which leads to an excessive number of iterations and low clustering accuracy.

Disclosure of Invention

The application provides a daily load curve clustering method and system based on a fusion evolutionary algorithm, with the aim of achieving higher clustering accuracy.

The first aspect of the application provides a daily load curve clustering method based on a fusion evolutionary algorithm, which comprises the following steps:

step 1: acquiring the loads of the original daily load curves of a plurality of users and preprocessing them to obtain a load data set, wherein the load data set consists of a plurality of data objects, and one data object represents the load of one original daily load curve;

step 2: initializing the current annealing temperature and the annealing termination temperature of a simulated annealing algorithm, initializing the genetic algorithm individuals based on the number C of clusters into which the load data set is pre-classified, and randomly generating S individuals to form an initial population, wherein each individual consists of C cluster center codes;

step 3: updating the cluster center codes of each individual according to a fuzzy C-means algorithm, obtaining a membership matrix of the data objects relative to the cluster centers, selecting a fitness function, and calculating the individual fitness values;

step 4: performing genetic operations on the individuals in the population according to a genetic algorithm, updating the cluster center codes of the individuals in the population according to the current annealing temperature and the individual fitness values, and then cooling the current annealing temperature to obtain the updated current annealing temperature;

step 5: repeating step 4 until the updated current annealing temperature is less than the annealing termination temperature;

step 6: selecting the individual with the maximum fitness value in the population as the optimal individual, and taking the cluster center code values of the optimal individual as the final C cluster centers.

Optionally, the step of acquiring the loads of the original daily load curves of the plurality of users and preprocessing them to obtain the load data set specifically comprises:

searching for missing and abnormal data in the load of each original daily load curve, wherein the abnormal data comprise sudden drops, sudden rises and negative values, and, if the abnormal data in the load of an original daily load curve reach 10% of the number of collected samples, removing that curve, so as to obtain first spare load data;

supplementing and correcting the missing and abnormal data in the first spare load data to obtain second spare load data;

and normalizing the second spare load data by a linear function normalization method to obtain the load data set.

Optionally, the missing and abnormal data in the first spare load data are supplemented and corrected by the barycentric Lagrange interpolation method, which defines the Lagrange interpolation basis functions in terms of barycentric weights.

Optionally, each individual consists of C cluster center codes, and binary coding is adopted.

Optionally, when the cluster center codes of each individual are updated according to the fuzzy C-means algorithm, the membership function used to obtain the membership matrix of the data objects relative to the cluster centers is:

$$u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( d_{ik} / d_{ij} \right)^{2/(r-1)}}$$

where $u_{ik}$ is the degree of membership of the i-th data object in the k-th class, c is the number of cluster centers, $d_{ik}$ is the distance from the i-th data object to the k-th class, and r is the fuzzy index;

the fitness function used to calculate the individual fitness values is:

$$f_i = \mathrm{ranking}(J_r)$$

where $f_i$ is the fitness value of the i-th individual in the population, ranking() is a ranking-based fitness assignment function, and $J_r$ is:

$$J_r(U, V) = \sum_{i=1}^{n-x} \sum_{k=1}^{c} u_{ik}^{r} \, d_{ik}^{2}$$

where U is the membership matrix, V is the cluster center matrix, c is the number of cluster centers, n-x is the number of data objects, $u_{ik}$ is the degree of membership of the i-th data object in the k-th class, $d_{ik}$ is the distance from the i-th data object to the k-th class, and r is the fuzzy index.

Optionally, the step of performing genetic operations on the individuals in the population according to the genetic algorithm, updating the cluster center codes of the individuals in the population according to the current annealing temperature and the individual fitness values, and then cooling the current annealing temperature to obtain the updated current annealing temperature specifically comprises:

step 601: performing selection, crossover and mutation on the individuals in the population to generate new individuals;

step 602: calculating the fitness value of each new individual; if the fitness value of the new individual is greater than or equal to that of the corresponding individual in the population, replacing the cluster center code values of the individual in the population with those of the new individual; if the fitness value of the new individual is less than that of the individual in the population, replacing the cluster center code values of the individual in the population with those of the new individual with a preset probability, wherein the preset probability is:

$$P = \exp\left( -\frac{f_i - f_i'}{T} \right)$$

where $f_i'$ is the fitness value of the new individual, $f_i$ is the fitness value of the individual in the population, and T is the current annealing temperature;

step 603: repeating steps 601 to 602 until the number of loops exceeds the set maximum number of loops;

step 604: updating the current annealing temperature according to a cooling formula, wherein the cooling formula is:

$$T_{i+1} = p \times T_i$$

where $T_{i+1}$ is the updated current annealing temperature, $T_i$ is the current annealing temperature, and p is the cooling coefficient.

Optionally, in the step of performing selection, crossover and mutation on the individuals in the population, the selection operator is stochastic universal sampling, the crossover operator is a multi-point crossover operator, and the mutation operator is a basic bit mutation operator.

The second aspect of the present application provides a daily load curve clustering system based on a fusion evolutionary algorithm, which is used to execute the daily load curve clustering method provided by the first aspect of the present application and comprises:

the data acquisition module is used for acquiring the original daily load curve loads of a plurality of users;

the data preprocessing module is used for preprocessing the original daily load curve load to obtain a load data set;

the initialization module is used for initializing the current annealing temperature and the annealing termination temperature of the simulated annealing algorithm, initializing the genetic algorithm individuals, and randomly generating S individuals to form an initial population, wherein each individual consists of cluster center codes;

the fuzzy C-means module is used for updating the cluster center codes of each individual, obtaining a membership matrix of the data objects relative to the cluster centers, selecting a fitness function and calculating the individual fitness values;

the genetic annealing module is used for performing genetic operations on the individuals in the population, updating the cluster center codes of the individuals in the population according to the individual fitness values, then cooling the current annealing temperature to obtain the updated current annealing temperature, and judging whether the updated current annealing temperature is less than the annealing termination temperature;

and the screening module is used for selecting the individual with the maximum fitness value in the population as the optimal individual, the cluster center code values of the optimal individual being taken as the final C cluster centers.

Optionally, the data preprocessing module specifically includes:

the data cleaning unit is used for searching for missing and abnormal data in the load of each original daily load curve, wherein the abnormal data comprise sudden drops, sudden rises and negative values, and, if the abnormal data in the load of an original daily load curve reach 10% of the number of collected samples, removing that curve, so as to obtain first spare load data;

the data interpolation unit is used for supplementing and correcting the missing and abnormal data in the first spare load data by the barycentric Lagrange interpolation method to obtain second spare load data;

and the data normalization unit is used for normalizing the second spare load data by a linear function normalization method to obtain the load data set.

Optionally, the genetic annealing module specifically comprises:

the genetic operation unit is used for performing selection, crossover and mutation on the individuals in the population to generate new individuals;

the fitness screening unit is used for calculating the fitness value of each new individual; if the fitness value of the new individual is greater than or equal to that of the corresponding individual in the population, the cluster center code values of the individual in the population are replaced with those of the new individual; if the fitness value of the new individual is less than that of the individual in the population, the cluster center code values of the individual in the population are replaced with those of the new individual with a preset probability, wherein the preset probability is:

$$P = \exp\left( -\frac{f_i - f_i'}{T} \right)$$

the loop judging module is used for judging whether the number of loops exceeds the set maximum number of loops;

the annealing unit is used for updating the current annealing temperature according to a cooling formula, wherein the cooling formula is:

$$T_{i+1} = p \times T_i$$

The application provides a daily load curve clustering method and system based on a fusion evolutionary algorithm, the system being used to execute the steps of the method: acquiring the loads of the original daily load curves of a plurality of users and preprocessing them to obtain a load data set; initializing the current annealing temperature and the annealing termination temperature of a simulated annealing algorithm, initializing the genetic algorithm individuals, and randomly generating S individuals to form an initial population, wherein each individual consists of cluster center codes; updating the cluster center codes of each individual according to a fuzzy C-means algorithm, obtaining a membership matrix of the data objects relative to the cluster centers, selecting a fitness function, and calculating the individual fitness values; repeating the genetic operations on the individuals in the population and the update of their cluster center codes until the updated current annealing temperature is less than the annealing termination temperature; and selecting the individual with the maximum fitness value in the population as the optimal individual, the cluster center code values of the optimal individual being taken as the final C cluster centers.

By combining the fuzzy C-means algorithm, the genetic algorithm and the simulated annealing algorithm to update the cluster centers, the daily load curve clustering method based on the fusion evolutionary algorithm effectively avoids becoming trapped in a local optimum and improves the accuracy of daily load curve clustering.

Drawings

In order to explain the technical solution of the present application more clearly, the drawings used in the embodiments are briefly described below. Those skilled in the art can obtain other drawings from these drawings without creative effort.

Fig. 1 is a schematic flow chart of a daily load curve clustering method based on a fusion evolutionary algorithm provided in an embodiment of the present application.

Fig. 2 is a schematic structural diagram of a daily load curve clustering system based on a fusion evolutionary algorithm provided in the embodiment of the present application.

Fig. 3 compares the data before and after filling by the barycentric Lagrange interpolation method according to the embodiment of the present application.

Fig. 4 shows the normalized daily load curves of different industries according to the embodiment of the present application.

Fig. 5 shows the daily load curve clustering result according to the embodiment of the present application.

Fig. 6 shows the daily load curve classification result according to the embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments.

Fig. 1 is a schematic flow chart of the daily load curve clustering method based on a fusion evolutionary algorithm provided in the embodiment of the present application. The method comprises steps 1 to 6.

Step 1, acquiring original daily load curve loads of a plurality of users, preprocessing the original daily load curve loads, and acquiring a load data set.

Many clustering algorithms are sensitive to abnormal and missing data, and abnormal data in the load data may degrade the clustering result and cause misclassification, so the load data need to be preprocessed. Missing and abnormal load data arise for several reasons: first, damage to or malfunction of the data measuring device may cause data loss; second, normal grid activities such as line maintenance or security inspection may also cause load data to be lost; and third, transmission of the load data from the measuring device to the analysis end may introduce anomalies such as outliers, noise and deviations. Methods for preprocessing abnormal and missing load data include the empirical correction method, the threshold discrimination method, the curve replacement method and the like.

In the embodiment of the present application, 412 original daily load curves are selected; each curve has 96 load sampling points at 15-minute intervals. After the load data are acquired, they are preprocessed through steps S101 to S103.

Step S101, searching for missing and abnormal data in the load of each original daily load curve, wherein the abnormal data comprise sudden drops, sudden rises and negative values. If the abnormal data of an original daily load curve reach 10% of the number of sampling points, the curve is considered invalid and is removed, so as to obtain the first spare load data. For example, if n original daily load curves are acquired and x of them are invalid, n-x valid curves remain, and they form an (n-x) × m matrix as the first spare load data, where m is the number of sampling points per curve.

Among the 412 original daily load curves of the embodiment of the present application, 12 curves have missing or abnormal data at 10 or more sampling points; these 12 curves are removed before the next operation.
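The cleaning rule of step S101 can be sketched as follows. This is a minimal illustration in plain Python, not the applicant's implementation: the source does not say how a sudden drop or rise is detected, so the `jump_ratio` test below is a hypothetical rule, and the function names are illustrative.

```python
import math

def count_bad_points(curve, jump_ratio=3.0):
    """Count missing, negative, or abruptly changing samples in one curve.

    jump_ratio is a hypothetical threshold (not given in the source): a
    sample is flagged as a sudden rise or drop when it and the previous
    valid sample differ by more than this factor.
    """
    bad = 0
    prev = None  # last sample accepted as valid
    for value in curve:
        if value is None or math.isnan(value) or value < 0:
            bad += 1
            continue
        if prev is not None and max(value, prev) > jump_ratio * min(value, prev):
            bad += 1
            continue
        prev = value
    return bad

def drop_invalid_curves(curves, max_bad_fraction=0.10):
    """Keep only curves whose bad samples stay below 10% of their length."""
    return [c for c in curves
            if count_bad_points(c) < max_bad_fraction * len(c)]
```

With 96 samples per curve the limit is 9.6, so a curve with 9 flagged samples is kept and a curve with 10 is removed.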

Step S102, supplementing and correcting the missing and abnormal data in the first spare load data by the barycentric Lagrange interpolation method to obtain second spare load data.

The Lagrange interpolation method is convenient in theoretical analysis because its formula is neat and compact, but whenever an interpolation point is added or removed, the corresponding basis functions must be recalculated, which is cumbersome. The embodiment of the present application therefore adopts the barycentric Lagrange interpolation method; Fig. 3 compares the data before and after filling by this method. The barycentric Lagrange interpolation method does not need to recompute the basis functions during interpolation, which greatly reduces the amount of calculation. Given the k+1 nodes $(x_0, y_0), (x_1, y_1), \ldots, (x_k, y_k)$ of a polynomial function, the barycentric weights are defined as:

$$w_j = \frac{1}{\prod_{i=0,\, i \neq j}^{k} (x_j - x_i)}, \quad j = 0, 1, \ldots, k$$

the Lagrange basis functions can then be defined as:

$$l_j(x) = l(x) \, \frac{w_j}{x - x_j}$$

where $l(x) = (x - x_0)(x - x_1) \cdots (x - x_k)$;

The barycentric Lagrange interpolation formula is:

$$L(x) = \frac{\sum_{j=0}^{k} \frac{w_j}{x - x_j} \, y_j}{\sum_{j=0}^{k} \frac{w_j}{x - x_j}}$$

where $x_j$ is the independent variable and $y_j$ is the dependent variable at the j-th node.
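A minimal sketch of barycentric Lagrange interpolation in plain Python is given below, assuming the standard form of the method: weights w_j = 1 / prod_{i != j} (x_j - x_i), and an interpolant equal to the ratio of sum_j w_j y_j / (x - x_j) to sum_j w_j / (x - x_j). The function names are illustrative and do not come from the source.

```python
def barycentric_weights(xs):
    """Return w_j = 1 / prod_{i != j} (x_j - x_i) for distinct nodes xs."""
    weights = []
    for j, xj in enumerate(xs):
        prod = 1.0
        for i, xi in enumerate(xs):
            if i != j:
                prod *= (xj - xi)
        weights.append(1.0 / prod)
    return weights

def barycentric_interpolate(xs, ys, x, weights=None):
    """Evaluate the polynomial through the points (xs, ys) at position x."""
    if weights is None:
        weights = barycentric_weights(xs)
    num = 0.0
    den = 0.0
    for xj, yj, wj in zip(xs, ys, weights):
        if x == xj:
            # x coincides with a node: return the node value directly
            # and avoid a division by zero.
            return yj
        term = wj / (x - xj)
        num += term * yj
        den += term
    return num / den

# Usage: fill a missing sample at index 2 from four valid neighbours.
filled = barycentric_interpolate([0, 1, 3, 4], [1.0, 3.0, 13.0, 21.0], 2)
```

The weights depend only on the node positions, so they can be computed once and reused for every filled sample that shares the same nodes. The source does not state how many neighbouring samples serve as nodes; using a few valid neighbours, as in the usage line, is an assumption.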

Step S103, carrying out normalization processing on the second spare load data to obtain a load data set, wherein the load data set is composed of a plurality of data objects, and one data object represents a daily load curve.

Daily load curves differ in scale because user attributes differ, and data normalization eliminates the influence of scale so that the analysis result is more accurate. Fig. 4 shows the normalized daily load curves of different industries according to the embodiment of the present application. A linear function normalization method is adopted, and the linear function normalization formula is:

$$X_i' = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}}$$

where $X_i'$ is the normalized load data, $X_i$ is the load data before normalization, $X_{\min}$ is the minimum load data before normalization, and $X_{\max}$ is the maximum load data before normalization.
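Linear function (min-max) normalization can be sketched as below, assuming the usual form X' = (X - X_min) / (X_max - X_min). Applying it to each curve separately and mapping a completely flat curve to zeros are assumptions of this sketch; the source does not specify either point.

```python
def min_max_normalize(curve):
    """Scale one daily load curve linearly into the range [0, 1]."""
    lo, hi = min(curve), max(curve)
    if hi == lo:
        # Flat curve: the formula would divide by zero, so map it to
        # zeros (an assumption of this sketch).
        return [0.0 for _ in curve]
    return [(value - lo) / (hi - lo) for value in curve]

# Usage: every curve of the second spare load data is scaled separately.
load_data_set = [min_max_normalize(c) for c in [[2.0, 4.0, 6.0], [10.0, 30.0, 20.0]]]
```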

Step 2, initializing the current annealing temperature and the annealing termination temperature of the simulated annealing algorithm, initializing the genetic algorithm individuals based on the number C of clusters into which the load data set is pre-classified, and randomly generating S individuals to form an initial population.

The simulated annealing algorithm is a greedy-style search that, unlike a purely greedy algorithm, accepts worse solutions with a preset probability. By combining the simulated annealing algorithm with the genetic algorithm, the embodiment of the present application can escape locally optimal solutions and search for the globally optimal solution, avoiding the premature convergence in the early stage and the evolutionary stagnation in the late stage that are typical of the genetic algorithm. In the present embodiment, the initial value of the current annealing temperature is set to 100, and the annealing termination temperature is set to 1.

There are various methods for determining the number of clusters into which the load data set is pre-classified, such as the gap statistic method, the elbow criterion and validity function indices. The user samples selected in the embodiment of the present application come from industry, commerce, agriculture and education, so the number of clusters is initially set to C = 4. Four 96-dimensional data objects are randomly generated as the 4 initial cluster centers of the load data set and are binary coded; these 4 randomly generated data objects together represent one individual of the genetic algorithm. This random generation is repeated S times to form the initial population.
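One possible binary coding of an individual is sketched below. The values C = 4 and 96 dimensions come from the embodiment; the number of bits per coordinate (`BITS`) and the restriction of coordinates to [0, 1] (which fits normalized loads) are assumptions of this sketch, since the source does not describe the coding in detail.

```python
import random

C = 4      # number of clusters (from the embodiment)
DIM = 96   # sampling points per daily load curve (from the embodiment)
BITS = 10  # bits per coordinate: hypothetical, not given in the source

def decode(individual):
    """Decode a bit list into C cluster centers with coordinates in [0, 1]."""
    scale = (1 << BITS) - 1
    centers = []
    for k in range(C):
        center = []
        for d in range(DIM):
            start = (k * DIM + d) * BITS
            value = 0
            for bit in individual[start:start + BITS]:
                value = (value << 1) | bit
            center.append(value / scale)
        centers.append(center)
    return centers

def encode(centers):
    """Quantize center coordinates (clipped to [0, 1]) back into a bit list."""
    scale = (1 << BITS) - 1
    bits = []
    for center in centers:
        for v in center:
            q = round(min(max(v, 0.0), 1.0) * scale)
            bits.extend((q >> s) & 1 for s in range(BITS - 1, -1, -1))
    return bits

def init_population(size, seed=0):
    """Randomly generate `size` individuals, each coding C cluster centers."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(C * DIM * BITS)]
            for _ in range(size)]
```

`encode` is what a fuzzy C-means update would use to write refined centers back into an individual's code before the genetic operators run again.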

Step 3, updating the cluster center codes of each individual according to the fuzzy C-means algorithm, obtaining a membership matrix of the data objects relative to the cluster centers, selecting a fitness function, and calculating the individual fitness values.

Fuzzy C-means clustering combines fuzzy theory with clustering theory: it divides the samples into several fuzzy groups according to their intrinsic regularities, measures the similarity between samples with a distance function, and obtains an optimal clustering result by means of mathematical programming. The step of updating the cluster center codes of each individual according to the fuzzy C-means algorithm, obtaining the membership matrix of the data objects relative to the cluster centers, selecting a fitness function, and calculating the individual fitness values specifically comprises steps S301 to S303.

Step S301, calculating the degree of membership of each data object relative to each cluster center according to the initial cluster centers of the individual and the membership function, so as to obtain the membership matrix of each individual, and updating the cluster center codes of the individual according to the membership matrix and the cluster center update formula, wherein the membership function is:

$$u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( d_{ik} / d_{ij} \right)^{2/(r-1)}}$$

where $u_{ik}$ is the degree of membership of the i-th data object in the k-th class, c is the number of cluster centers, $d_{ik}$ is the Euclidean distance from the i-th data object to the k-th cluster center, and r is the fuzzy index; the degrees of membership must satisfy:

$$\sum_{k=1}^{c} u_{ik} = 1$$

the cluster center update formula is:

$$V_k = \frac{\sum_{i=1}^{n-x} u_{ik}^{r} \, z_i}{\sum_{i=1}^{n-x} u_{ik}^{r}}$$

where $V_k$ is the k-th cluster center, $z_i$ is the i-th data object, $u_{ik}$ is the degree of membership of the i-th data object in the k-th class, and n-x is the number of data objects.
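The membership calculation and cluster center update of step S301 can be sketched as follows, assuming the standard fuzzy C-means formulas: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(r-1)) and V_k = sum_i u_ik^r z_i / sum_i u_ik^r. The default r = 2 is a common choice but is not stated in the source, and the function names are illustrative.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def memberships(data, centers, r=2.0):
    """Return U, where U[i][k] is the membership of data[i] in cluster k."""
    power = 2.0 / (r - 1.0)
    U = []
    for z in data:
        dists = [euclidean(z, v) for v in centers]
        if any(d == 0.0 for d in dists):
            # The object lies exactly on a center: share full membership
            # among the coinciding centers to avoid dividing by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            U.append([h / sum(hits) for h in hits])
            continue
        U.append([1.0 / sum((dk / dj) ** power for dj in dists)
                  for dk in dists])
    return U

def update_centers(data, U, r=2.0):
    """Return the centers V_k as membership-weighted means of the data."""
    dim = len(data[0])
    centers = []
    for k in range(len(U[0])):
        weights = [row[k] ** r for row in U]
        total = sum(weights)
        centers.append([sum(w * z[d] for w, z in zip(weights, data)) / total
                        for d in range(dim)])
    return centers
```

Each row of `U` sums to 1, which is the membership constraint stated above.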

Step S302, obtaining the membership matrix of the data objects relative to the updated cluster centers according to the membership function.

After the cluster centers are updated, the degree of membership of each data object relative to the cluster centers changes, so the degrees of membership are recalculated to obtain the membership matrix corresponding to the updated cluster centers.

Step S303, selecting a fitness function and calculating the individual fitness values, wherein the fitness function is:

$$f_i = \mathrm{ranking}(J_r)$$

$$J_r(U, V) = \sum_{i=1}^{n-x} \sum_{k=1}^{c} u_{ik}^{r} \, d_{ik}^{2}$$

where U is the membership matrix, V is the cluster center matrix, c is the number of cluster centers, n-x is the number of data objects, $u_{ik}$ is the degree of membership of the i-th data object in the k-th class, $d_{ik}$ is the distance from the i-th data object to the k-th class, and r is the fuzzy index.

$J_r$ is calculated from the updated cluster centers and the corresponding membership matrix, and the $J_r$ values are ranked to obtain the fitness values.
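The fitness calculation can be sketched as below, assuming the standard fuzzy C-means objective J_r = sum_i sum_k u_ik^r d_ik^2. The source only says that fitness is obtained by ranking the J_r values; the linear ranking rule and the selective pressure of 2 used here are assumptions of this sketch.

```python
def objective(data, centers, U, r=2.0):
    """Fuzzy C-means objective: sum over objects and clusters of u^r * d^2."""
    total = 0.0
    for z, row in zip(data, U):
        for v, u in zip(centers, row):
            d2 = sum((a - b) ** 2 for a, b in zip(z, v))
            total += (u ** r) * d2
    return total

def ranking_fitness(objectives, pressure=2.0):
    """Linear ranking: the individual with the smallest J_r gets the
    largest fitness. With pressure 2 the values run from 0 to 2."""
    n = len(objectives)
    if n == 1:
        return [1.0]
    # Sort individual indices from worst (largest J_r) to best.
    order = sorted(range(n), key=lambda i: objectives[i], reverse=True)
    fitness = [0.0] * n
    for pos, idx in enumerate(order):
        fitness[idx] = 2.0 - pressure + 2.0 * (pressure - 1.0) * pos / (n - 1)
    return fitness
```

Because the fitness depends only on rank, a single individual with a very small J_r cannot dominate selection, which is one common reason for preferring ranking over raw objective values.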

Step 4, performing genetic operations on the individuals in the population according to the genetic algorithm, updating the cluster center codes of the individuals in the population according to the current annealing temperature and the individual fitness values, and cooling the current annealing temperature to obtain the updated current annealing temperature.

Step 401: performing selection, crossover and mutation on the individuals in the population to generate new individuals.

Selection, crossover and mutation are performed on the S individuals in the population to generate new individuals in one-to-one correspondence with the individuals in the population. The selection operator is stochastic universal sampling; the crossover operator is a multi-point crossover operator with a crossover probability of 0.7; and the mutation operator is a basic bit mutation operator with a mutation probability of 0.01.
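The three genetic operators can be sketched on bit-list individuals as follows. The crossover probability 0.7 and mutation probability 0.01 come from the embodiment; the number of crossover points (`n_points = 2`), the random pairing of parents and all function names are assumptions of this sketch.

```python
import random

def sus_select(fitness, n_select, rng):
    """Stochastic universal sampling with n_select equally spaced pointers."""
    total = sum(fitness)
    if total <= 0:
        return [rng.randrange(len(fitness)) for _ in range(n_select)]
    step = total / n_select
    start = rng.uniform(0, step)
    chosen, i, cumulative = [], 0, fitness[0]
    for j in range(n_select):
        pointer = start + j * step
        while cumulative <= pointer and i < len(fitness) - 1:
            i += 1
            cumulative += fitness[i]
        chosen.append(i)
    return chosen

def multipoint_crossover(a, b, n_points, rng):
    """Swap alternating segments of a and b between random cut points."""
    points = sorted(rng.sample(range(1, len(a)), n_points))
    child_a, child_b, swap, prev = [], [], False, 0
    for p in points + [len(a)]:
        seg_a, seg_b = a[prev:p], b[prev:p]
        if swap:
            seg_a, seg_b = seg_b, seg_a
        child_a.extend(seg_a)
        child_b.extend(seg_b)
        swap, prev = not swap, p
    return child_a, child_b

def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def make_offspring(population, fitness, rng, pc=0.7, pm=0.01, n_points=2):
    """Produce one new individual per population member."""
    parents = [population[i]
               for i in sus_select(fitness, len(population), rng)]
    rng.shuffle(parents)
    children = []
    for i in range(0, len(parents) - 1, 2):
        a, b = parents[i], parents[i + 1]
        if rng.random() < pc:
            a, b = multipoint_crossover(a, b, n_points, rng)
        children += [a, b]
    if len(parents) % 2:
        children.append(parents[-1])  # odd one out passes through unpaired
    return [bit_flip_mutation(c, pm, rng) for c in children]
```

Stochastic universal sampling draws a single random offset and then steps through the cumulative fitness at equal intervals, so each individual is selected a number of times close to its expected share.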

Step 402: calculating the fitness value of each new individual; if the fitness value of the new individual is greater than or equal to that of the corresponding individual in the population, replacing the cluster center code values of the individual in the population with those of the new individual; if the fitness value of the new individual is less than that of the individual in the population, replacing the cluster center code values of the individual in the population with those of the new individual with a preset probability, wherein the preset probability is:

$$P = \exp\left( -\frac{f_i - f_i'}{T} \right)$$

where $f_i'$ is the fitness value of the new individual, $f_i$ is the fitness value of the individual in the population, and T is the current annealing temperature.
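The replacement rule of step 402 can be sketched as below. The sketch assumes the usual Metropolis form of the acceptance probability, exp(-(f_i - f_i') / T), for a new individual whose fitness is lower than that of the individual it would replace; this form is an assumption, and the function names are illustrative.

```python
import math
import random

def accept_probability(f_old, f_new, temperature):
    """Probability of replacing an individual with its new counterpart."""
    if f_new >= f_old:
        return 1.0  # a new individual that is at least as fit always wins
    # Worse new individual: assumed Metropolis form exp(-(f_old - f_new) / T).
    return math.exp((f_new - f_old) / temperature)

def anneal_replace(population, fitness, offspring, offspring_fitness,
                   temperature, rng):
    """Replace individuals in place; return how many were replaced."""
    replaced = 0
    for i in range(len(population)):
        p = accept_probability(fitness[i], offspring_fitness[i], temperature)
        if rng.random() < p:
            population[i] = offspring[i]
            fitness[i] = offspring_fitness[i]
            replaced += 1
    return replaced
```

At a high temperature a worse individual is accepted almost as readily as a better one, which keeps the population diverse; as the temperature falls, the rule approaches strict elitist replacement.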

Step 403: repeating steps 401 to 402 until the number of loops exceeds the set maximum number of loops.

Step 404: updating the current annealing temperature according to a cooling formula, wherein the cooling formula is:

$$T_{i+1} = p \times T_i$$

where $T_{i+1}$ is the updated current annealing temperature, $T_i$ is the current annealing temperature, and p is the cooling coefficient. In the present embodiment, the cooling coefficient is 0.8.
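With the embodiment's settings (initial temperature 100, termination temperature 1, cooling coefficient 0.8), the geometric cooling rule fixes how many times step 4 is executed. The short sketch below lists the temperature levels; the function name is illustrative.

```python
def cooling_schedule(t_start=100.0, t_end=1.0, p=0.8):
    """List the temperatures at which step 4 runs under T_next = p * T."""
    temps = []
    t = t_start
    while t >= t_end:  # stop once the updated temperature drops below t_end
        temps.append(t)
        t *= p
    return temps

levels = cooling_schedule()
print(len(levels))  # prints 21
```

Since 100 × 0.8^20 ≈ 1.15 and 100 × 0.8^21 ≈ 0.92, step 4 runs at 21 temperature levels before the termination condition is met.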

Step 5, repeating step 4 until the updated current annealing temperature is less than the annealing termination temperature.

Step 6, selecting the individual with the maximum fitness value in the population as the optimal individual.

The cluster center code values of the optimal individual are decoded to obtain the final 4 cluster centers, and the data objects are classified according to Euclidean distance. Fig. 5 shows the daily load curve clustering result of the embodiment of the present application, and Fig. 6 shows the daily load curve classification result. User category I shows a double-peak pattern, and most such users belong to the education sector: the load starts to rise in the early morning, is high in the morning and afternoon, and dips slightly at noon during the rest period. User category II is mostly agricultural: most agricultural units operate in the daytime, with irregular operating periods of short duration, for example irrigation and livestock raising. User category III is mostly commercial: the load starts at 9 a.m. and continues until 10 p.m., which matches commercial operating hours. User category IV shows a flat-peak pattern and mostly comprises industrial users: these users operate various large machines, their load is high, and they need to run all day to remain profitable, so the load stays at a high, flat level.
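The final classification by Euclidean distance can be sketched as a nearest-center assignment. The function name and the toy data are illustrative only.

```python
def assign_clusters(data, centers):
    """Label each data object with the index of its nearest cluster center."""
    labels = []
    for z in data:
        # Squared distances give the same nearest center as distances.
        d2 = [sum((a - b) ** 2 for a, b in zip(z, v)) for v in centers]
        labels.append(d2.index(min(d2)))
    return labels

# Usage with two toy 2-dimensional centers.
labels = assign_clusters([[0.0, 0.1], [0.9, 1.0], [0.2, 0.0]],
                         [[0.0, 0.0], [1.0, 1.0]])
```

In the embodiment the same assignment would run over the 96-dimensional normalized curves and the 4 decoded cluster centers.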

The second aspect of the embodiments of the present application provides a daily load curve clustering system based on a fusion evolutionary algorithm, where the daily load curve clustering system based on the fusion evolutionary algorithm is used to execute the daily load curve clustering method based on the fusion evolutionary algorithm provided by the first aspect of the embodiments of the present application, and for details disclosed in the clustering system provided by the second aspect of the embodiments of the present application, please refer to the daily load curve clustering method based on the fusion evolutionary algorithm provided by the first aspect of the embodiments of the present application.

Fig. 2 is a schematic structural diagram of the daily load curve clustering system based on a fusion evolutionary algorithm provided in the embodiment of the present application. The system comprises a data acquisition module, a data preprocessing module, an initialization module, a fuzzy C-means module, a genetic annealing module and a screening module.

The data acquisition module is used for acquiring the load data of the original daily load curves of a plurality of users.

The data preprocessing module is used for preprocessing the load data of the original daily load curves to obtain a load data set.

The initialization module is used for initializing the current annealing temperature and the annealing termination temperature of the simulated annealing algorithm, initializing the genetic algorithm individuals and randomly generating S individuals to form an initialization population, wherein the individuals are formed by cluster center codes.
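As an illustration of how an initial population of individuals made of cluster center codes might be generated, the sketch below uses the binary coding named in the claims. Everything beyond that is an assumption made for the example: the 8-bit resolution `BITS`, load values normalized to [0, 1], the 24-point curve length, and the helper names `random_individual`, `decode` and `encode` are hypothetical and not taken from the application.

```python
import random

BITS = 8  # assumed resolution per coordinate (not specified in the source)

def random_individual(n_clusters, n_points, rng):
    """One individual: a flat bit list encoding n_clusters centers of n_points each."""
    return [rng.randint(0, 1) for _ in range(n_clusters * n_points * BITS)]

def decode(individual, n_clusters, n_points):
    """Turn a bit list back into n_clusters centers with coordinates in [0, 1]."""
    scale = (1 << BITS) - 1
    values = []
    for start in range(0, len(individual), BITS):
        chunk = individual[start:start + BITS]
        values.append(int("".join(map(str, chunk)), 2) / scale)
    return [values[i * n_points:(i + 1) * n_points] for i in range(n_clusters)]

def encode(centers):
    """Inverse of decode: quantize coordinates in [0, 1] to BITS-bit codes."""
    scale = (1 << BITS) - 1
    bits = []
    for center in centers:
        for value in center:
            q = round(min(max(value, 0.0), 1.0) * scale)
            bits.extend(int(b) for b in format(q, f"0{BITS}b"))
    return bits

rng = random.Random(0)
S, C, POINTS = 6, 4, 24  # population size, cluster count, samples per curve (assumed)
population = [random_individual(C, POINTS, rng) for _ in range(S)]
first_centers = decode(population[0], C, POINTS)
```

A flat bit list keeps crossover and mutation simple, at the cost of quantizing each center coordinate to 256 levels under the assumed 8-bit resolution.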

The fuzzy C-means module is used for updating the cluster center code of each individual according to the fuzzy C-means algorithm, obtaining a membership matrix of the data objects relative to the cluster centers, selecting a fitness function and calculating the individual fitness value.
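One fuzzy C-means update of the kind this module performs can be sketched as follows. The sketch uses the standard textbook FCM rules (membership from relative distances, centers as membership-weighted means) with a fuzzifier m = 2; the value of m, the function names and the toy data are assumptions, and the exact membership and fitness formulas of the application are not reproduced here.

```python
import math

def fcm_memberships(data, centers, m=2.0):
    """Membership u[i][k] of data object k in cluster i (standard FCM rule)."""
    c, n = len(centers), len(data)
    u = [[0.0] * n for _ in range(c)]
    for k, x in enumerate(data):
        d = [math.dist(x, v) for v in centers]
        if 0.0 in d:
            # x coincides with a center: full membership there, zero elsewhere.
            u[d.index(0.0)][k] = 1.0
            continue
        for i in range(c):
            u[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
    return u

def fcm_centers(data, u, m=2.0):
    """Recompute each center as the u^m-weighted mean of the data objects."""
    dim = len(data[0])
    centers = []
    for row in u:
        w = [uik ** m for uik in row]
        total = sum(w)
        centers.append([sum(wk * x[j] for wk, x in zip(w, data)) / total for j in range(dim)])
    return centers

def fcm_objective(data, centers, u, m=2.0):
    """J = sum_i sum_k u_ik^m * ||x_k - v_i||^2 (smaller is better)."""
    return sum(
        (u[i][k] ** m) * math.dist(x, centers[i]) ** 2
        for i in range(len(centers))
        for k, x in enumerate(data)
    )

# Toy data (assumption): two obvious groups of 2-point "curves".
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
centers = [[0.2, 0.2], [0.8, 0.8]]
u = fcm_memberships(data, centers)
new_centers = fcm_centers(data, u)
```

For a fixed membership matrix, the weighted-mean update never increases the objective J, which is why a fitness value derived from J can be used to compare individuals.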

The genetic annealing module is used for performing genetic operations on the individuals in the population, updating the cluster center codes of the individuals in the population according to the individual fitness values, then performing a cooling operation on the current annealing temperature to obtain the updated current annealing temperature, and judging whether the updated current annealing temperature is less than the annealing termination temperature.

Further, the data preprocessing module specifically includes:

The data cleaning unit is used for searching for missing data and abnormal data in the load data of each original daily load curve, where the abnormal data comprises data with a sudden drop, a sudden increase or a negative value; if the missing and abnormal data of an original daily load curve reaches 10% of its acquired samples, that original daily load curve is removed, to obtain first spare load data.

The data interpolation unit is used for supplementing and correcting the missing data and abnormal data in the first spare load data by the barycentric Lagrange interpolation method, to obtain second spare load data.
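The two preprocessing units can be illustrated with the sketch below, which discards a curve once the share of bad samples reaches 10% and otherwise repairs the bad samples by barycentric Lagrange interpolation. Several details are assumptions made only for this example: missing samples are represented by `None`, only negative values are treated as abnormal (detecting sudden drops or increases needs a threshold that this section does not give), and each gap is interpolated from the `window` nearest valid samples on each side to keep the polynomial degree low. All function names are hypothetical.

```python
def barycentric_weights(xs):
    """w_j = 1 / prod_{k != j} (x_j - x_k) for distinct nodes xs."""
    weights = []
    for j, xj in enumerate(xs):
        prod = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                prod *= xj - xk
        weights.append(1.0 / prod)
    return weights

def barycentric_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolant through (xs, ys) at x (barycentric form)."""
    w = barycentric_weights(xs)
    num = den = 0.0
    for xj, yj, wj in zip(xs, ys, w):
        if x == xj:
            return yj  # x is a node: return the sample itself
        t = wj / (x - xj)
        num += t * yj
        den += t
    return num / den

def clean_curve(curve, max_bad_ratio=0.10, window=2):
    """Return a repaired copy of `curve`, or None if the curve must be discarded."""
    bad = [i for i, v in enumerate(curve) if v is None or v < 0]
    if len(bad) >= max_bad_ratio * len(curve):
        return None  # bad samples reach 10% of the curve: remove it
    good = [i for i in range(len(curve)) if i not in bad]
    repaired = list(curve)
    for i in bad:
        # Assumed local stencil: nearest valid samples on both sides of the gap.
        left = [g for g in good if g < i][-window:]
        right = [g for g in good if g > i][:window]
        nodes = left + right
        repaired[i] = barycentric_interpolate(nodes, [curve[g] for g in nodes], i)
    return repaired

# Toy curve (assumption): 24 hourly samples on a straight line, one missing.
curve = [2.0 * i for i in range(24)]
curve[5] = None
repaired = clean_curve(curve)
```

The barycentric form evaluates the same polynomial as the classical Lagrange formula but reuses the weights, so each additional evaluation point costs only a pass over the nodes.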

Further, the genetic annealing module specifically comprises:

The genetic operation unit is used for performing the genetic operations of selection, crossover and mutation on the individuals in the population to generate new individuals.

The loop judging module is used for judging whether the number of loop iterations is greater than the set maximum number of iterations.
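The selection, crossover and mutation operations can be sketched on bit-list individuals as follows. The sketch assumes the operators named in the claims (a sampling-based selection operator, a multi-point crossover operator and a bit-level mutation operator) correspond to the textbook stochastic universal sampling, multi-point crossover and bit-flip mutation; the function names, the mutation rate and the requirement of strictly positive fitness values are assumptions of this example.

```python
import random

def sus_select(population, fitness, n, rng):
    """Stochastic universal sampling: n equally spaced pointers on the fitness wheel.

    Assumes every fitness value is positive.
    """
    step = sum(fitness) / n
    start = rng.uniform(0, step)
    chosen, cumulative, i = [], fitness[0], 0
    for k in range(n):
        pointer = start + k * step
        while cumulative < pointer and i < len(population) - 1:
            i += 1
            cumulative += fitness[i]
        chosen.append(population[i])
    return chosen

def multipoint_crossover(a, b, n_points, rng):
    """Swap alternating segments between two equal-length bit lists."""
    cuts = sorted(rng.sample(range(1, len(a)), n_points))
    child1, child2 = list(a), list(b)
    swap, prev = False, 0
    for cut in cuts + [len(a)]:
        if swap:
            child1[prev:cut], child2[prev:cut] = b[prev:cut], a[prev:cut]
        swap, prev = not swap, cut
    return child1, child2

def bit_mutation(individual, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in individual]

rng = random.Random(42)
parents = sus_select([[0] * 10, [1] * 10], [1.0, 3.0], 2, rng)
child1, child2 = multipoint_crossover([0] * 10, [1] * 10, 2, rng)
mutant = bit_mutation(child1, 0.05, rng)
```

Stochastic universal sampling draws a single random offset for all pointers, so an individual's number of copies stays close to its expected share of the total fitness.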

T_{i+1} = p × T_i;
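The cooling formula above, combined with the stopping rule that the loop ends once the current annealing temperature falls below the annealing termination temperature, can be sketched as a simple generator. The numeric values (start temperature 100, termination temperature 1, cooling coefficient 0.8) and the name `cooling_schedule` are illustrative assumptions, as is the restriction of p to the interval (0, 1), which is what makes the temperature decrease.

```python
def cooling_schedule(t_start, t_end, p):
    """Yield T_0, T_1, ... with T_{i+1} = p * T_i while T_i >= t_end."""
    if not 0.0 < p < 1.0:
        raise ValueError("cooling coefficient p must lie in (0, 1)")
    t = t_start
    while t >= t_end:
        yield t
        t *= p

# Illustrative values only (not taken from the application).
temperatures = list(cooling_schedule(t_start=100.0, t_end=1.0, p=0.8))
print(len(temperatures), temperatures[-1])
```

Each yielded temperature corresponds to one pass of genetic operations and cluster center updates, so the cooling coefficient and the two temperatures together bound the number of outer iterations.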

The present application has been described in detail with reference to specific embodiments and illustrative examples, but the description is not intended to limit the application. Those skilled in the art will appreciate that various equivalent substitutions, modifications or improvements may be made to the presently disclosed embodiments and implementations thereof without departing from the spirit and scope of the present disclosure, and these fall within the scope of the present disclosure. The protection scope of this application is subject to the appended claims.

Claims

1. A daily load curve clustering method based on a fusion evolutionary algorithm is characterized by comprising the following steps:

step 1: acquiring load data of original daily load curves of a plurality of users, preprocessing the load data to obtain a load data set, wherein the load data set consists of a plurality of data objects, and one data object represents the load of one original daily load curve;

2. The daily load curve clustering method based on the fusion evolutionary algorithm as claimed in claim 1, wherein the step of obtaining the original daily load curve loads of a plurality of users, preprocessing the original daily load curve loads to obtain a load data set specifically comprises:

searching missing and abnormal data in the load data of each original daily load curve, wherein the abnormal data comprises data with sudden drop, sudden increase or negative value, and if the load abnormal data of the original daily load curve reaches 10% of the acquisition amount, removing the original daily load curve to obtain first spare load data;

3. The daily load curve clustering method based on the fusion evolutionary algorithm as claimed in claim 2, wherein the supplementing and correcting of missing and abnormal data in the first spare load data is performed by a barycentric Lagrange interpolation method, and the barycentric Lagrange interpolation method defines the Lagrange interpolation basis functions according to barycentric weights.

4. The daily load curve clustering method based on the fusion evolutionary algorithm as claimed in claim 1, wherein each individual encodes C cluster centers using binary coding.

5. The daily load curve clustering method based on the fusion evolutionary algorithm as claimed in claim 1, wherein the cluster center code of each individual is updated according to the fuzzy C-means algorithm, and the membership function adopted to obtain the membership matrix of the data object relative to the cluster center is:

f_i = ranking(J_r);

6. The daily load curve clustering method based on the fusion evolutionary algorithm as claimed in claim 1, wherein the steps of performing genetic operation on individuals in a population according to the genetic algorithm, updating cluster center codes of the individuals in the population according to the current annealing temperature and the individual fitness value, and performing a cooling operation on the current annealing temperature to obtain the updated current annealing temperature specifically comprise:

step 604: updating the current annealing temperature according to a cooling formula to obtain the updated current annealing temperature, wherein the cooling formula is as follows:

T_{i+1} = p × T_i;

7. The daily load curve clustering method based on the fusion evolutionary algorithm as claimed in claim 6, wherein in the step of performing selection, crossover and mutation on the individuals in the population, the selection operator adopts stochastic universal sampling, the crossover operator adopts a multi-point crossover operator, and the mutation operator adopts a basic bit mutation operator.

8. A daily load curve clustering system based on a fusion evolutionary algorithm, wherein the daily load curve clustering system based on the fusion evolutionary algorithm is used for executing the daily load curve clustering method based on the fusion evolutionary algorithm in any one of claims 1 to 7, and comprises:

the fuzzy C-means module is used for updating the cluster center code of each individual according to the fuzzy C-means algorithm, obtaining a membership matrix of the data objects relative to the cluster centers, selecting a fitness function and calculating an individual fitness value;

the genetic annealing module is used for carrying out genetic operation on individuals in the population, updating the cluster center codes of the individuals in the population according to the individual fitness value, then carrying out cooling operation on the current annealing temperature to obtain the updated current annealing temperature, and judging whether the current annealing temperature is less than the annealing termination temperature or not;

9. The daily load curve clustering system based on the fusion evolutionary algorithm as claimed in claim 8, wherein the data preprocessing module specifically comprises:

the data cleaning unit is used for searching missing data and abnormal data in the load data of each original daily load curve, the abnormal data comprise data with sudden drop, sudden increase or negative values, and if the load abnormal data of the original daily load curve reach 10% of the collection amount, the original daily load curve is removed to obtain first spare load data;

10. The daily load curve clustering system based on the fusion evolutionary algorithm as claimed in claim 8, wherein the genetic annealing module specifically comprises:

T_{i+1} = p × T_i;