Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a load identification method of a baseline load model based on a Bayesian classification method.
The purpose of the invention can be realized by the following technical scheme:
A load identification method of a baseline load model based on a Bayesian classification method comprises the following steps:
Step 1: determining whether any two given user characteristic quantities are statistically correlated, and excluding redundant characteristic quantities;
Step 2: forming a training set from the user characteristic quantities that remain after the correlation check and the exclusion of redundant characteristic quantities;
Step 3: calculating the prior probability of each data class in the training set and from it obtaining the corresponding Bayesian probability;
Step 4: calculating the baseline load of the prediction day from the Bayesian probabilities combined with the electrical load of the prediction day;
Step 5: identifying the peak-shifting potential load by using the obtained baseline load.
Further, step 1 specifically includes: using correlation analysis to identify whether any two given characteristic quantities are statistically correlated and excluding redundant characteristic quantities; if the correlation coefficient is greater than a set value, the two characteristic quantities are strongly correlated and one of them is deleted.
Further, the formula for calculating the correlation coefficient is as follows:
$$r = \frac{\sum_{i}(y_i-\bar{y})(z_i-\bar{z})}{\sqrt{\sum_{i}(y_i-\bar{y})^2\,\sum_{i}(z_i-\bar{z})^2}}$$
where $r$ denotes the correlation coefficient, $y_i$ denotes the observed value of the characteristic quantity $y$ on the $i$-th day, $z_i$ denotes the observed value of the characteristic quantity $z$ on the $i$-th day, $\bar{y}$ denotes the average value of $y$ over all observation days, and $\bar{z}$ denotes the average value of $z$ over all observation days.
Further, step 2 specifically includes: forming a prediction tuple X = {x1, x2, …, xn} from the measured values of the user characteristic quantities on the prediction day, and dividing the measured values of the user characteristic quantities on the observation days into a number of training tuples, so that the training tuples and their associated class labels form a training set D.
Further, the formula for calculating the prior probability in step 3 is as follows:
$$P(C_i) = \frac{|C_{i,D}|}{|D|}$$
where $|C_{i,D}|$ denotes the number of training tuples of class $C_i$ in the training set, $|D|$ denotes the total number of training tuples in the training set, and $P(C_i)$ denotes the prior probability of class $C_i$.
Further, the Bayesian probability in step 3 is calculated as:
$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$
where $P(C_i \mid X)$ denotes the Bayesian probability of class $C_i$, $P(X)$ denotes the prior probability of the prediction tuple, and $P(X \mid C_i)$ denotes the inverse probability of class $C_i$.
Further, the calculation formula of the baseline load of the prediction day in step 4 is as follows:
$$c_s(t) = \sum_{i=1}^{m} P(C_i \mid X)\, c_i(t)$$
where $c_s(t)$ denotes the baseline load of the prediction day, $c_i(t)$ denotes the load of class $C_i$, and $m$ is the number of classes.
Further, the user characteristic quantities in step 1 consist of the factors that influence the peak-shifting potential of power distribution and consumption, which include: the required level of power supply reliability, load density, peak power consumption ratio, peak-shifting cost, load rate and annual maximum load utilization hours.
Compared with the prior art, the invention has the following advantages:
(1) The method of the invention comprises five steps. Step 1: determining whether any two given user characteristic quantities are statistically correlated, and excluding redundant characteristic quantities. Step 2: forming a training set from the user characteristic quantities that remain after the correlation check and the exclusion of redundant characteristic quantities. Step 3: calculating the prior probability of each data class in the training set and from it obtaining the corresponding Bayesian probability. Step 4: calculating the baseline load of the prediction day from the Bayesian probabilities combined with the electrical load of the prediction day. Step 5: identifying the peak-shifting potential load by using the obtained baseline load. This avoids the defect that a single prediction-day baseline load is insufficient for identifying the peak-shifting potential load.
(2) In the method of the invention, the user characteristic quantities consist of the factors that influence the peak-shifting potential of power distribution and consumption, namely the required level of power supply reliability, load density, peak power consumption ratio, peak-shifting cost, load rate and annual maximum load utilization hours. As a result, the baseline load data obtained by the method are well diversified.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
The basic principles relevant to the invention are introduced first.
The factors that influence the peak-shifting potential of power distribution and consumption include:
(1) Required level of power supply reliability
Power supply reliability refers to the ability of a power system to supply power continuously. The power supply reliability requirement has a relatively large influence on peak-shifting regulation and is therefore an important factor in evaluating peak-shifting potential. The power consumption behavior and power supply reliability requirements of various industries are analyzed below.
1) Light industry
Light industry covers many sectors, such as textiles, papermaking, bicycles, tobacco and food. Load characteristics differ considerably among users, and their power supply reliability requirements are not all the same. The textile industry and similar sectors basically run continuously in three shifts and have a high load rate; large and medium-sized textile factories are supplied at 10 kV or higher, suffer large economic losses from power failures, and therefore have high power supply reliability requirements.
2) Heavy industry class
Most heavy industry users are continuous-production enterprises in industries such as metallurgy, machinery, energy (electric power, petroleum, coal, natural gas and the like), chemicals and building materials. Their load rate is high and basically stable throughout the year. Compared with other types of users, heavy industry users consume more power, are larger in scale and connect to the grid at a higher voltage level. According to investigation, a great number of heavy industry users are primary or secondary loads; even a short power failure can cause not only great economic loss but also casualties, so the power supply reliability requirement is extremely high.
3) Administrative offices
Administrative office load consists mainly of the lighting, air conditioning and office equipment loads of government offices, organizations, enterprises and public institutions. The load follows working hours closely: it is relatively steady after work starts each day, and generally no obvious valley appears before work ends. The load of administrative office users exceeds 50% of the maximum load for only about one third of each day, and there is essentially no activity on holidays, so the load rate is low. Administrative office load shows obvious seasonal fluctuation and is very sensitive to temperature and climate change, but it accounts for a small proportion of the total system load and is generally present only in the daytime, so it has little influence on the load rate of the power grid. The power consumption of administrative office users is generally small and the supply voltage is mostly 0.4 kV; the power supply reliability requirements may differ greatly depending on the nature of the user.
4) Commercial services
The load of commercial service users consists mainly of lighting, air conditioning and power loads, and can be roughly divided into two types: heat-preservation load and business load. The heat-preservation load is a small share and has little influence on the load rate. The business load is the part that determines the load rate, and it differs among user types according to industry characteristics and hours of use. The load of commercial and financial users is relatively stable after business opens each day, and no obvious valley appears before business closes. The load of service users, especially catering users, is concentrated mostly at noon and in the evening and is basically synchronous with the load of residential users. The load of commercial service users is not a large share, but because its peak coincides with the system peak, its influence on the load rate is not negligible. In addition, commercial and service businesses extend their opening hours on holidays, so their load is often one of the important factors shaping holiday load characteristics. The loads of commercial service users are mostly secondary loads, and a standby power supply is generally installed to meet the power supply reliability requirement.
5) Culture and entertainment
The load characteristics of cultural and entertainment users are similar to those of service users: load is concentrated at noon or in the evening, and seasonal fluctuation is obvious. The total load of cultural and entertainment users is not large, but holiday load is usually high because of their service hours. Because these venues serve many people, they have special power supply reliability requirements in order to avoid accidents involving casualties. Some important cinemas, conference centers, large entertainment venues and the like are generally treated as secondary loads, and power failure is not allowed under normal operating conditions.
6) Sports category
The loads of sports users are mainly lighting and air conditioning loads, with essentially no power loads. Except during major events these users rarely run at full load, so the maximum load utilization hours are very low and the load is clearly unbalanced from month to month. Because the total load is small, sports users have little influence on the load characteristics of the power grid, and their power supply reliability requirement is not especially high.
7) Education and scientific research
Education and scientific research users include higher education institutions, secondary and primary schools, scientific research institutions, survey and design institutions, training institutions and the like. Because of their schedules, the load of universities and of secondary and primary schools is generally low during winter and summer vacations and on holidays. Only higher education institutions and scientific research institutions have a large load; because the total load of this group is small, its influence on the load characteristics of the distribution network is mostly not very obvious. The power supply reliability requirements of education and scientific research users differ; the power supplies of important laboratories, such as those for biological products and culture media, are generally classified as primary loads.
8) Medical and health
The average daily load rate of medical and health users differs little from month to month, and annual fluctuation is small. Most users operate around the clock, but because of work and rest patterns the load at night is clearly lower than during the day. Medical and health users have different power supply reliability requirements; generally, the higher the grade of the hospital, the higher the requirement.
(2) Load density
Load density is a quantitative measure of how densely load is distributed, expressed as the average power demand per square kilometer. Selecting regions with high load density for peak-shifting regulation allows a large amount of load to be adjusted. There are special cases, however. A city center, for example, has a high load density and, judged by peak-shifting effect alone, would be a good candidate area. But city centers are densely populated and concentrate administrative, economic, commercial and transport functions, so their requirements on power supply quality and reliability are high, and they are generally not given priority for peak-shifting regulation.
(3) Peak power consumption ratio
The peak power consumption ratio refers to the proportion of the power consumption in the peak period to the power consumption in the whole day in the typical load curve of the working day of the user. The higher the peak power utilization ratio, the more concentrated the power utilization period of the user, and the better the peak-shifting regulation effect.
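The ratio defined above can be illustrated with a minimal Python sketch. The function name `peak_consumption_ratio`, the 24-value hourly profile and the choice of peak hours are assumptions made for illustration; the source does not say which hours count as the peak period.

```python
def peak_consumption_ratio(hourly_load_kw, peak_hours):
    """Share of a day's energy that is consumed during the peak period.

    hourly_load_kw: 24 average hourly loads (kW) of a typical working day.
    peak_hours: hour indices (0-23) treated as the peak period; which hours
                these are is an assumption supplied by the caller.
    """
    if len(hourly_load_kw) != 24:
        raise ValueError("expected 24 hourly values")
    total = sum(hourly_load_kw)
    if total <= 0:
        raise ValueError("total daily consumption must be positive")
    # With one-hour intervals, summing kW values gives energy in kWh.
    peak = sum(hourly_load_kw[h] for h in set(peak_hours))
    return peak / total


# Hypothetical profile: 10 kW all day except 30 kW from 18:00 to 22:00.
profile = [10.0] * 18 + [30.0] * 4 + [10.0] * 2
ratio = peak_consumption_ratio(profile, range(18, 22))
```

A user whose ratio is close to 1 concentrates consumption in the peak period, which is the case the text describes as responding best to peak-shifting regulation.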
(4) Peak-shifting cost
Peak-shifting cost refers to the economic loss incurred when a load is subjected to peak-shifting regulation. Considering peak-shifting cost alone, areas, groups and users with low peak-shifting cost can be given a suitably higher priority for peak-shifting regulation, within a certain range.
(5) Load factor and annual maximum load utilization hours
The load rate is the ratio of the average load to the maximum load over a specified period (day, month or year), expressed as a percentage. The user load rate and the annual maximum load utilization hours are important indexes for grading users. Industrial users, for example, have a high load rate: for some enterprises running continuously in three shifts it exceeds 95% and the annual maximum load utilization hours exceed 6000 h, so their power supply reliability requirement is extremely high. By contrast, some users (such as a gymnasium) rarely run at full load except on a few occasions, so their load rate and annual maximum load utilization hours are very low, and even a local fault in the distribution network does not affect reliable supply to these users.
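As a hedged illustration of these two indexes, the sketch below computes the load rate as average load divided by maximum load, and the annual maximum load utilization hours as annual energy divided by annual maximum load. The second definition is the usual one in power engineering and is not spelled out in the text. The function names and the hourly series are hypothetical.

```python
def load_rate(loads_kw):
    """Average load divided by maximum load over the given period."""
    peak = max(loads_kw)
    if peak <= 0:
        raise ValueError("maximum load must be positive")
    return (sum(loads_kw) / len(loads_kw)) / peak


def annual_max_load_utilization_hours(hourly_loads_kw):
    """Annual energy (kWh) divided by annual maximum load (kW).

    Assumes one value per hour, so the sum of the series is energy in kWh.
    """
    peak = max(hourly_loads_kw)
    if peak <= 0:
        raise ValueError("maximum load must be positive")
    return sum(hourly_loads_kw) / peak


# Hypothetical user: 100 kW for half the year, 50 kW for the other half.
year = [100.0] * 4380 + [50.0] * 4380
rate = load_rate(year)                            # 0.75
hours = annual_max_load_utilization_hours(year)   # 6570.0
```

Over a full year of hourly data the two indexes are proportional (hours = load rate × 8760), which is why the text treats them together.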
The invention uses a Bayesian classification method to establish the correlation between user behavior patterns and characteristic quantities (temperature and day type), and calculates the baseline load of the prediction day on that basis.
The calculation flow for the baseline load of the prediction day is shown in Fig. 1. As the figure shows, the calculation is divided into three stages:
(1) Preparation stage:
Correlation analysis is used to identify whether any two given characteristic quantities are statistically correlated, so that redundant characteristic quantities can be excluded:
$$r = \frac{\sum_{i}(y_i-\bar{y})(z_i-\bar{z})}{\sqrt{\sum_{i}(y_i-\bar{y})^2\,\sum_{i}(z_i-\bar{z})^2}}$$
where $r$ denotes the correlation coefficient, $y_i$ denotes the observed value of the characteristic quantity $y$ on the $i$-th day, $z_i$ denotes the observed value of the characteristic quantity $z$ on the $i$-th day, $\bar{y}$ denotes the average value of $y$ over all observation days, and $\bar{z}$ denotes the average value of $z$ over all observation days.
If the correlation coefficient $r$ is greater than 0.6, the two characteristic quantities are strongly correlated, and one of them is deleted.
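The redundancy check can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the names `pearson_r` and `drop_redundant`, the rule of keeping the earlier-listed quantity of a correlated pair, and the treatment of a constant series as uncorrelated are all assumptions. The comparison follows the text literally (r > 0.6); a variant that also removes strongly negatively correlated quantities would compare `abs(r)` instead.

```python
import math


def pearson_r(y, z):
    """Pearson correlation coefficient of two equally long observation series."""
    n = len(y)
    y_mean = sum(y) / n
    z_mean = sum(z) / n
    cov = sum((a - y_mean) * (b - z_mean) for a, b in zip(y, z))
    var_y = sum((a - y_mean) ** 2 for a in y)
    var_z = sum((b - z_mean) ** 2 for b in z)
    denom = math.sqrt(var_y * var_z)
    if denom == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / denom


def drop_redundant(features, threshold=0.6):
    """Return the names of the characteristic quantities that are kept.

    features: dict mapping a name to its list of daily observations.
    A quantity is dropped when its r with an already kept one exceeds threshold.
    """
    kept = []
    for name, series in features.items():
        if all(pearson_r(series, features[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical observations over four days.
observations = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.5],   # nearly proportional to "a"
    "c": [4.0, 1.0, 3.0, 2.0],
}
kept = drop_redundant(observations)
```

Here `b` is removed because it is almost a multiple of `a`, while `c` is retained.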
A prediction tuple X = {x1, x2, …, xn} is formed from the measured values of the user characteristic quantities on the prediction day, and the measured values of the user characteristic quantities on the observation days are divided into a number of training tuples, so that the training tuples and their associated class labels form a training set D.
(2) Classifier training stage:
Calculate the prior probability $P(C_i)$ of each class, $i = 1, 2, \ldots, m$. If the prior probabilities of the classes are unknown, assume $P(C_1) = P(C_2) = \cdots = P(C_m)$; otherwise:
$$P(C_i) = \frac{|C_{i,D}|}{|D|}$$
where $|C_{i,D}|$ denotes the number of training tuples of class $C_i$ in the training set, $|D|$ denotes the total number of training tuples in the training set, and $P(C_i)$ denotes the prior probability of class $C_i$.
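A minimal sketch of the prior calculation is given below, covering both the frequency-based case and the equal-prior fallback. The function name `class_priors`, the `known` flag and the example labels are hypothetical.

```python
from collections import Counter


def class_priors(labels, known=True):
    """Prior probability of each class label in the training set.

    labels: class labels of the training tuples, one per tuple.
    known:  if False, the priors are taken as equal for all classes.
    """
    classes = sorted(set(labels))
    if not known:
        return {c: 1.0 / len(classes) for c in classes}
    counts = Counter(labels)   # |C_{i,D}| for each class
    total = len(labels)        # |D|
    return {c: counts[c] / total for c in classes}


# Hypothetical class labels of four training tuples.
priors = class_priors(["A", "A", "B", "C"])
uniform = class_priors(["A", "A", "B", "C"], known=False)
```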
Assuming that, given the class label of a tuple, the characteristic quantities are conditionally independent of one another, i.e. there is no dependency between them, then:
$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$
where $x_k$ denotes the value of the $k$-th characteristic quantity of the tuple $X$. For each characteristic quantity, consider whether it is discrete or continuous:
If it is discrete:
$$P(x_k \mid C_i) = \frac{|C_{i,D}^{x_k}|}{|C_{i,D}|}$$
where $|C_{i,D}^{x_k}|$ denotes the number of training tuples of class $C_i$ whose $k$-th characteristic quantity takes the value $x_k$.
If it is continuous, the characteristic quantity is generally assumed to follow a Gaussian distribution:
$$P(x_k \mid C_i) = \frac{1}{\sqrt{2\pi}\,\sigma_{C_i}} \exp\!\left(-\frac{(x_k-\mu_{C_i})^2}{2\sigma_{C_i}^2}\right)$$
where $\mu_{C_i}$ and $\sigma_{C_i}$ are respectively the mean and standard deviation of the values of the $k$-th characteristic quantity over the training tuples of class $C_i$.
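The class-conditional probability described above can be sketched as a product of per-quantity terms, using relative frequencies for discrete quantities and a Gaussian density for continuous ones. The names `gaussian_pdf` and `likelihood`, the example data and the use of the population standard deviation (dividing by the number of tuples) are assumptions; the text does not say which estimator is used.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def likelihood(x, class_tuples, continuous):
    """P(X | C_i) under the conditional-independence assumption.

    x:            prediction tuple.
    class_tuples: training tuples belonging to class C_i.
    continuous:   set of positions in the tuple that hold continuous values.
    """
    n_class = len(class_tuples)
    p = 1.0
    for k, xk in enumerate(x):
        column = [t[k] for t in class_tuples]
        if k in continuous:
            mu = sum(column) / n_class
            # Population standard deviation: an assumption of this sketch.
            sigma = math.sqrt(sum((v - mu) ** 2 for v in column) / n_class)
            if sigma == 0:
                raise ValueError("zero variance in a continuous quantity")
            p *= gaussian_pdf(xk, mu, sigma)
        else:
            # Fraction of class tuples whose k-th value equals x_k.
            p *= sum(1 for v in column if v == xk) / n_class
    return p


# Hypothetical class with (day type, temperature in deg C) tuples.
class_a = [("workday", 20.0), ("workday", 22.0), ("holiday", 24.0)]
p_x_given_a = likelihood(("workday", 22.0), class_a, continuous={1})
```

A discrete value that never occurs in a class makes the whole product zero. Practical classifiers often add smoothing to avoid this, but that is outside what the text describes.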
(3) Application stage:
According to Bayes' theorem:
$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$
where $P(C_i \mid X)$ denotes the Bayesian probability of class $C_i$, $P(X)$ denotes the prior probability of the prediction tuple, and $P(X \mid C_i)$ denotes the inverse probability of class $C_i$.
After $P(C_i \mid X)$ has been calculated, the baseline load of the prediction day is:
$$c_s(t) = \sum_{i=1}^{m} P(C_i \mid X)\, c_i(t)$$
where $c_s(t)$ denotes the baseline load of the prediction day and $c_i(t)$ denotes the load of class $C_i$.
The traditional Bayesian classification method selects only the class with the largest $P(X \mid C_i)P(C_i)$ as the class label of the prediction day. Here all classes are considered together, which avoids a prediction-day baseline load that rests on a single class.
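The application stage can be sketched as follows. The sketch assumes the baseline load is the sum of the class load curves weighted by their Bayesian probabilities, which is how the statement that all classes are considered together is read here. It also assumes that P(X) is obtained by summing P(X | C_i)P(C_i) over the classes. The function names and the numbers are hypothetical.

```python
def posteriors(priors, likelihoods):
    """Bayesian probability P(C_i | X) for every class.

    priors:      dict class -> P(C_i)
    likelihoods: dict class -> P(X | C_i)
    """
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # P(X), by the law of total probability
    if evidence == 0:
        raise ValueError("the prediction tuple has zero probability in every class")
    return {c: j / evidence for c, j in joint.items()}


def baseline_load(post, class_loads):
    """Posterior-weighted combination of the class load curves.

    class_loads: dict class -> list of load values, one per time step;
                 all curves are assumed to have the same length.
    """
    steps = len(next(iter(class_loads.values())))
    return [sum(post[c] * class_loads[c][t] for c in post) for t in range(steps)]


# Hypothetical two-class example with two time steps.
post = posteriors({"A": 0.5, "B": 0.5}, {"A": 0.3, "B": 0.1})
curve = baseline_load(post, {"A": [100.0, 200.0], "B": [200.0, 400.0]})
```

In this example class A receives weight 0.75 and class B weight 0.25, so the baseline curve lies between the two class curves instead of coinciding with the most probable one.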
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.