CN110570090A

CN110570090A - Load identification method of baseline load model based on Bayesian classification method

Info

Publication number: CN110570090A
Application number: CN201910738256.5A
Authority: CN
Inventors: 庞天宇; 解梁军; 郭乃网; 宋岩; 沈泉江; 陈睿; 杨栋; 陈开能; 吴元庆
Original assignee: Xinghuan Information Technology (shanghai) Co Ltd; State Grid Shanghai Electric Power Co Ltd; East China Power Test and Research Institute Co Ltd
Current assignee: Xinghuan Information Technology (shanghai) Co Ltd; State Grid Shanghai Electric Power Co Ltd; East China Power Test and Research Institute Co Ltd
Priority date: 2019-08-12
Filing date: 2019-08-12
Publication date: 2019-12-13

Abstract

The invention relates to a load identification method using a baseline load model based on a Bayesian classification method, which comprises the following steps: step 1: determining whether any two given user characteristic quantities are statistically correlated, and excluding redundant characteristic quantities; step 2: forming a training set from the user characteristic quantities that have been checked for statistical correlation and from which redundant characteristic quantities have been eliminated; step 3: calculating the prior probability of each data category in the training set and from it the corresponding Bayesian probability; step 4: calculating the baseline load of the prediction day from the Bayesian probability combined with the electrical load for the prediction day; and step 5: identifying the peak-shifting potential load using the obtained baseline load. Compared with the prior art, the method avoids the drawback that a prediction-day baseline load drawn from a single category is insufficient for identifying the peak-shifting potential load, and it yields baseline load data with good diversity.

Description

Load identification method of baseline load model based on Bayesian classification method

Technical Field

The invention relates to the technical field of power grid peak shifting scheduling, in particular to a load identification method of a baseline load model based on a Bayesian classification method.

Background

At present, the imbalance between power supply and demand in China mainly takes the form of structural power shortage and time-period power shortage, and peak loads last only a short time. Traditional measures such as increasing installed capacity tend to raise power grid investment and leave installed capacity idle during off-peak periods. To address insufficient supply at peak load and time-period power shortage, power enterprises must carry out peak-shifting scheduling so that limited energy resources achieve the greatest effect. Peak-shifting scheduling remains a routine management task for closing supply-demand gaps, controlling power demand and keeping the order of power supply and consumption stable.

As the contradiction between power supply and demand intensifies and the structure of users' power consumption changes rapidly, the power load characteristics of every region have changed greatly: the maximum load keeps growing rapidly, the peak-valley difference keeps widening, the load rate and the annual maximum load utilization hours keep falling, and the supply-demand contradiction in peak periods grows sharper. These problems make peak regulation of the power grid increasingly difficult, so that power cuts, power rationing and forced peak-shifting occur frequently across the country and the grid fault rate rises. At the same time, users' requirements on power supply reliability, power quality and the service of the power supply industry keep rising. Together these pose a serious threat to the safe, stable and economic operation of the power system and create many difficulties for power market analysis, forecasting, planning and marketing. On the other hand, because power sector reform in China has progressed slowly, basic work that is increasingly important for the operation and planning of power enterprises, such as power market research, load characteristic analysis and research on optimal control of power loads, has not been carried out sufficiently or effectively. No benign interaction mechanism and no effective cooperation and communication have formed between power supply enterprises and power customers, which causes difficulties for the operation of the power system and the daily management of power supply enterprises.

Therefore, given the changing load characteristics of the power grid, an important task is to study the load characteristics of each power user thoroughly, so that each user's potential for peak shifting, valley filling and power balancing can be fully tapped according to that user's consumption characteristics.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a load identification method of a baseline load model based on a Bayesian classification method.

The purpose of the invention can be realized by the following technical scheme:

A load identification method using a baseline load model based on a Bayesian classification method comprises the following steps:

Step 1: determining whether any two given user characteristic quantities are statistically correlated, and excluding redundant characteristic quantities;

Step 2: forming a training set from the user characteristic quantities that have been checked for statistical correlation and from which redundant characteristic quantities have been eliminated;

Step 3: calculating the prior probability of each data category in the training set and from it the corresponding Bayesian probability;

Step 4: calculating the baseline load of the prediction day from the Bayesian probability combined with the electrical load for the prediction day;

Step 5: identifying the peak-shifting potential load using the obtained baseline load.

Further, step 1 specifically includes: using correlation analysis to identify whether any two given characteristic quantities are statistically correlated, and excluding redundant characteristic quantities; if the correlation coefficient obtained from the correlation analysis is greater than a set value, the two characteristic quantities are strongly correlated and one of them is deleted.

Further, the correlation coefficient is calculated as:

$$r = \frac{\sum_i (y_i - \bar{y})(z_i - \bar{z})}{\sqrt{\sum_i (y_i - \bar{y})^2 \, \sum_i (z_i - \bar{z})^2}}$$

where $r$ is the correlation coefficient, $y_i$ is the observed value of characteristic quantity $y$ on the $i$-th day, $z_i$ is the observed value of characteristic quantity $z$ on the $i$-th day, $\bar{y}$ is the average value of $y$ over all observation days, $\bar{z}$ is the average value of $z$ over all observation days, and the sums run over all observation days.
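As an illustration of the correlation coefficient defined above, the following minimal Python sketch computes r for two series of daily observations. The function name `pearson_r` and the sample values are hypothetical and are not taken from the invention.

```python
import math

def pearson_r(y, z):
    """Correlation coefficient r of two equally long series of daily observations."""
    if len(y) != len(z) or len(y) < 2:
        raise ValueError("need two series of equal length with at least 2 observations")
    y_mean = sum(y) / len(y)
    z_mean = sum(z) / len(z)
    # Numerator: sum of products of deviations from the two means.
    cov = sum((yi - y_mean) * (zi - z_mean) for yi, zi in zip(y, z))
    # Denominator terms: sums of squared deviations of each series.
    ss_y = sum((yi - y_mean) ** 2 for yi in y)
    ss_z = sum((zi - z_mean) ** 2 for zi in z)
    if ss_y == 0 or ss_z == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(ss_y * ss_z)

# Hypothetical observations of two characteristic quantities over five days.
load_rate = [0.62, 0.65, 0.70, 0.74, 0.80]
peak_ratio = [0.30, 0.33, 0.34, 0.39, 0.41]
r = pearson_r(load_rate, peak_ratio)
```

The value of `r` would then be compared with the set value to decide whether one of the two characteristic quantities is redundant.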

Further, step 2 specifically includes: forming a prediction tuple $X = \{x_1, x_2, \ldots, x_i\}$ from the measured values of the user characteristic quantities on the prediction day, and dividing the measured values of the user characteristic quantities on the observation days into a number of training tuples, so that the training tuples together with their associated class labels form a training set D.

Further, the prior probability in step 3 is calculated as:

$$P(C_i) = \frac{|C_{i,D}|}{|D|}$$

where $|C_{i,D}|$ is the number of training tuples of class $C_i$ in the training set, $|D|$ is the total number of training tuples in the training set, and $P(C_i)$ is the prior probability of class $C_i$.

Further, the Bayesian probability in step 3 is calculated as:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

where $P(C_i \mid X)$ is the Bayesian (posterior) probability of class $C_i$, $P(X)$ is the prior probability of the prediction tuple, and $P(X \mid C_i)$ is the inverse probability (likelihood) of class $C_i$.
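The following Python sketch illustrates how the Bayesian probability of each class could be obtained once the prior probabilities and the inverse probabilities are known. It assumes that P(X) is obtained by summing P(X | C_i) P(C_i) over all classes (the law of total probability), which the text does not state explicitly. The function name, class names and numbers are hypothetical.

```python
def posterior_probabilities(priors, likelihoods):
    """Return P(C_i | X) for every class from P(C_i) and P(X | C_i)."""
    # Joint term P(X | C_i) * P(C_i) for each class.
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    # Assumption: P(X) is the sum of the joint terms over all classes.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the prediction tuple has zero probability under every class")
    return {c: p / evidence for c, p in joint.items()}

# Hypothetical priors and likelihoods for two load-pattern classes.
priors = {"pattern_A": 0.75, "pattern_B": 0.25}
likelihoods = {"pattern_A": 0.2, "pattern_B": 0.6}
posterior = posterior_probabilities(priors, likelihoods)
```

In this example both joint terms equal 0.15, so each class receives a Bayesian probability of about 0.5 even though the priors differ.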

Further, the baseline load of the prediction day in step 4 is calculated as:

$$c_s(t) = \sum_i P(C_i \mid X)\, c_i(t)$$

where $c_s(t)$ is the baseline load of the prediction day, $c_i(t)$ is the load of class $C_i$, and the sum runs over all classes.
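A minimal Python sketch of step 4 follows. It assumes that the baseline load is the sum of the class loads weighted by their Bayesian probabilities, which is how the statement that all categories are considered together is read here. The function name, class names and load values are hypothetical.

```python
def baseline_load(posterior, class_loads):
    """Combine the load curves of all classes, weighted by P(C_i | X)."""
    lengths = {len(curve) for curve in class_loads.values()}
    if len(lengths) != 1:
        raise ValueError("all class load curves must cover the same time points")
    n_points = lengths.pop()
    # Assumption: c_s(t) is the posterior-weighted sum of c_i(t) over all classes.
    return [
        sum(posterior[c] * class_loads[c][t] for c in posterior)
        for t in range(n_points)
    ]

# Hypothetical posteriors and two-point load curves (kW) for two classes.
posterior = {"pattern_A": 0.5, "pattern_B": 0.5}
class_loads = {"pattern_A": [100.0, 200.0], "pattern_B": [300.0, 400.0]}
c_s = baseline_load(posterior, class_loads)
```

With equal probabilities the result lies halfway between the two class curves at every time point.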

Further, the user characteristic quantities in step 1 consist of the factors that influence the peak-shifting potential of power distribution and consumption, which include: the required level of power supply reliability, load density, peak power consumption ratio, peak-shifting cost, load rate and annual maximum load utilization hours.

Compared with the prior art, the invention has the following advantages:

(1) The method comprises five steps: determining whether any two given user characteristic quantities are statistically correlated and excluding redundant characteristic quantities; forming a training set from the remaining user characteristic quantities; calculating the prior probability of each data category in the training set and from it the corresponding Bayesian probability; calculating the baseline load of the prediction day from the Bayesian probability combined with the electrical load for the prediction day; and identifying the peak-shifting potential load using the obtained baseline load. This avoids the drawback that a prediction-day baseline load drawn from a single category is insufficient for identifying the peak-shifting potential load.

(2) In the method of the invention, the user characteristic quantities consist of the factors that influence the peak-shifting potential of power distribution and consumption, namely the required level of power supply reliability, load density, peak power consumption ratio, peak-shifting cost, load rate and annual maximum load utilization hours. As a result, the baseline load data obtained by the method have good diversity.

Drawings

FIG. 1 is a flow chart of the method of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.

The relevant basic principles of the invention are introduced first.

The factors that influence the peak-shifting potential of power distribution and consumption include:

(1) Required level of power supply reliability

Power supply reliability refers to the ability of a power system to supply power continuously. Because the reliability requirement strongly constrains peak-shifting regulation, it is an important factor in evaluating peak-shifting potential. The power consumption behavior and the power supply reliability requirements of various industries are analyzed below.

1) Light industry

Light industry covers many sectors, such as textiles, paper making, bicycles, tobacco and food. The load characteristics of these users differ considerably, and their requirements on power supply reliability are not all the same. Sectors such as textiles basically run continuously in three shifts and have a high load rate; large and medium-sized textile factories are supplied at 10 kV or higher, suffer large economic losses from power failure, and therefore have high requirements on power supply reliability.

2) Heavy industry class

Most heavy industry users are continuous-production enterprises in industries such as metallurgy, machinery, energy (electric power, petroleum, coal, natural gas and the like), chemicals and building materials; their load rate is high and basically stable throughout the year. Compared with other types of users, heavy industry users consume more power, are larger in scale and connect to the grid at higher voltage levels. Investigation shows that a large number of heavy industry users are primary or secondary loads, for which even a short power failure can cause great economic loss and may even lead to casualties, so their requirement on power supply reliability is extremely high.

3) Administrative offices

Administrative office load consists mainly of lighting, air conditioning and office equipment in government offices, organizations, enterprises and public institutions. It follows working hours closely: it is relatively steady once the working day has started, and generally shows no obvious valley before the working day ends. The load of administrative office users exceeds 50% of the maximum load for only about 1/3 of each day, and these users are essentially inactive on holidays, so the load rate is low. Administrative office load fluctuates markedly with the seasons and is very sensitive to temperature and weather, but it accounts for a small share of the total system load and is generally drawn only in the daytime, so it has little influence on the load rate of the power grid. The power consumption of administrative office users is generally modest, the supply voltage is mostly 0.4 kV, and reliability requirements can differ widely depending on the nature of the user.

4) Commercial services

The load of commercial service users consists mainly of lighting, air conditioning and power loads, and can be roughly divided into two types: heat preservation load and business load. The heat preservation load accounts for a small share and has little influence on the load rate. The business load is the part that determines the load rate, and it differs among user types because of their industry characteristics and hours of use. The load of commercial and financial users is relatively stable once business opens each day, with no obvious valley before closing; the load of service users, especially catering users, is concentrated at noon and in the evening and is basically synchronized with the load of residential users. Although commercial service load is not a large share of the total, its influence on the load rate cannot be neglected, because its peak coincides with the system peak. In addition, commerce and services extend their business hours on holidays, so their power load is often one of the important factors shaping holiday load characteristics. Most commercial service loads are secondary loads, and a standby power supply is generally installed to meet the reliability requirement.

5) Culture and entertainment

The load characteristics of cultural and entertainment users are similar to those of service users: the load is concentrated at noon or in the evening and fluctuates markedly with the seasons. Their total load is not large, but the holiday load is usually high because of their service hours. Because these venues serve many people, they have special requirements on power supply reliability in order to avoid accidents involving casualties. Important cinemas, conference centers, large entertainment venues and the like are generally treated as secondary loads, and power failure is not permitted under normal operating conditions.

6) Sports category

The loads of sports users are mainly lighting and air conditioning loads, with essentially no power loads. Apart from major events these users rarely run at full load, so their maximum load utilization hours are very low and their monthly loads are markedly uneven. Because the total load is small, sports users have little influence on the load characteristics of the power grid, and their requirement on power supply reliability is not particularly demanding.

7) Education and scientific research

Education and scientific research users include universities and colleges, primary and secondary schools, research institutions, survey and design institutes, training institutions and the like. Because of the school calendar, the load of schools at all levels is generally low during the winter and summer vacations and on holidays. Only universities and research institutions have a large load scale; because the overall load is small, the influence of these users on the distribution network is mostly not very pronounced. Their requirements on power supply reliability vary; power supplies for important laboratories, for example those serving biological products and culture media, are generally classified as primary loads.

8) Medical and health

For medical and health users, the average daily load rate differs little from month to month and fluctuates little over the year. Most of these users operate around the clock, but because of work and rest patterns the night load is clearly lower than the daytime load. Their requirements on power supply reliability differ; in general, the higher the grade of the hospital, the higher the requirement.

(2) Load density

Load density is a quantitative measure of how densely load is distributed, expressed as the average power consumption per square kilometer. Choosing areas of high load density for peak-shifting regulation allows the load to be adjusted by a large amount. There are special cases, however. A city center has high load density and, judged by peak-shifting effect alone, would be a good area to select; but it is also densely populated and concentrates administration, economic activity, commerce and traffic, so its requirements on power quality and reliability are high, and peak-shifting regulation is generally not applied to city centers first.

(3) Peak power consumption ratio

The peak power consumption ratio is the proportion of the power consumed in the peak period to the power consumed over the whole day in the user's typical working-day load curve. The higher the ratio, the more concentrated the user's consumption is in the peak period, and the better the effect of peak-shifting regulation.
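The peak power consumption ratio can be illustrated with a short Python sketch. The hourly load curve and the choice of peak hours (08:00 to 12:00 and 18:00 to 22:00) are hypothetical; the text does not specify which hours count as the peak period.

```python
def peak_consumption_ratio(hourly_load_kw, peak_hours):
    """Share of daily energy consumed during the peak hours of a typical load curve."""
    if len(hourly_load_kw) != 24:
        raise ValueError("expected one average load value per hour of the day")
    # With one-hour intervals, average kW per hour equals kWh for that hour.
    total_energy = sum(hourly_load_kw)
    if total_energy == 0:
        raise ValueError("the load curve contains no consumption")
    peak_energy = sum(hourly_load_kw[h] for h in peak_hours)
    return peak_energy / total_energy

# Hypothetical typical working-day load curve (kW), hours 0 to 23.
typical_day = [40] * 8 + [100] * 4 + [60] * 6 + [100] * 4 + [40] * 2
# Assumed peak period: 08:00-12:00 and 18:00-22:00.
assumed_peak_hours = list(range(8, 12)) + list(range(18, 22))
ratio = peak_consumption_ratio(typical_day, assumed_peak_hours)
```

Here 800 kWh of the 1560 kWh daily total falls in the assumed peak hours, giving a ratio of about 0.51.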

(4) Peak-shifting cost

The peak-shifting cost is the economic loss incurred when a load is subjected to peak-shifting regulation. Considering cost alone, the peak-shifting priority of areas, groups and users with a low peak-shifting cost can be raised appropriately within a certain range.

(5) Load factor and annual maximum load utilization hours

The load rate is the ratio, expressed as a percentage, of the average load to the maximum load over a specified period (day, month or year). The user load rate and the annual maximum load utilization hours are important indexes of a user's grade. Industrial users, for example, have a high load rate: for some enterprises operating continuously in three shifts it exceeds 95%, the annual maximum load utilization hours exceed 6000 h, and the requirement on power supply reliability is extremely high. By contrast, some users (such as a gymnasium) rarely run at full load except on a few occasions, so their load rate and annual maximum load utilization hours are very low, and even a local fault in the distribution network would not compromise the reliable supply of power to them.
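Both indexes can be illustrated with a short Python sketch. The load rate follows the definition given above. The annual maximum load utilization hours are computed here as annual energy consumption divided by annual maximum load, which is the standard definition but is not spelled out in the text. Function names and sample values are hypothetical.

```python
def load_rate(loads_kw):
    """Average load divided by maximum load over the sampled period (as a fraction)."""
    peak = max(loads_kw)
    if peak <= 0:
        raise ValueError("maximum load must be positive")
    return (sum(loads_kw) / len(loads_kw)) / peak

def annual_max_load_utilization_hours(annual_energy_kwh, annual_max_load_kw):
    """Hours the user would need at maximum load to consume the annual energy."""
    if annual_max_load_kw <= 0:
        raise ValueError("maximum load must be positive")
    return annual_energy_kwh / annual_max_load_kw

# Hypothetical samples: four load readings (kW) and yearly totals for one user.
rate = load_rate([50, 100, 75, 75])
hours = annual_max_load_utilization_hours(6_570_000, 1000)
```

In this example the load rate is 0.75 (75%) and the utilization hours are 6570 h, which would place the user above the 6000 h level mentioned for continuously operating enterprises.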

The invention uses a Bayesian classification method to establish the relation between user behavior patterns and the characteristic quantities (temperature and day type), and on that basis calculates the baseline load of the prediction day.

The calculation flow for the prediction-day baseline load is shown in FIG. 1; as the figure shows, the calculation is divided into three stages:

(1) Preparation stage

Correlation analysis is used to identify whether any two given characteristic quantities are statistically correlated, so that redundant characteristic quantities can be excluded:

$$r = \frac{\sum_i (y_i - \bar{y})(z_i - \bar{z})}{\sqrt{\sum_i (y_i - \bar{y})^2 \, \sum_i (z_i - \bar{z})^2}}$$

If the correlation coefficient r is greater than 0.6, the two characteristic quantities are strongly correlated and one of them is deleted.

A prediction tuple $X = \{x_1, x_2, \ldots, x_i\}$ is formed from the measured values of the user characteristic quantities on the prediction day, and the measured values of the user characteristic quantities on the observation days are divided into a number of training tuples, so that the training tuples together with their associated class labels form a training set D.

(2) Classifier training stage

Calculate the prior probability $P(C_i)$, $i = 1, 2, \ldots, m$, of each class. If the prior probabilities of the classes are unknown, assume $P(C_1) = P(C_2) = \cdots = P(C_m)$; otherwise:

$$P(C_i) = \frac{|C_{i,D}|}{|D|}$$
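The prior calculation, including the equal-probability fallback, could be sketched in Python as follows. Treating "unknown" as "no class labels available" is an assumption made for this illustration, and the function and class names are hypothetical.

```python
from collections import Counter

def class_priors(labels, classes=None):
    """P(C_i) = |C_i,D| / |D|, or equal priors when no labelled tuples are available."""
    if not labels:
        # Assumption: unknown priors means there are no labels to count.
        if not classes:
            raise ValueError("need class labels or an explicit list of classes")
        return {c: 1 / len(classes) for c in classes}
    counts = Counter(labels)
    return {c: n / len(labels) for c, n in counts.items()}

# Hypothetical class labels of the training tuples in D.
priors = class_priors(["pattern_A", "pattern_A", "pattern_A", "pattern_B"])
# Fallback: four classes with unknown priors are treated as equally likely.
uniform = class_priors([], classes=["c1", "c2", "c3", "c4"])
```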

Assume that, given the class label of a tuple, the characteristic quantities are conditionally independent of one another, i.e. there is no dependency among the characteristic quantities. Then:

$$P(X \mid C_i) = \prod_k P(x_k \mid C_i)$$

where $x_k$ is the value of the $k$-th characteristic quantity of tuple $X$ and the product runs over all characteristic quantities. For each characteristic quantity, consider whether it takes discrete or continuous values:

If it is discrete:

$$P(x_k \mid C_i) = \frac{|C_{i,D}^{x_k}|}{|C_{i,D}|}$$

where $|C_{i,D}^{x_k}|$ is the number of training tuples of class $C_i$ in the training set whose $k$-th characteristic quantity takes the same value $x_k$ as tuple $X$.

If it is continuous, the characteristic quantity is generally assumed to follow a Gaussian distribution:

$$P(x_k \mid C_i) = \frac{1}{\sqrt{2\pi}\,\sigma_{C_i}} \exp\left(-\frac{(x_k - \mu_{C_i})^2}{2\sigma_{C_i}^2}\right)$$

where $\mu_{C_i}$ and $\sigma_{C_i}$ are the mean and standard deviation, respectively, of the values of the $k$-th characteristic quantity over the training tuples of class $C_i$.
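The two cases can be illustrated with a short Python sketch. The discrete case is taken to be the share of a class's training tuples that match the prediction tuple's value, and the continuous case a Gaussian density with the class mean and standard deviation. The use of the population standard deviation (dividing by n) is an assumption, and all names and data are hypothetical.

```python
import math

def discrete_likelihood(class_values, x_k):
    """Share of a class's training tuples whose k-th quantity equals x_k."""
    if not class_values:
        raise ValueError("class has no training tuples")
    return sum(1 for v in class_values if v == x_k) / len(class_values)

def gaussian_likelihood(class_values, x_k):
    """Gaussian density at x_k with mean and std estimated from the class values."""
    if not class_values:
        raise ValueError("class has no training tuples")
    n = len(class_values)
    mean = sum(class_values) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in class_values) / n)
    if std == 0:
        raise ValueError("zero spread: the Gaussian density is undefined")
    return math.exp(-((x_k - mean) ** 2) / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)

# Hypothetical values of two characteristic quantities within one class.
day_types = ["workday", "workday", "weekend"]
temperatures = [28.0, 30.0, 32.0]

p_day = discrete_likelihood(day_types, "workday")   # 2 of 3 tuples match
p_temp = gaussian_likelihood(temperatures, 30.0)    # density at the class mean
p_x_given_c = p_day * p_temp                        # product under independence
```

Note that the Gaussian term is a density rather than a probability, so it may exceed 1 for narrow distributions; only its relative size across classes matters in the subsequent calculation.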

(3) Application stage

According to Bayes' theorem:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

After $P(C_i \mid X)$ has been calculated, the baseline load of the prediction day is:

$$c_s(t) = \sum_{i=1}^{m} P(C_i \mid X)\, c_i(t)$$

The traditional Bayesian classification method selects only the category with the largest $P(X \mid C_i)P(C_i)$ as the class label of the prediction day. Here all categories are taken into account together, which avoids a prediction-day baseline load drawn from a single category.
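The difference between the traditional selection of a single category and the combined use of all categories can be illustrated end to end with the following Python sketch. It uses the two characteristic quantities named above (day type and temperature). The training set, class names and load curves are hypothetical, and the weighted sum is one reading of "all categories are considered together", not a confirmed reproduction of the invention's formula.

```python
import math
from collections import Counter

# Hypothetical training set D: ((day_type, temperature in deg C), class label).
D = [
    (("workday", 30.0), "high"), (("workday", 32.0), "high"),
    (("workday", 31.0), "high"), (("weekend", 29.0), "high"),
    (("weekend", 22.0), "low"), (("weekend", 20.0), "low"),
    (("workday", 21.0), "low"), (("weekend", 23.0), "low"),
]
# Hypothetical typical load curve c_i(t) of each class (kW, four periods per day).
class_loads = {"high": [400.0, 900.0, 1000.0, 600.0],
               "low": [300.0, 500.0, 550.0, 350.0]}

def gaussian(x, mean, std):
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)

def posteriors(training_set, x):
    """P(C_i | X) from priors, a discrete day-type term and a Gaussian temperature term."""
    day_type, temp = x
    counts = Counter(label for _, label in training_set)
    joint = {}
    for c, n in counts.items():
        rows = [f for f, label in training_set if label == c]
        prior = n / len(training_set)
        p_day = sum(1 for f in rows if f[0] == day_type) / n
        temps = [f[1] for f in rows]
        mean = sum(temps) / n
        # Population standard deviation; assumes temperatures in a class are not all equal.
        std = math.sqrt(sum((t - mean) ** 2 for t in temps) / n)
        joint[c] = prior * p_day * gaussian(temp, mean, std)
    evidence = sum(joint.values())  # P(X) by the law of total probability
    return {c: j / evidence for c, j in joint.items()}

x_pred = ("workday", 26.0)  # hypothetical prediction tuple
post = posteriors(D, x_pred)

# Traditional approach: take the load curve of the single most probable class.
map_class = max(post, key=post.get)
map_baseline = class_loads[map_class]
# Combined approach: weight every class curve by its posterior probability.
weighted_baseline = [sum(post[c] * class_loads[c][t] for c in post) for t in range(4)]
```

In this example the prediction day is a workday with a temperature midway between the two classes. The traditional choice returns the "high" curve unchanged, whereas the weighted baseline lies between the two curves, at three quarters of the way toward "high".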

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

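1. A load identification method using a baseline load model based on a Bayesian classification method, characterized by comprising the following steps:

step 1: determining whether any two given user characteristic quantities are statistically correlated, and excluding redundant characteristic quantities;

step 2: forming a training set from the user characteristic quantities that have been checked for statistical correlation and from which redundant characteristic quantities have been eliminated;

step 3: calculating the prior probability of each data category in the training set and from it the corresponding Bayesian probability;

step 4: calculating the baseline load of the prediction day from the Bayesian probability combined with the electrical load for the prediction day;

step 5: identifying the peak-shifting potential load using the obtained baseline load.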

2. The load identification method using a baseline load model based on a Bayesian classification method as claimed in claim 1, wherein step 1 specifically comprises: using correlation analysis to identify whether any two given characteristic quantities are statistically correlated, and excluding redundant characteristic quantities; if the correlation coefficient obtained from the correlation analysis is greater than a set value, the two characteristic quantities are strongly correlated and one of them is deleted.

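3. The load identification method using a baseline load model based on a Bayesian classification method as claimed in claim 2, wherein the correlation coefficient is calculated as:

$$r = \frac{\sum_i (y_i - \bar{y})(z_i - \bar{z})}{\sqrt{\sum_i (y_i - \bar{y})^2 \, \sum_i (z_i - \bar{z})^2}}$$

where $r$ is the correlation coefficient, $y_i$ and $z_i$ are the observed values of characteristic quantities $y$ and $z$ on the $i$-th day, and $\bar{y}$ and $\bar{z}$ are their average values over all observation days.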

4. The load identification method using a baseline load model based on a Bayesian classification method as claimed in claim 1, wherein step 2 specifically comprises: forming a prediction tuple $X = \{x_1, x_2, \ldots, x_i\}$ from the measured values of the user characteristic quantities on the prediction day, and dividing the measured values of the user characteristic quantities on the observation days into a number of training tuples, so that the training tuples together with their associated class labels form a training set D.

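5. The load identification method using a baseline load model based on a Bayesian classification method as claimed in claim 1, wherein the prior probability in step 3 is calculated as:

$$P(C_i) = \frac{|C_{i,D}|}{|D|}$$

where $|C_{i,D}|$ is the number of training tuples of class $C_i$ in the training set and $|D|$ is the total number of training tuples in the training set.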

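6. The load identification method using a baseline load model based on a Bayesian classification method as claimed in claim 1, wherein the Bayesian probability in step 3 is calculated as:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

where $P(C_i \mid X)$ is the Bayesian probability of class $C_i$, $P(X)$ is the prior probability of the prediction tuple, and $P(X \mid C_i)$ is the inverse probability of class $C_i$.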

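7. The load identification method using a baseline load model based on a Bayesian classification method as claimed in claim 1, wherein the baseline load of the prediction day in step 4 is calculated as:

$$c_s(t) = \sum_i P(C_i \mid X)\, c_i(t)$$

where $c_s(t)$ is the baseline load of the prediction day and $c_i(t)$ is the load of class $C_i$.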

8. The load identification method using a baseline load model based on a Bayesian classification method as claimed in claim 1, wherein the user characteristic quantities in step 1 consist of the factors that influence the peak-shifting potential of power distribution and consumption, which include: the required level of power supply reliability, load density, peak power consumption ratio, peak-shifting cost, load rate and annual maximum load utilization hours.